Reconciliation Daemon Configuration
Working directory: vnc.swd/vovreconciled
Config file: vnc.swd/vovreconciled/config.tcl
The daemon vovreconciled periodically checks all running jobs and looks for resources that are either "Requested/Not Used" or "Not Requested/Used". When the daemon is reasonably sure about the resource mismatch, it will reconcile the grabbed resources list for the running jobs by calling vtk_resourcemap_change_grab.
vovreconciled: Usage Message
DESCRIPTION:
    vovreconciled is a daemon that detects "requested/not_used"
    resources for running jobs and removes them from the
    "grabbed resources" list after a certain amount of time,
    called "RevocationDelay"

    The RevocationDelay is set to the smallest value
    found in the following places:
      1. The property AGGRESSIVE_SCHEDULING_DELAY
         (old) attached to the job class object, if defined
      2. The property REVOKE_DELAY
         (new) attached to the job class object, if defined
      3. The property REVOKE_DELAY
         attached to the resourceMap, if defined
      4. The value of RESD(revokeDelay), if defined.

    NO revocation is performed if any of the following are true
      1. If RevocationDelay < 1
      2. If RevocationDelay > 10000000
         (or 115d17h)
      3. If the resource is not derived from an external license.
      4. If the resource type is not "License" or a legal member
         of License
      5. If the number of revocations for a license on a job >
         $RESD(maxRevokes)=50
      6. If the CHANGEGRAB property exceeds RESD(maxPropLength)
      7. The job is younger than the RESD(revokeDelay)

    The config.tcl file must exist but it can be empty.
    The config file allows the user to set some additional options

    RESD(maxRevokes)    N  N is the maximum number of times a license on a
                           job can be revoked. Default is 50
                           To see the number of times a specific license has
                           been revoked for a given job, view the
                           REVCNT_<license> property that will exist on the
                           job, where <license> is the name of the specific
                           license of interest.
    RESD(maxPropLength) N  N is the number of characters the CHANGEGRAB
                           property can be. Default 130000
    RESD(emailSkips)    N  1 enables/0 disables emailing the job owner and
                           optionally admins that a license could have been
                           revoked but was not, because the maximum number
                           of revoke was reached or the CHANGEGRAB property
                           is too long. Default 1
    RESD(adminEmails)   S  A comma-separated string of userId's that are sent
                           emails on skips. Default ""
    RESD(revokeDelay)   T  number of seconds a job must be running before it
                           can be considered to have a license revoked.
                           Default 10000000 seconds or 115d17h
    RESD(loopTime)      T  How often to run the check on all jobs.
                           Default 30 seconds
    RESD(typeList)      S  A space separated list of license types that will
                           be handled by vovreconciled. Default is {License}.
                           The types Limit, Policy, User, Group and Priority
                           are not supported and will be ignored. The type
                           License will be added if not specified.

OPTIONS:
    -v                  -- Increase verbosity.
    -h                  -- Show this help.
    -loop <TIMESPEC>    -- Default 30s
    -inert              -- Run in inert mode where nothing changes
                           for the job.

EXAMPLES:
    % vovreconciled
    % vovreconciled -h
    % vovreconciled -loop 2m
    % vovreconciled -v
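Because config.tcl is a Tcl file and the options listed in the usage message are elements of the RESD array, a configuration can be sketched with plain set commands. The values below are illustrative only. The sketch assumes that the file is evaluated at global scope and that time values can be given as plain seconds; the user IDs are hypothetical.

```tcl
# vnc.swd/vovreconciled/config.tcl
# Illustrative values only. Every setting is optional; the file may be empty.

# Allow at most 20 revocations of a license on one job (default 50).
set RESD(maxRevokes) 20

# Email the job owner when a revocation is skipped (default 1).
set RESD(emailSkips) 1

# Also notify these user IDs on skips (hypothetical names).
set RESD(adminEmails) "alice,bob"

# A job must have been running for one hour before a license
# can be considered for revocation (assumed to be in seconds).
set RESD(revokeDelay) 3600

# Check all running jobs every 60 seconds (default 30).
set RESD(loopTime) 60

# Handle only resources of type License (the default).
set RESD(typeList) {License}
```

Since the default RESD(revokeDelay) of 10000000 lies at the upper bound beyond which no revocation is performed, lowering it (or setting a REVOKE_DELAY property) appears to be what makes revocation take effect in practice.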
vovreconciled Operations
This daemon, if activated, runs continuously and checks all running jobs every 30 seconds (the default loop time). It considers only running jobs whose age, measured from the job start or from the most recent resumption, is greater than RESD(revokeDelay). If such a job has an RNU (Requested but Not Used) resource for longer than the reconciliation time (Treconcile), the job is flagged for reconciliation. If the condition persists for 3 consecutive cycles, the resource is removed from the list of grabbed resources for the job.

The reconciliation time is the smallest of the following values:
- The value of the property REVOKE_DELAY attached to the resource map (a TIMESPEC)
- The value of the property REVOKE_DELAY attached to the jobclass (a TIMESPEC)
- The value of RESD(revokeDelay) in config.tcl
Later on, if a job is found to use a resource that was previously reconciled away, that resource is restored to the job.
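The flag, revoke, and restore cycle described above can be modeled in a few lines of Tcl. This is a toy model, not the daemon's code: the proc name ObserveResource, the streak array, and the cyclesRequired variable are hypothetical. It assumes one observation per loop, that the first RNU observation counts as the first of the three cycles, and that any cycle in which the resource is used resets the count.

```tcl
# Toy model of the reconciliation rule; all names here are hypothetical.
# streak(job,res) counts consecutive cycles in which a resource was
# requested but not used (RNU) for longer than the reconciliation time.
array set streak {}
set cyclesRequired 3

# Record one observation for a job/resource pair and return the action
# the model takes: "none", "flag", "revoke", or "restore".
proc ObserveResource { job res isRNU } {
    global streak cyclesRequired
    set key "$job,$res"
    if { ![info exists streak($key)] } { set streak($key) 0 }

    if { !$isRNU } {
        # The resource is in use: restore it if it had been revoked.
        set wasRevoked [expr { $streak($key) >= $cyclesRequired }]
        set streak($key) 0
        return [expr { $wasRevoked ? "restore" : "none" }]
    }

    incr streak($key)
    if { $streak($key) == $cyclesRequired } { return "revoke" }
    if { $streak($key) > $cyclesRequired } { return "none" }
    return "flag"
}
```

Feeding the model four RNU observations followed by one observation in which the resource is used, for example with foreach rnu {1 1 1 1 0} { puts [ObserveResource 1234 License:sim $rnu] }, yields flag, flag, revoke, none, restore.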
Override Delays
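By default, the revoke delay is the smallest of the following values (the default implementation also honors the older jobclass property AGGRESSIVE_SCHEDULING_DELAY):
- The value of the property REVOKE_DELAY in the jobclass
- The value of the property REVOKE_DELAY in the resource map
- The global variable RESD(revokeDelay)
To change this policy, redefine the procedure VovGetRevokeDelay in vovreconciled/config.tcl. The default implementation and an example override follow.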
####
#### DEFAULT IMPLEMENTATION
####
proc VovGetRevokeDelay { jobClass res displayMessage } {
    global RESD
    set revokeDelayOld    [VovJobClassGetProperty $jobClass AGGRESSIVE_SCHEDULING_DELAY 10000000]
    set revokeDelayNew    [VovJobClassGetProperty $jobClass REVOKE_DELAY 10000000]
    set revokeDelayResMap [VovResMapGetProperty $res REVOKE_DELAY 10000000]
    set revokeDelay [FindLeastDelay $revokeDelayOld $revokeDelayNew $revokeDelayResMap $RESD(revokeDelay)]
    if { $displayMessage > 0 } {
        set msg " FindLeastDelay\n"
        append msg "\tAggressiveClass: $revokeDelayOld\n"
        append msg "\tREVOKE_DELAY in class: $revokeDelayNew (jobclass=$jobClass)\n"
        append msg "\tREVOKE_DELAY in ResMap: $revokeDelayResMap (resource=$res)\n"
        append msg "\tGlobal: $RESD(revokeDelay)\n"
        append msg "\tResult revokeDelay: $revokeDelay"
        VovMessage $msg 5
    }
    return $revokeDelay
}
####
#### EXAMPLE OVERRIDE (to be implemented in vovreconciled/config.tcl)
####
proc VovGetRevokeDelay { jobClass res displayMessage } {
    global RESD
    # A REVOKE_DELAY set on the jobclass takes precedence over everything else.
    # The comparison value must match the default passed to the getter.
    set revokeDelayClass [VovJobClassGetProperty $jobClass REVOKE_DELAY 10000000]
    if { $revokeDelayClass != 10000000 } { return $revokeDelayClass }
    # Otherwise use the smaller of the resource map value and the global value.
    set revokeDelayResMap [VovResMapGetProperty $res REVOKE_DELAY 10000000]
    set revokeDelay [FindLeastDelay $revokeDelayResMap $RESD(revokeDelay)]
    return $revokeDelay
}
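To check the precedence of an override before installing it, the helper procs can be stubbed and the proc exercised in a plain tclsh. Everything below is a stand-in under stated assumptions: the stubs for VovJobClassGetProperty, VovResMapGetProperty, and FindLeastDelay only mimic the call signatures used above, delays are assumed to be plain integer seconds (the real helpers may also handle TIMESPEC values), and the property tables are hypothetical test data.

```tcl
# Stand-in helpers for testing outside vovreconciled. The real procs are
# provided by the daemon and may behave differently.
array set classProps  {fast,REVOKE_DELAY 120}
array set resMapProps {License:sim,REVOKE_DELAY 600}
set RESD(revokeDelay) 10000000

proc VovJobClassGetProperty { jobClass name default } {
    global classProps
    if { [info exists classProps($jobClass,$name)] } {
        return $classProps($jobClass,$name)
    }
    return $default
}

proc VovResMapGetProperty { res name default } {
    global resMapProps
    if { [info exists resMapProps($res,$name)] } {
        return $resMapProps($res,$name)
    }
    return $default
}

# Assumed behavior: return the smallest of the given integer delays.
proc FindLeastDelay { args } {
    return [tcl::mathfunc::min {*}$args]
}

# Override under test: a jobclass REVOKE_DELAY wins outright; otherwise
# use the smaller of the resource map value and the global value.
proc VovGetRevokeDelay { jobClass res displayMessage } {
    global RESD
    set revokeDelayClass [VovJobClassGetProperty $jobClass REVOKE_DELAY 10000000]
    if { $revokeDelayClass != 10000000 } { return $revokeDelayClass }
    set revokeDelayResMap [VovResMapGetProperty $res REVOKE_DELAY 10000000]
    return [FindLeastDelay $revokeDelayResMap $RESD(revokeDelay)]
}

puts [VovGetRevokeDelay fast  License:sim 0]
puts [VovGetRevokeDelay other License:sim 0]
puts [VovGetRevokeDelay other License:abc 0]
```

With this test data the three calls print 120 (jobclass value wins), 600 (resource map value is smaller than the global value), and 10000000 (nothing set, so the global value applies). The sentinel compared in the if statement has to equal the default passed to VovJobClassGetProperty; otherwise an unset property is mistaken for an explicit one.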