Reconciliation Daemon Configuration

Summary information for vovreconciled:
Working directory: vnc.swd/vovreconciled
Config file:       vnc.swd/vovreconciled/config.tcl

The daemon vovreconciled periodically checks all running jobs and looks for resources that are either "Requested/Not Used" or "Not Requested/Used". When the daemon is reasonably sure about the resource mismatch, it will reconcile the grabbed resources list for the running jobs by calling vtk_resourcemap_change_grab.


vovreconciled: Usage Message

DESCRIPTION:
    vovreconciled is a daemon that detects "requested/not_used"
    resources for running jobs and removes them from the
    "grabbed resources" list after a certain amount of time,
    called "RevocationDelay"

    The RevocationDelay is set to the smallest value
    found in the following places:

    1. The property AGGRESSIVE_SCHEDULING_DELAY
                           (old) attached to the job class object, if defined
    2. The property REVOKE_DELAY
                           (new) attached to the job class object, if defined
    3. The property REVOKE_DELAY
                           attached to the resourceMap, if defined
    4. The value of RESD(revokeDelay), if defined.


    NO revocation is performed if any of the following are true
      1. If RevocationDelay  < 1
      2. If RevocationDelay  > 10000000
         (or 115d17h)
      3. If the resource is not derived from an external license.
      4. If the resource type is not "License" or a legal member
         of License
      5. If the number of revocations for a license on a job  >
         $RESD(maxRevokes)=50
      6. If the CHANGEGRAB property exceeds RESD(maxPropLength)
      7. The job is younger than the RESD(revokeDelay)

    The config.tcl file must exist but it can be empty.
    The config file allows the user to set some additional options

    RESD(maxRevokes)    N  N is the maximum number of times a license on a
                           job can be revoked.  Default is 50
                           To see the number of times a specific license has
                           been revoked for a given job, view the
                           REVCNT_<license> property that will exist on the
                           job, where <license> is the name of the specific
                           license of interest.
    RESD(maxPropLength) N  N is the number of characters the CHANGEGRAB
                           property can be.  Default 130000
    RESD(emailSkips)    N  1 enables/0 disables emailing the job owner and
                           optionally admins that a license could have been
                           revoked but was not, because the maximum number
                           of revoke was reached or the CHANGEGRAB property
                           is too long.  Default 1
    RESD(adminEmails)   S  A comma-separated string of userId's that are sent
                           emails on skips. Default ""
    RESD(revokeDelay)   T  number of seconds a job must be running before it
                           can be considered to have a license revoked.
                           Default 10000000 seconds or 115d17h
    RESD(loopTime)      T  How often to run the check on all jobs.
                           Default 30 seconds
    RESD(typeList)      S  A space separated list of license types that will
                           be handled by vovreconciled.  Default is {License}.
                           The types Limit, Policy, User, Group and Priority
                           are not supported and will be ignored. The type
                           License will be added if not specified.

OPTIONS:
    -v                    -- Increase verbosity.
    -h                    -- Show this help.
    -loop <TIMESPEC>      -- Default 30s
    -inert                -- Run in inert mode where nothing changes
                             for the job.

EXAMPLES:
    % vovreconciled
    % vovreconciled -h
    % vovreconciled -loop 2m
    % vovreconciled -v
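
The options described in the usage message above could be set in vnc.swd/vovreconciled/config.tcl along these lines. This is a sketch with illustrative values, not recommendations. The user IDs in RESD(adminEmails) are hypothetical, RESD(loopTime) is assumed to be given in seconds, and the file is assumed to be evaluated at global scope so that plain set commands on the RESD array take effect.

```tcl
# vnc.swd/vovreconciled/config.tcl (illustrative values only)
# Assumption: this file is evaluated at global scope, so RESD here is the
# same global array that the daemon reads.

set RESD(revokeDelay)   3600        ;# seconds a job must run before revocation is considered
set RESD(loopTime)      60          ;# assumed seconds between checks (default 30)
set RESD(maxRevokes)    20          ;# max revocations of one license on one job (default 50)
set RESD(maxPropLength) 130000      ;# max length of the CHANGEGRAB property (the default)
set RESD(emailSkips)    1           ;# 1 = send email when a revocation is skipped
set RESD(adminEmails)   "alice,bob" ;# hypothetical user IDs, comma-separated
set RESD(typeList)      {License}   ;# License is added automatically if omitted
```

Running the daemon with -inert and -v (both listed under OPTIONS) may be a convenient way to try out new settings, since inert mode changes nothing for the job.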

vovreconciled Operations

This daemon, if activated, runs continuously and checks all running jobs every 30 seconds (the default loop time). It considers only running jobs whose age, measured from the start of the job or from its most recent resumption, is greater than RESD(revokeDelay). If one of these jobs has held an RNU (Requested but Not Used) resource for longer than the reconciliation time (Treconcile), the job is flagged for reconciliation. If the condition persists for 3 consecutive cycles, the resource is removed from the list of grabbed resources for the job.

The reconciliation time Treconcile is computed as the least of:
  • The value of the property REVOKE_DELAY attached to the resource map (a TIMESPEC)
  • The value of the property REVOKE_DELAY attached to the jobclass (a TIMESPEC)
  • The value of RESD(revokeDelay) in config.tcl
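
Taking Treconcile as the smallest of the listed delays, the computation can be sketched as follows. The procedure names leastDelaySeconds and delayAllowsRevocation are hypothetical and are not part of vovreconciled. This is not the daemon's FindLeastDelay, and it assumes all delays have already been converted from TIMESPEC to seconds. The range check mirrors the usage message: no revocation is performed when the delay is below 1 or above 10000000.

```tcl
# Hypothetical helpers for illustration only; not part of vovreconciled.

# Return the smallest of the given delays (all in seconds).
proc leastDelaySeconds { args } {
    if { [llength $args] == 0 } {
        error "leastDelaySeconds: at least one delay is required"
    }
    set least [lindex $args 0]
    foreach d [lrange $args 1 end] {
        if { $d < $least } { set least $d }
    }
    return $least
}

# Range check from the usage message: no revocation is performed
# if the delay is below 1 or above 10000000 seconds.
proc delayAllowsRevocation { delay } {
    return [expr { $delay >= 1 && $delay <= 10000000 }]
}

# Example: resource map says 600 s, jobclass says 1800 s, global default.
set t [leastDelaySeconds 600 1800 10000000]
puts "Treconcile = $t s, revocation allowed: [delayAllowsRevocation $t]"
```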

Later on, if a job is found to use a resource that was previously reconciled away, that resource is restored to the job.
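
The "3 consecutive cycles" rule described above can be pictured as a per-job, per-resource counter that resets whenever the RNU condition disappears. The sketch below is only a model of that rule: the names updateRnuCount and CYCLES_BEFORE_REVOKE, the job ID, and the resource name are all hypothetical, and the daemon's actual bookkeeping is not shown in this document.

```tcl
# Hypothetical model of the consecutive-cycle rule; not vovreconciled code.
set CYCLES_BEFORE_REVOKE 3   ;# from the text: 3 consecutive cycles

# Update the counter for one (job, resource) pair after a cycle.
# isRnu is 1 if the resource was Requested but Not Used in this cycle.
# Returns 1 when the resource would be removed from the grabbed list.
proc updateRnuCount { countsVar jobId res isRnu } {
    upvar 1 $countsVar counts
    global CYCLES_BEFORE_REVOKE
    set key "$jobId,$res"
    if { !$isRnu } {
        # The condition did not persist: start over.
        unset -nocomplain counts($key)
        return 0
    }
    if { ![info exists counts($key)] } { set counts($key) 0 }
    incr counts($key)
    return [expr { $counts($key) >= $CYCLES_BEFORE_REVOKE }]
}

# Example: the same hypothetical job and resource seen as RNU in three cycles.
array set counts {}
foreach cycle {1 2 3} {
    set revoke [updateRnuCount counts 12345 License:spice 1]
    puts "cycle $cycle: revoke=$revoke"
}
```

Only the third consecutive RNU observation reports a revocation; a single cycle in which the resource is in use resets the count.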

Override Delays

For each running job, vovreconciled looks at what it can do only after a certain amount of time has elapsed from the start of the job. This amount of time is called REVOKE DELAY and it is defined, by default, as the least of:
  • The value of the property REVOKE_DELAY in the jobclass
  • The value of the property REVOKE_DELAY in the resource map
  • The global variable RESD(revokeDelay)
Some customers may want to change this behavior. One way to do so is to override the procedure VovGetRevokeDelay in the file config.tcl. The default implementation of this procedure, which also honors the older AGGRESSIVE_SCHEDULING_DELAY jobclass property, and an example override are shown below:
####
#### DEFAULT IMPLEMENTATION
####
proc VovGetRevokeDelay { jobClass res displayMessage } {
    global RESD
    set revokeDelayOld    [VovJobClassGetProperty $jobClass AGGRESSIVE_SCHEDULING_DELAY 10000000]
    set revokeDelayNew    [VovJobClassGetProperty $jobClass REVOKE_DELAY                10000000]
    set revokeDelayResMap [VovResMapGetProperty   $res      REVOKE_DELAY                10000000]

    set revokeDelay [FindLeastDelay $revokeDelayOld $revokeDelayNew $revokeDelayResMap $RESD(revokeDelay)]

    if { $displayMessage > 0 } {
        set msg "    FindLeastDelay\n"
        append msg "\tAggressiveClass:        $revokeDelayOld\n"
        append msg "\tREVOKE_DELAY in class:  $revokeDelayNew (jobclass=$jobClass)\n"
        append msg "\tREVOKE_DELAY in ResMap: $revokeDelayResMap (resource=$res)\n"
        append msg "\tGlobal:                 $RESD(revokeDelay)\n"
        append msg "\tResult revokeDelay:     $revokeDelay"
        VovMessage $msg 5
    }

    return $revokeDelay
}
####
#### EXAMPLE OVERRIDE (to be implemented in vovreconciled/config.tcl)
####
proc VovGetRevokeDelay { jobClass res displayMessage } {
    global RESD
    set revokeDelayClass    [VovJobClassGetProperty $jobClass REVOKE_DELAY 10000000]
    # If the jobclass defines REVOKE_DELAY (i.e. not the 10000000 default), it wins.
    if { $revokeDelayClass != 10000000 } { return $revokeDelayClass }

    set revokeDelayResMap [VovResMapGetProperty   $res      REVOKE_DELAY 10000000]
    set revokeDelay [FindLeastDelay $revokeDelayResMap $RESD(revokeDelay)]

    return $revokeDelay
}