Autokill of Jobs by vovtasker
If a job has the autokill field set to a positive value, vovtasker kills the job when its duration exceeds the autokill value. The autokill field can be applied to any job whose status is not Failed or Done.
vovtasker checks for this condition every minute; this frequency is controlled by the "tasker update" value.
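The kill condition and the effect of the periodic check can be sketched in a few lines of Python. This is an illustration, not vovtasker's actual code: the function names are hypothetical, and the sketch assumes that checks fall at exact multiples of the check interval measured from the start of the job.

```python
def should_autokill(autokill_s: int, duration_s: int) -> bool:
    """Return True when a job qualifies for autokill.

    A non-positive autokill value means the feature is off for the job.
    """
    return autokill_s > 0 and duration_s > autokill_s


def first_kill_check(autokill_s: int, check_interval_s: int = 60) -> int:
    """Return the job duration (seconds) at the first check that triggers a kill.

    Assumption for illustration only: checks happen at exact multiples of
    check_interval_s measured from the start of the job.
    """
    if autokill_s <= 0:
        raise ValueError("autokill is disabled for this job")
    # Smallest multiple of the interval that is strictly greater than autokill_s.
    return (autokill_s // check_interval_s + 1) * check_interval_s


if __name__ == "__main__":
    print(first_kill_check(10))  # autokill of 10s, check every 60s
    print(first_kill_check(90))  # autokill of 90s, check every 60s
```

Under these assumptions, a job with autokill set to 10s is first caught by the check at 60s, so a job can run past its limit by up to one check interval.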
The tasker can kill the job with one of the following methods:
- Direct: the tasker sends the job the signals TERM, HUP, INT, KILL. These can be overridden by defaultStopSignalCascade and defaultStopSignalDelay, which can be set in policy.tcl, etc. If NC_STOP_SIGNALS and/or NC_STOP_SIG_DELAY are set, then these are used instead. NC_STOP_SIGNALS is a comma-separated list in which each element is either a signal name, such as "USR1", or has the format ":SIGNAL:includerx:excluderx:skiptop". For example,
  % nc run -D autokill 10s -P NC_STOP_SIGNALS=USR1::python:1,TERM,KILL -P NC_STOP_SIGNAL_DELAY=1
  sends USR1, excluding any Python processes that are part of the job, then sends TERM and then KILL, with a delay of 1s between signals. VOV_STOP_SIGNALS and VOV_STOP_SIGNALS_DELAY work similarly to NC_STOP_SIGNALS and NC_STOP_SIG_DELAY.
- NC STOP: the tasker calls nc stop JOBID. This works only for taskers that are running within Accelerator. This method honors the values of NC_STOP_SIGNALS and NC_STOP_SIG_DELAY.
- VOVSTOP: the tasker calls vovstop -f JOBID to kill the job.
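To make the NC_STOP_SIGNALS format described under the Direct method more concrete, here is a minimal Python sketch that splits such a value into its fields. It is not the parser used by the product. The names StopSignal, parse_stop_signals and applies_to are hypothetical, and several points are assumptions: a single leading ":" is treated as optional, the regular expressions are matched against a process command line, the skiptop field is kept as raw text without interpreting it, and regular expressions that themselves contain "," or ":" are not handled.

```python
import re
from typing import List, NamedTuple


class StopSignal(NamedTuple):
    signal: str      # signal name, e.g. "USR1"
    include_rx: str  # "" means no include filter
    exclude_rx: str  # "" means no exclude filter
    skip_top: str    # kept as raw text; its meaning is not interpreted here


def parse_stop_signals(spec: str) -> List[StopSignal]:
    """Split a comma-separated signal list into one StopSignal per element."""
    entries = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        # Assumption: accept both ":SIGNAL:..." and "SIGNAL:...".
        if item.startswith(":"):
            item = item[1:]
        fields = item.split(":")
        # A bare signal name such as "TERM" has no filters: pad with "".
        fields += [""] * (4 - len(fields))
        entries.append(StopSignal(*fields[:4]))
    return entries


def applies_to(entry: StopSignal, command: str) -> bool:
    """Assumed filter semantics: include must match, exclude must not."""
    if entry.include_rx and not re.search(entry.include_rx, command):
        return False
    if entry.exclude_rx and re.search(entry.exclude_rx, command):
        return False
    return True


if __name__ == "__main__":
    for entry in parse_stop_signals("USR1::python:1,TERM,KILL"):
        print(entry)
```

With the value from the example above, the first element carries an exclude pattern of "python" and an empty include pattern, while TERM and KILL carry no filters at all.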
The autokill method is set per tasker, so all jobs on a given tasker that need to be autokilled are killed with the same method.
% vovsh -x 'vtk_tasker_config TASKERNAME_OR_ID autokillmethod VALUE'
VALUE can be one of the following keywords:
- direct
- ncstop
- vovstop
In reality, only the first letter of the keyword is used, i.e. 'd', 'n', or 'v'. Anything else maps silently to 'direct'. For example:
% vovsh -x 'vtk_tasker_config lnx0123 autokillmethod ncstop'
% vovsh -x 'vtk_tasker_config lnx0123 autokillmethod n'
% vovsh -x 'vtk_tasker_config 00234567 autokillmethod direct'
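The first-letter rule for the autokillmethod keyword can be mimicked with a small lookup. The sketch below is only an illustration of the documented behavior: resolve_autokill_method is a hypothetical name, and the lookup is case-sensitive because the text does not say how upper-case letters are treated.

```python
def resolve_autokill_method(value: str) -> str:
    """Map a keyword to a method using only its first letter.

    Anything that does not start with 'd', 'n' or 'v' falls back to
    'direct' without any error, mirroring the silent mapping described above.
    """
    methods = {"d": "direct", "n": "ncstop", "v": "vovstop"}
    return methods.get(value[:1], "direct")


if __name__ == "__main__":
    for keyword in ("ncstop", "n", "vovstop", "nstop", "stop"):
        print(keyword, "->", resolve_autokill_method(keyword))
```

Because only the first letter counts, a misspelling such as "nstop" still selects ncstop, while "stop" silently selects direct, so it is worth double-checking the value passed to vtk_tasker_config.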