Clean up Processes Left Behind by Completed Jobs
Some third-party software spawns child processes but does not ensure
that they are cleaned up once the main process ends. This behavior can lead to
overloaded and, in extreme cases, unresponsive hosts. Accelerator
can be configured to clean up such processes automatically. This functionality
is supported on Linux only.
Note: Although automated cleanup is an effective way to
combat this problem, the behavior should not be considered normal. Report it
to the third-party software vendor when it is encountered.
If enabled, for each job that has ended, the vovtasker parses
the environment of every process that exists on the host. If the VOV_JOBID
environment variable exists in a process's environment and its value matches
the ID of the job that has ended, the process is marked as an orphan process that
must be cleaned up. Because child processes inherit their parent's environment, the
vovtasker can identify related child processes at every level of the job's
process hierarchy. Once all orphan processes have been identified, the vovtasker
sends the KILL signal to each one and prints a corresponding message in the
tasker log.
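The identification step described above can be sketched in Python. This is only an illustration of the technique, not the vovtasker's actual implementation: the function names (read_environ, find_orphans, kill_orphans) and the proc_root parameter are hypothetical, and the sketch assumes the job ID is compared as a plain string. On Linux, /proc/<pid>/environ holds the environment a process was started with as NUL-separated KEY=VALUE entries.

```python
import os
import signal
from pathlib import Path


def read_environ(pid_dir: Path) -> dict:
    """Parse <pid_dir>/environ (NUL-separated KEY=VALUE entries) into a dict."""
    try:
        raw = (pid_dir / "environ").read_bytes()
    except OSError:
        # The process exited in the meantime, or we may not read its environment.
        return {}
    env = {}
    for entry in raw.split(b"\0"):
        key, sep, value = entry.partition(b"=")
        if sep:  # skip empty or malformed entries
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def find_orphans(job_id: str, proc_root: str = "/proc") -> list:
    """Return the sorted PIDs whose environment has VOV_JOBID equal to job_id."""
    orphans = []
    for pid_dir in Path(proc_root).iterdir():
        if not pid_dir.name.isdigit():
            continue  # only numeric entries under /proc are processes
        if read_environ(pid_dir).get("VOV_JOBID") == job_id:
            orphans.append(int(pid_dir.name))
    return sorted(orphans)


def kill_orphans(pids) -> None:
    """Send SIGKILL to each PID and print one message per process killed."""
    for pid in pids:
        try:
            os.kill(pid, signal.SIGKILL)
            print(f"killed orphan process {pid}")
        except ProcessLookupError:
            pass  # already gone
```

Because the match relies only on the inherited VOV_JOBID value, no parent/child bookkeeping is needed: a grandchild whose parent has already exited is still found. Reading the environment of processes owned by other users generally requires elevated privileges.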
Note: Processes that
create their own environment from scratch, as well as processes that explicitly
remove the VOV_JOBID variable from the environment, will not be cleaned up by
this feature.
- To enable this feature, add the following line to the
  $SWD/policy.tcl file:
  set config(tasker.childProcessCleanup) 1
- Once the file has been edited, reread the policy via:
  % nc cmd vovproject reread
- To confirm the feature is enabled:
  % nc cmd vovselect param.tasker.childProcessCleanup from server
  A value of 0 indicates the feature is disabled (default), whereas a value of 1
  indicates the feature is enabled.
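For scripted checks, the confirmation step could be wrapped as in the following Python sketch. The helper names (interpret_cleanup_value, cleanup_enabled) are hypothetical, and the sketch assumes that the vovselect query prints only the parameter value (0 or 1) on standard output and that the Accelerator nc command is the first nc found on PATH.

```python
import subprocess

# Query command from the confirmation step. Assumption: "nc" resolves to the
# Accelerator command, not to another tool of the same name such as netcat.
QUERY = ["nc", "cmd", "vovselect", "param.tasker.childProcessCleanup", "from", "server"]


def interpret_cleanup_value(output: str) -> bool:
    """Map the queried value to a boolean: '1' is enabled, '0' is disabled."""
    value = output.strip()
    if value == "1":
        return True
    if value == "0":
        return False
    # Anything else means the output-format assumption does not hold.
    raise ValueError(f"unexpected childProcessCleanup value: {value!r}")


def cleanup_enabled() -> bool:
    """Run the query and report whether child process cleanup is enabled."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return interpret_cleanup_value(result.stdout)
```

Calling cleanup_enabled() from a shell in which the project is enabled returns True or False, and raises an error if the command fails or prints anything other than 0 or 1.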