Interfaces to Other Batch Processing Systems

You can find examples of interfaces in the directory $VOVDIR/etc/tasker_scripts. The file $VOVDIR/etc/tasker_scripts/taskerLSF.tcl implements the interface to LSF.

Use the Implemented Interface to SGE

You can use the -I option of vovtasker to start an agent for a batch processing system (BPS). For example, FlowTracer implements the interface to SGE in $VOVDIR/etc/tasker_scripts/taskerSGE.tcl. You can start the SGE agent like this:
% vovtasker -I $VOVDIR/etc/tasker_scripts/taskerSGE.tcl -T 300 -r "dc_shell_license dracula_license"

You can use other options of vovtasker to set other attributes of your BPS agent. Note that the default "capacity" of a BPS agent is 100, while the default for a normal tasker equals the number of CPUs on the tasker machine. You can override this default with the -T option. A large capacity is usually preferable, because it lets the BPS schedule jobs in the most effective manner. Capacities of 200 or 300 are typical.

Study and Write Your Own Implementation for Another BPS

The BPS agent is implemented in a BPS agent file, which is a Tcl script that implements the procedures described in the table below.

To implement a BPS agent for your own BPS (for example, LSF), you only need to implement these procedures. You are encouraged to use the provided interface to SGE as an example.
Procedure           Description
taskerStartJob      This procedure submits a job to the BPS. All information about the job is in the global array jobDesc. It should return a reference ID, that is, the ID of the job in the context of the BPS.
taskerStopJob       This procedure is called when the tasker wants to stop a running or queued job.
taskerCheckJob      This procedure is called when the tasker wants to check the status of a job. It returns one of the following values: LOST, DONE, FAILED, RUNNING, QUEUED.
taskerResumeJob     This procedure is called when the tasker wants to resume a suspended job.
taskerSuspendJob    This procedure is called when the tasker wants to suspend a job.
taskerJobEnded      This procedure is called when the tasker receives a notification from the server that a job has ended. This allows the tasker to update its state promptly instead of waiting for another timeout.
taskerMapResources  This procedure maps the resources required by the job in the context of the local project to the resources required by the job in the context of the BPS.
taskerCleanup       This procedure is called when the indirect tasker exits. It should be used to clean up any garbage that may have been created by the tasker.
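
As a rough illustration of the contract in the table above, the following sketch implements the five procedures that take a jobId (and, after submission, a reference ID) against an in-memory stand-in for a BPS. The fakeBps array and the fakeBpsNextId counter are hypothetical names introduced only for this example; a real agent file would call the commands of the actual BPS instead. The procedures taskerJobEnded, taskerMapResources, and taskerCleanup are left out because their argument lists are not described here.

```tcl
# Hypothetical in-memory stand-in for a real BPS. It exists only to show
# what each procedure is expected to accept and return.
array set fakeBps {}
set fakeBpsNextId 1000

proc taskerStartJob { jobId } {
    global jobDesc fakeBps fakeBpsNextId
    # A real agent would submit $jobDesc(command) to the BPS here.
    set refId [incr fakeBpsNextId]
    set fakeBps($refId) QUEUED
    return $refId
}

proc taskerCheckJob { jobId refId } {
    global fakeBps
    # A job that the BPS no longer knows about is reported as LOST.
    if { ![info exists fakeBps($refId)] } {
        return LOST
    }
    return $fakeBps($refId)
}

proc taskerStopJob { jobId refId } {
    global fakeBps
    # Forget the job, as if it had been removed from the BPS queue.
    unset -nocomplain fakeBps($refId)
}

proc taskerSuspendJob { jobId refId } {
    # A real agent would ask the BPS to suspend the job.
}

proc taskerResumeJob { jobId refId } {
    # A real agent would ask the BPS to resume the job.
}
```

With this stand-in, a job is QUEUED right after taskerStartJob and LOST after taskerStopJob, which mirrors the status values the tasker expects from taskerCheckJob.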

taskerStartJob

The procedure taskerStartJob takes a single argument, the jobId. The rest of the job information is available in the global array jobDesc, as described in the following table:
Variable Name       Meaning
jobDesc(command)    The complete command line.
jobDesc(env)        The environment label for the job.
jobDesc(id)         The job ID.
jobDesc(priority)   The VOV priority level.
jobDesc(resources)  The resource list for the job.
jobDesc(user)       The user that owns the job.
jobDesc(xdur)       The expected duration.
The procedure returns the reference ID used by the BPS. Example:
proc taskerStartJob { jobId } {
    global jobDesc env
    # Generate a label from the command by replacing each run of
    # non-alphanumeric characters with an underscore.
    set label $jobDesc(command)
    regsub -all {[^a-zA-Z0-9_]+} $label "_" label
    # Keep only the first 8 characters of the label.
    set label [string range $label 0 7]
    # Submit the job to SGE through vovfire.
    set submitInfo [exec qsub -V -v VOV_ENV=BASE -j y -N $label $env(VOVDIR)/scripts/vovfire $jobId]
    # The third word of the qsub output is taken as the SGE job ID.
    set refId [lindex $submitInfo 2]
    return $refId
}
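
The example above takes the third word of the qsub output as the reference ID. A slightly more defensive variant could extract the ID with a regular expression and fail loudly when the output is not what it expects. The sketch below assumes that qsub reports a successful submission with a message of the form Your job 4711 ("label") has been submitted; the helper name ExtractSgeJobId is hypothetical and is not part of FlowTracer.

```tcl
# Hypothetical helper: extract the SGE job ID from the text printed by qsub.
# Assumes a message such as: Your job 4711 ("label") has been submitted
proc ExtractSgeJobId { submitInfo } {
    if { [regexp {[Yy]our job(?:-array)? (\d+)} $submitInfo -> refId] } {
        return $refId
    }
    error "cannot find a job id in qsub output: $submitInfo"
}
```

Inside taskerStartJob, the line that sets refId could then read set refId [ExtractSgeJobId $submitInfo], so that an unexpected qsub message raises an error instead of returning an arbitrary word as the reference ID.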

taskerStopJob

This procedure is called when the tasker wants to stop a job. The procedure takes two arguments: the VovId of the job and the referenceId returned by taskerStartJob.

Example:
proc taskerStopJob { jobId refId } {
    # Stop an SGE job by removing it from the BPS.
    exec qdel $refId
}

taskerCheckJob

This procedure takes two arguments: the VovId of the job and the referenceId returned by taskerStartJob. It is called when the tasker wants to find out the status of a job. The procedure is expected to return one of the following values:
Return Value  Meaning
LOST          The job is no longer in the BPS. It is generally assumed that the job is done.
DONE          The job is done.
FAILED        The BPS believes that the job has failed.
RUNNING       The job is currently executing.
QUEUED        The job is in the BPS queue.
Example:
proc taskerCheckJob { jobId refId } {
    # Check the status of an SGE job.
    # ParseOutputOf and GetStartTime are helper procedures that you must provide.
    set status [ParseOutputOf [exec qstat] $refId]
    if { $status eq "RUNNING" } {
        vtk_tasker_job_started $jobId [GetStartTime $refId]
    }
    return $status
}
If the job is running, this procedure must inform the tasker of the exact time the job started by invoking:
vtk_tasker_job_started $jobId $timespec
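
The helper ParseOutputOf used in the example is not defined in this section. One possible sketch is shown below. It assumes the default qstat layout, in which each job line starts with the job ID and carries the state code in the fifth column, and it assumes a simple mapping from SGE state codes to the return values listed above (any state containing E is FAILED, any state containing q is QUEUED, everything else, including suspended jobs, is RUNNING). Adjust both assumptions to your site.

```tcl
# Hypothetical implementation of the ParseOutputOf helper.
# Assumes the default qstat layout: header lines, then one line per job
# with the job ID in column 1 and the state code in column 5.
proc ParseOutputOf { qstatOutput refId } {
    foreach line [split $qstatOutput "\n"] {
        # Split the line into whitespace-separated fields.
        set fields [regexp -all -inline {\S+} $line]
        if { [lindex $fields 0] ne $refId } {
            continue
        }
        set state [lindex $fields 4]
        # Assumed mapping from SGE state codes to tasker status values.
        if { [string match "*E*" $state] } { return FAILED }
        if { [string match "*q*" $state] } { return QUEUED }
        return RUNNING
    }
    # The job is not listed by qstat at all.
    return LOST
}
```

Because a finished job simply disappears from the qstat listing, this sketch never returns DONE; it reports LOST, which the tasker generally treats as done.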

taskerSuspendJob, taskerResumeJob

These procedures are used to suspend and resume a job. Like taskerStopJob, each takes two arguments: the VovId of the job and the referenceId returned by taskerStartJob.
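
For SGE, a minimal sketch of these two procedures could rely on the qmod command, which suspends a job with -sj and resumes it with -usj. The helper SgeQmodCommand is a hypothetical name introduced here to keep the command construction in one place; the {*} expansion syntax assumes Tcl 8.5 or later. This is not necessarily how the shipped taskerSGE.tcl does it.

```tcl
# Hypothetical helper: build the qmod command line for a suspend or resume.
proc SgeQmodCommand { action refId } {
    switch -- $action {
        suspend { return [list qmod -sj $refId] }
        resume  { return [list qmod -usj $refId] }
        default { error "unknown action: $action" }
    }
}

proc taskerSuspendJob { jobId refId } {
    # Suspend the SGE job identified by the reference ID.
    exec {*}[SgeQmodCommand suspend $refId]
}

proc taskerResumeJob { jobId refId } {
    # Resume the previously suspended SGE job.
    exec {*}[SgeQmodCommand resume $refId]
}
```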

taskerJobEnded

This procedure is called when the tasker receives a notification from the server that a job has ended. This allows the tasker to update its state promptly instead of waiting for another timeout.

taskerCleanup

This procedure is called when the indirect tasker exits. It should be used to clean up any garbage that may have been created by the tasker.