Calibre Turbo


vovcalibremt is a script that supports the parallel, distributed version of Calibre with MTflex. It is designed to be invoked as N identical instances across all machines used for the run: instance 1 becomes the Calibre master, and all other instances become Calibre taskers.
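The role selection described above can be pictured with a minimal shell sketch. This is an illustration only, not the actual vovcalibremt logic: the function name choose_role and the way the instance index is passed in are hypothetical, and the real script may obtain its index differently.

```shell
# Illustrative sketch only: choose_role is a hypothetical helper,
# not part of vovcalibremt or Accelerator.
choose_role() {
    # $1: 1-based index of this instance among the N identical invocations
    if [ "$1" -eq 1 ]; then
        echo master     # instance 1 becomes the Calibre master
    else
        echo tasker     # every other instance becomes a Calibre tasker
    fi
}

# Show the role each of four cloned instances would take.
for i in 1 2 3 4; do
    echo "instance $i -> $(choose_role "$i")"
done
```

Because every instance runs the same command, the only input that differs between them in this sketch is the index, which is why a single identical invocation can still yield one master and N-1 taskers.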

The remote file tells the Calibre master how to launch the jobs. Here is an example of a remote file, which specifies to the Calibre master the configuration to use for the parallel run. Note the 'MANUAL' keyword, which means that 'calibre' leaves the process of starting the remote taskers to Accelerator.
LOCAL HOST DIR /some/directory

In order to get N identical invocations of the same vovcalibremt command, we can use vovparallel clone. The full invocation of the command therefore is:

% nc run -dp 16 -dpres mt_master,mt_tasker -dpwait 4m  vovparallel clone vovcalibremt [CALIBRE_OPTIONS]
This can be simplified by using a job class such as the following:
# This is an example job class file 'calmt.tcl'
set classDescription        "Calibre MTflex jobs"
set VOV_JOB_DESC(resources) "License:calibre"
set VOV_JOB_DESC(env)       "BASE+RTSIM"
set VOV_JOB_DESC(mpi,resources) "mt_master,mt_tasker" ; # Deprecated
set VOV_JOB_DESC(mpi,wait)      240                   ; # Deprecated
set VOV_JOB_DESC(dp,resources)  "mt_master,mt_tasker"
set VOV_JOB_DESC(dp,wait)       240
set VOV_JOB_DESC(wrapper)  "vw vovparallel clone"
set VOV_JOB_DESC(priority,sched)  9

proc initJobClass {} {
  # Define resource requirements for the master (big machine AND a license).
  vtk_resourcemap_set mt_master -map "linux RAM/1000 CPUS/1 License:calibre#1" -max unlimited

  # Define resource requirements for the taskers. The name must match the
  # second entry in VOV_JOB_DESC(dp,resources).
  vtk_resourcemap_set mt_tasker -map "unix RAM/100 CPUS/1" -max unlimited
}

The resulting submission line would look like this:

% nc run -C calmt -dp 16  vovcalibremt [CALIBRE_OPTIONS]
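
The full command line uses -dpwait 4m while the job class sets the dp,wait value to 240. Assuming the job class value is expressed in seconds, the two describe the same timeout. The sketch below shows the arithmetic; to_seconds is a hypothetical helper, and the handling of the h and s suffixes is an assumption for illustration, since only the 4m form appears above.

```shell
# Illustrative sketch only: to_seconds is a hypothetical helper,
# not an Accelerator command.
to_seconds() {
    # $1: a duration such as 4m, 2h, 30s, or a plain number of seconds
    case "$1" in
        *h) echo $(( ${1%h} * 3600 )) ;;   # hours   -> seconds
        *m) echo $(( ${1%m} * 60 )) ;;     # minutes -> seconds
        *s) echo $(( ${1%s} )) ;;          # seconds, suffix stripped
        *)  echo $(( $1 )) ;;              # plain number, taken as seconds
    esac
}

# Compare the command-line form with the job class value.
echo "-dpwait 4m corresponds to $(to_seconds 4m) seconds"
```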