Other schedulers

This chapter explains how to use Flux on a cluster with a batch scheduler other than PBS.

Run Flux in the command line

Most of the requirements for using Flux through a batch scheduler are the same as for running Flux from the command line. Refer to the documentation How to Launch Flux via Command Line?, which shows how to set all the mandatory environment variables for:
  • Memory management (JVM_MEMORY, MEMSIZC3, MEMSIZN3)
  • Core management (FLUX_NCORES)
and the command line arguments to run Flux (for example, to run a Python file or a specific application).
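
Because a job that starts with a missing variable typically fails late, it can help to check the mandatory variables at the top of the job script. The following is a minimal sketch, not part of Flux: the helper name check_flux_env is hypothetical, and only the variable names come from the list above.

```shell
# Hypothetical helper (not provided by Flux): list the mandatory Flux
# variables that are unset or empty. Prints the missing names separated
# by spaces and returns 1 if at least one is missing, 0 otherwise.
check_flux_env() {
    missing=""
    for name in JVM_MEMORY MEMSIZC3 MEMSIZN3 FLUX_NCORES; do
        # POSIX-compatible indirect lookup of the variable called $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="${missing:+$missing }$name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "$missing"
        return 1
    fi
    return 0
}
```

In a job script, calling check_flux_env before the Flux command and stopping when it returns a non-zero status avoids launching Flux with an incomplete configuration.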

We recommend running Flux on a cluster from a Python script, in non-interactive mode. The command line should then look like this:

# Run Flux
$PATH_TO_FLUX_EXECUTABLE -application $FLUX_APP -runPyInSilentModeAndExit

Batch scheduler-specific settings

When Flux runs under a batch scheduler, some additional settings are needed.

First, because the scheduler provides a node list for the job, the node file has to be defined through the FLUX_NODEFILE environment variable. For example, the PBS node file is accessible through the environment variable PBS_NODEFILE, so FLUX_NODEFILE must be set to the value of PBS_NODEFILE.
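
As an illustration of the node file setting, the sketch below copies PBS_NODEFILE into FLUX_NODEFILE when PBS defines it, and otherwise uses a path passed as an argument. The helper name set_flux_nodefile and the fallback argument are assumptions made for this example; how the node file is produced with another scheduler, and the exact format Flux expects, depend on your installation.

```shell
# Hypothetical helper (not provided by Flux): export FLUX_NODEFILE.
# Uses PBS_NODEFILE when it is defined; otherwise uses the path given
# as first argument (a node file produced for your own scheduler).
set_flux_nodefile() {
    if [ -n "${PBS_NODEFILE:-}" ]; then
        FLUX_NODEFILE="$PBS_NODEFILE"
    elif [ -n "${1:-}" ]; then
        FLUX_NODEFILE="$1"
    else
        echo "no node file available" >&2
        return 1
    fi
    export FLUX_NODEFILE
}
```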

Parametric distribution

Some Flux projects contain parameters that can be solved independently. Flux comes with a feature that allows the distribution of these parameters to speed up this kind of computation. To use that feature, the following environment variables must be defined:
  • FLUX_PARAMETRIC: Indicator for using parametric distribution. Must be set to true
  • FLUX_PARAM_AUTO: Indicator for using an automatic or manual computation of the number of jobs. Must be set to Automatic or Manual
    • In Automatic mode, Flux will submit a maximum of 127 jobs (+1 for the master) to solve the project. The number of parameters solved per job is computed automatically, and if there are fewer than 127 parameters, each job will solve 1 parameter.
    • In Manual mode, the environment variable FLUX_PARAM_MAXJOBS must be set to the maximum number of cores you want to allocate for the job. One of these cores is used by the master, and the others are used by the jobs that Flux submits to solve the project. The number of parameters solved per job is computed automatically; if there are fewer parameters than remaining cores, each job solves 1 parameter and the surplus cores are not used, so they stay free for other jobs. The number of cores used by each secondary Flux can be set with the variable FLUX_PARAM_NCORES. By default, each secondary Flux uses the same memory configuration as the primary Flux. However, the secondary Flux instances are expected to need less memory, so the environment variable FLUX_PARAM_MEMORY (in bytes, 0 for dynamic memory) can be used to define a different memory configuration.
# Use parametric distribution
export FLUX_PARAMETRIC=true
export FLUX_PARAM_AUTO=Manual
export FLUX_PARAM_MEMORY=4194304000
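
To estimate how many secondary jobs a run will use, the rules above can be written as a small function. This is only a sketch of the documented limits, not Flux's actual code: the function name flux_param_jobs and its arguments are hypothetical, and it does not attempt to reproduce how Flux splits the parameters between jobs.

```shell
# Sketch of the job count described above (not Flux's actual code).
# $1 = mode (Automatic or Manual)
# $2 = number of independent parameters in the project
# $3 = FLUX_PARAM_MAXJOBS (Manual only; includes the core of the master)
flux_param_jobs() {
    mode=$1
    nparams=$2
    if [ "$mode" = "Automatic" ]; then
        limit=127                 # at most 127 jobs, plus the master
    else
        limit=$(( $3 - 1 ))       # one core is reserved for the master
    fi
    # With fewer parameters than available jobs, one job per parameter
    if [ "$nparams" -lt "$limit" ]; then
        echo "$nparams"
    else
        echo "$limit"
    fi
}
```

For example, flux_param_jobs Manual 100 16 reports 15 secondary jobs, while flux_param_jobs Manual 10 16 reports 10, leaving the remaining cores unused.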
To submit jobs for the secondary Flux, the job submission command, including its job-name option, must be defined through the environment variable SUBMITNAME. For example, with PBS, we use qsub -N (the quotes are required because the value contains a space):
export SUBMITNAME="qsub -N"
Some specific environment variables need to be passed through the job submission command line. Most batch schedulers use the -v option for this, but if needed, the option can be changed through the SUBMITVAR environment variable:
export SUBMITVAR=-v
Finally, because batch schedulers handle the order of the arguments differently, the SUBMIT_OPT_PRE environment variable selects that order and takes the value 1 or 2:
  • If SUBMIT_OPT_PRE = 1 (as for PBS), the environment variables are defined before the script and the command line will be like this:
    qsub -N <job_name> -v <env_var_list> my_script.sh
  • If SUBMIT_OPT_PRE = 2 (for example for OAR), the environment variables are defined after the script and the command line will be like this:
    oarsub -N <job_name> my_script.sh -v <env_var_list>
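
The effect of the three variables can be previewed with a short function that prints the command line in the order selected by SUBMIT_OPT_PRE. This is an illustrative sketch based only on the two patterns above: the function name build_submit_cmd and its arguments are hypothetical, and Flux assembles the real command internally.

```shell
# Illustrative only (not Flux's actual code): print the submission
# command in the argument order selected by SUBMIT_OPT_PRE.
# $1 = job name, $2 = environment variable list, $3 = script
build_submit_cmd() {
    case "${SUBMIT_OPT_PRE:-}" in
        1) echo "$SUBMITNAME $1 $SUBMITVAR $2 $3" ;;   # variables before the script
        2) echo "$SUBMITNAME $1 $3 $SUBMITVAR $2" ;;   # variables after the script
        *) echo "SUBMIT_OPT_PRE must be 1 or 2" >&2
           return 1 ;;
    esac
}
```

Printing the command this way before a real run is a simple check that SUBMITNAME, SUBMITVAR and SUBMIT_OPT_PRE match the syntax your scheduler expects.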