Schedule Jobs
The scheduling process controls the order of job execution. The scheduler runs in response to events such as:
- job submission
- job termination
- increased availability in a resource map
- expiration of a reservation
- and many others
Queued jobs (the jobs that are scheduled to be run) are organized into buckets. Each bucket contains jobs that have identical scheduling parameters.
The buckets are assigned a rank based on the FairShare statistics. Starting from the buckets of rank 0 (zero), the scheduler attempts to dispatch the top job in the bucket to the best available tasker. Then the scheduler looks at buckets with rank 1 and so on.
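The rank-ordered pass described above can be sketched roughly as follows. This is a minimal illustration, not the vovserver implementation: the function name dispatch_pass and the data layout are hypothetical, and "best available tasker" is simplified to "first free tasker in the list".

```python
from collections import deque

def dispatch_pass(buckets, free_taskers):
    """One pass over the buckets in ascending rank order.

    buckets: list of (rank, deque_of_job_names) pairs (hypothetical layout).
    free_taskers: list of tasker names; the first entry stands in for the
    "best available" tasker, which is an assumption of this sketch.
    """
    dispatched = []
    for rank, jobs in sorted(buckets, key=lambda b: b[0]):
        if not free_taskers:
            break  # nothing left to dispatch to
        if jobs:
            job = jobs.popleft()          # top job of this bucket
            tasker = free_taskers.pop(0)  # simplified "best" tasker
            dispatched.append((job, tasker))
    return dispatched

buckets = [
    (1, deque(["b1", "b2"])),
    (0, deque(["a1", "a2"])),
    (2, deque(["c1"])),
]
result = dispatch_pass(buckets, ["t1", "t2"])
print(result)
```

With two free taskers, the rank 0 bucket is served first and the rank 1 bucket second; the rank 2 bucket waits for a later pass.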
Exceptions to FairShare Scheduling
There are a few ways to bypass the FairShare scheduling: manual dispatch of jobs or Job Cohorts.
Seamless Transition to a Cycle-based Scheduler
The scheduler typically executes in a few milliseconds. The scheduler effort required in each cycle increases with the number of buckets, the number of taskers, and the complexity of the resource expressions. It is always possible to overload the scheduler, meaning that the scheduler requires a large fraction of the total CPU time used by vovserver.
The vovserver parameter that controls scheduler cycle behavior is called schedSkip and is expressed in seconds. If the effort required to run the scheduler exceeds the schedSkip threshold, action is taken to reduce the duty cycle allocated to scheduling. To reduce the duty cycle, vovserver starts skipping scheduler calls, which frees up computing power to service other requests such as listings and job terminations. The result is that the scheduler is effectively bypassed, regardless of bucket priority, until the target "scheduler duty cycle" percentage is attained.
The ratio of the computing power allocated to scheduling, which we call the "scheduler duty cycle", is controlled by the parameter schedMaxEffort, an integer that represents the percentage of time we want to allocate to the scheduler.
The default value of schedSkip is 0.1 seconds, and the default duty cycle reserved for scheduling is 20%, represented by a schedMaxEffort value of 20.
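As a rough illustration of the arithmetic, consider a simplified model (an assumption of this sketch, not the actual vovserver algorithm): if one scheduler pass takes t seconds and t exceeds schedSkip, then holding the scheduler to a duty cycle of schedMaxEffort percent means leaving about t * (100 / schedMaxEffort - 1) seconds for other work before the next pass. The function name idle_time_needed is hypothetical.

```python
def idle_time_needed(sched_time, sched_skip=0.1, sched_max_effort=20):
    """Seconds of non-scheduling time needed after one scheduler pass.

    Simplified model (assumption): the duty cycle is
    sched_time / (sched_time + idle), and skipping only starts once a
    pass takes longer than sched_skip seconds.
    """
    if sched_time <= sched_skip:
        return 0.0  # fast enough: no scheduler calls are skipped
    return sched_time * (100.0 / sched_max_effort - 1.0)

fast = idle_time_needed(0.05)                          # below the threshold
slow = idle_time_needed(0.5)                           # default 20% duty cycle
relaxed = idle_time_needed(0.5, sched_max_effort=40)   # 40% duty cycle
print(fast, slow, relaxed)
```

Under this model, a 0.5 second pass at the default 20% duty cycle is followed by about 2 seconds without scheduling, while raising schedMaxEffort to 40 shortens that to 0.75 seconds.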
Other Parameters that Control the Scheduler
- sched.maxpostponedjobs: Controls the exit from the scheduler when it is hard to dispatch jobs to taskers, i.e. when the scheduler has to postpone many jobs because it cannot find suitable taskers for them. Normally this parameter is set to be much larger than the maximum number of buckets in the system. The default value is 10,000; this value can be decreased if you have very homogeneous farms where all taskers are identical. Note: This parameter will disappear in future implementations of the scheduler.
- fairshare.maxjobsperbucket: Controls how many jobs are allowed to be dispatched from the same bucket. The default value is 20 jobs. After 20 jobs have been dispatched from a bucket, the scheduler moves to the next bucket. Smaller values give a more accurate FairShare accounting. Larger values give a faster dispatch. Note: This parameter will disappear in future implementations of the scheduler.
- fairshare.maxjobsperloop: Controls how many jobs can be dispatched in a single scheduler loop (this includes the scanning of possibly all the buckets). The default value is 20 jobs per loop (note: this could mean that all 20 jobs are from the same bucket). Smaller values give a more accurate FairShare accounting. Larger values give a faster dispatch. Note: This parameter will disappear in future implementations of the scheduler.
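The interplay between the per-bucket and per-loop limits can be sketched as below. This is a toy model under stated assumptions, not the scheduler's real code: scheduler_loop, make_buckets, and the bucket names are hypothetical, buckets are assumed to be already sorted by rank, and tasker availability is ignored.

```python
from collections import Counter

def scheduler_loop(buckets, max_per_bucket=20, max_per_loop=20):
    """Dispatch jobs bucket by bucket, honoring both caps.

    buckets: list of (bucket_name, list_of_jobs), assumed sorted by rank.
    max_per_bucket mirrors fairshare.maxjobsperbucket and
    max_per_loop mirrors fairshare.maxjobsperloop (toy model only).
    """
    dispatched = []
    for name, jobs in buckets:
        taken = 0
        while jobs and taken < max_per_bucket and len(dispatched) < max_per_loop:
            dispatched.append((name, jobs.pop(0)))
            taken += 1
        if len(dispatched) >= max_per_loop:
            break  # per-loop budget exhausted
    return dispatched

def make_buckets():
    # Two hypothetical buckets with 30 queued jobs each.
    return [("A", list(range(30))), ("B", list(range(30)))]

default_counts = Counter(name for name, _ in scheduler_loop(make_buckets()))
fair_counts = Counter(
    name for name, _ in scheduler_loop(make_buckets(), max_per_bucket=5)
)
print(dict(default_counts), dict(fair_counts))
```

With both limits at 20, the whole loop budget is consumed by the first bucket, which matches the note above. Lowering the per-bucket limit to 5 spreads each loop across buckets, at the cost of fewer dispatches per loop in this two-bucket example.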
The parameters can be set with vtk_server_config, for example:
% vovsh -x "vtk_server_config schedSkip 0.3"
% vovsh -x "vtk_server_config schedMaxEffort 40"
% vovsh -x "vtk_server_config sched.maxpostponedjobs 10000"
% vovsh -x "vtk_server_config fairshare.maxjobsperloop 20"
% vovsh -x "vtk_server_config fairshare.maxjobsperbucket 20"
# This is a fragment of policy.tcl
set config(schedSkip) 0.3
set config(schedMaxEffort) 40
set config(sched.maxpostponedjobs) 10000
set config(fairshare.maxjobsperloop) 20
set config(fairshare.maxjobsperbucket) 40