CGROUPS for Jobs
A cgroup (control group) is the Linux kernel mechanism for resource management.
Cgroups can be used to limit, throttle and account for resource usage per control group. Each resource interface is provided by a controller. For example, cgroups can isolate core workloads from background resource needs, which prevents one workload from overpowering the others. On Linux taskers, a job can be requested to run in one or more cgroups. Support for cgroup v2 is now enabled.
cgroup v2
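Below is a list of the fundamental differences between cgroup v1 and cgroup v2:
- Unified hierarchy - all controllers share a single hierarchy, so resources now apply to cgroups rather than to per-controller hierarchies
- Granularity at TGID (PID) level, not TID level - whole processes, not individual threads
- Focus on simplicity/clarity over ultimate flexibility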
Other improvements are shown in the table below.
Description | v1 | v2 |
---|---|---|
Tracking on non-immediate/multi-source charges | No tracking of non-immediate charges. Charged to root cgroup, essentially unlimited. | Page cache writebacks and network are charged to the responsible cgroup. Can be considered as part of cgroup limits. |
Communication with backing subsystems | Most actions for non-share based resources reacted crudely to hitting thresholds. For example, in the memory cgroup, the only option was to OOM kill or freeze. | Many cgroup controllers negotiate with subsystems before real problems occur. Subsystems can take remediative action (e.g. direct reclaim with memory.high). Easier to deal with temporary spikes in a resource's usage. |
Saner notifications | One clone() per event for cgroup release, expensive. | inotify support everywhere. One process to monitor everything, if you like. |
Utility controllers make sense now | Utility controllers have their own hierarchies. We usually want to use processes from another hierarchy. As such, we end up manually synchronising. | We have a single unified hierarchy, so no sync needed. |
Consistency between controllers | Inconsistencies in controller APIs. Some controllers don't inherit values. | Better consistency between controllers. |
Unified limits | Some limitations could not be fixed due to backwards compatibility. | Less iterative, more designed up front. We now have universal thresholds (eg |
Syntax
The syntax is similar to requesting any other resource, with the resource name consisting of the prefix CGROUP: followed by the path to the cgroup on the filesystem relative to the root of the cgroup hierarchy.
nc run -r CGROUP:/cpuset/my_cgroup1 -r CGROUP:/memory/my_cgroup2 -- sleep 120
If multiple conflicting cgroups are requested, such as two cgroups under the /memory hierarchy, the cgroup specified last is the one assigned to the process.
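The "last one wins" rule can be previewed before a job is submitted. The sketch below is a hypothetical helper, not part of the product: it assumes that two requests conflict when they share the same top-level hierarchy (the first path component, such as /memory) and that every path starts with a slash, as in the example above.

```shell
# effective_cgroups PATH...
# Hypothetical helper: prints, for each top-level hierarchy, the cgroup
# path that was given last. Assumes every path starts with "/" and that
# "conflicting" means "same first path component".
effective_cgroups() {
    [ "$#" -gt 0 ] || return 0
    printf '%s\n' "$@" | awk -F/ '
        {
            # $2 is the hierarchy name, e.g. "memory" in "/memory/my_cgroup2"
            if (!($2 in last)) order[++n] = $2
            last[$2] = $0          # a later path overwrites an earlier one
        }
        END { for (i = 1; i <= n; i++) print last[order[i]] }'
}

# Usage (made-up cgroup names):
#   effective_cgroups /memory/a /cpuset/b /memory/c
# prints /memory/c and then /cpuset/b, one per line.
```

Hierarchies are listed in the order they first appear on the command line, so the output maps directly onto the -r CGROUP: arguments that take effect.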
The special resource CGROUP:RAM can be used to limit the memory usage of a job within a cgroup. The job is assigned to a default cgroup, and because only one job is placed in each default cgroup, RAM usage can be limited on a per-job basis. The path to this default cgroup is:
<path to cgroup root directory>/memory/<queue name>_<tasker name>_<job slot number>
In the following example, the job is assigned to a default cgroup that is limited to 2000 megabytes of RAM:
nc run -r CGROUP:RAM -r RAM/2000 -- sleep 120
RAM is a special case: specifying CGROUP:RAM together with RAM/200 places the job in /cgroup/memory/vncCG_buffalo_3 and sets /cgroup/memory/vncCG_buffalo_3/memory.limit_in_bytes to 209715200 (200 × 1048576 bytes).
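The path pattern and the byte value above can be reproduced with a little shell arithmetic. The two functions below are hypothetical helpers written for illustration. They assume that the RAM value is converted at 1048576 bytes per megabyte (as the numbers above imply) and that vncCG, buffalo and 3 are the queue name, tasker name and job slot number in the example path.

```shell
# default_memory_cgroup ROOT QUEUE TASKER SLOT
# Hypothetical helper: builds the default cgroup path from the pattern
# <root>/memory/<queue name>_<tasker name>_<job slot number>.
default_memory_cgroup() {
    printf '%s/memory/%s_%s_%s\n' "$1" "$2" "$3" "$4"
}

# ram_mb_to_bytes MEGABYTES
# Hypothetical helper: converts a RAM request to the value expected in
# memory.limit_in_bytes, assuming 1 MB = 1024 * 1024 bytes.
ram_mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Usage:
#   default_memory_cgroup /cgroup vncCG buffalo 3   # /cgroup/memory/vncCG_buffalo_3
#   ram_mb_to_bytes 200                             # 209715200
```

Comparing the computed value with the contents of memory.limit_in_bytes on the tasker is a quick way to confirm that the limit was applied to the expected cgroup.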
To see the cgroup resources that the taskers offer, run nc hosts -r and look for CGROUP: entries. If none are present, ensure that the cgroups are set up on the taskers with LSCGROUPS and check the tasker logs for errors.
Enable cgroups
Many CentOS, SLES or Ubuntu systems do not have cgroups available in a default installation. Use the steps below to configure Linux for cgroups.
% sudo service cgconfig start
% sudo chkconfig cgconfig on
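After the service is running, it can be useful to confirm which cgroup layout the host exposes. The sketch below is one possible check, not a product command. It relies on the filesystem type that GNU stat reports for the cgroup mount point: cgroup2fs indicates the unified v2 hierarchy, while tmpfs normally indicates v1 controllers mounted beneath it. The default path /sys/fs/cgroup is an assumption; pass a different root (for example /cgroup) if the site mounts cgroups elsewhere.

```shell
# cgroup_mode [MOUNT_POINT]
# Hypothetical helper: reports "v2", "v1" or "unknown" based on the
# filesystem type of the cgroup mount point (default: /sys/fs/cgroup).
cgroup_mode() {
    fstype=$(stat -fc %T "${1:-/sys/fs/cgroup}" 2>/dev/null) || fstype=""
    case "$fstype" in
        cgroup2fs) echo "v2" ;;       # unified hierarchy
        tmpfs)     echo "v1" ;;       # per-controller hierarchies below a tmpfs
        *)         echo "unknown" ;;  # not mounted here, or a non-standard layout
    esac
}

# Usage:
#   cgroup_mode           # inspects /sys/fs/cgroup
#   cgroup_mode /cgroup   # inspects a site-specific root
```

A result of "unknown" usually means that the mount point differs from the one checked, so it is worth trying the root directory used in the tasker's cgroup paths before concluding that cgroups are unavailable.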