Cgroups Guide
Cgroups Overview
For a comprehensive description of Linux Control Groups (cgroups), see the cgroups documentation at kernel.org. Detailed knowledge of cgroups is not required to use cgroups in SLURM, but a basic understanding of the following features of cgroups is helpful:
- Cgroup - a container for a set of processes subject to common controls or monitoring, implemented as a directory and a set of files (state objects) in the cgroup virtual filesystem.
- Subsystem - a module, typically a resource controller, that applies a set of parameters to the cgroups in a hierarchy.
- Hierarchy - a set of cgroups organized in a tree structure, with one or more associated subsystems.
- State Objects - pseudofiles that represent the state of a cgroup or apply controls to a cgroup:
  - tasks - identifies the processes (PIDs) in the cgroup.
  - release_agent - specifies the location of the script or program to be called when the cgroup becomes empty.
  - notify_on_release - controls whether the release_agent is called for the cgroup.
  - additional state objects specific to each subsystem.
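Because state objects are ordinary pseudofiles, they can be inspected with normal file reads. The following is a minimal sketch, assuming a cgroup v1 layout in which tasks holds one PID per line and notify_on_release holds 0 or 1. The function names are hypothetical and are not part of SLURM.

```python
from pathlib import Path


def read_cgroup_pids(cgroup_dir):
    """Return the PIDs listed in the 'tasks' state object of a cgroup directory."""
    tasks_file = Path(cgroup_dir) / "tasks"
    # The tasks pseudofile is assumed to hold one PID per line.
    return [int(pid) for pid in tasks_file.read_text().split()]


def notify_on_release_enabled(cgroup_dir):
    """Return True if the 'notify_on_release' state object is set to 1."""
    flag = (Path(cgroup_dir) / "notify_on_release").read_text().strip()
    return flag == "1"
```

On a real system these would be pointed at a cgroup directory below the cgroup mount point, and reading them typically requires appropriate permissions.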
Use of Cgroups in SLURM
SLURM provides cgroup versions of a number of plugins:
- proctrack (process tracking)
- task (task management)
- jobacct_gather (job accounting statistics)
The cgroup plugins can provide a number of benefits over the other more standard plugins, as described below.
SLURM Cgroups Configuration Overview
There are several sets of configuration options for SLURM cgroups:
- slurm.conf provides options to enable the cgroup plugins. Each plugin may be enabled or disabled independently of the others.
- cgroup.conf provides general options that are common to all cgroup plugins, plus additional options that apply only to specific plugins.
- Additional configuration is required to enable automatic removal of SLURM cgroups when they are no longer in use. See Cleanup of SLURM Cgroups below for details.
Currently Available Cgroup Plugins
proctrack/cgroup plugin
The proctrack/cgroup plugin is an alternative to other proctrack plugins, such as proctrack/linux, for process tracking and suspend/resume capability. proctrack/cgroup uses the freezer subsystem, which is more reliable for tracking and control than proctrack/linux.
To enable this plugin, configure the following option in slurm.conf:
ProctrackType=proctrack/cgroup
There are no specific options for this plugin in cgroup.conf, but the general options apply. See the cgroup.conf man page for details.
task/cgroup plugin
The task/cgroup plugin is an alternative to other task plugins, such as task/affinity, for task management. task/cgroup provides the following features:
- The ability to confine jobs and steps to their allocated cpuset.
- The ability to bind tasks to sockets, cores and threads within their step's allocated cpuset on a node.
- Support for block and cyclic distribution of allocated CPUs to tasks for binding.
- The ability to confine jobs and steps to specific memory resources.
- The ability to confine jobs to their allocated set of generic resources (gres devices).
To enable this plugin, configure the following option in slurm.conf:
TaskPlugin=task/cgroup
There are many specific options for this plugin in cgroup.conf. The general options also apply. See the cgroup.conf man page for details.
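The block and cyclic distributions mentioned in the feature list can be illustrated generically. The sketch below is not SLURM's binding algorithm; it only shows, under the assumption of a flat list of CPU ids, how a block split (contiguous chunks) differs from a cyclic split (round-robin). Both function names are hypothetical.

```python
def distribute_block(cpus, ntasks):
    """Give each task a contiguous chunk of the CPU list."""
    base, extra = divmod(len(cpus), ntasks)
    result, start = [], 0
    for task in range(ntasks):
        # The first 'extra' tasks receive one additional CPU.
        size = base + (1 if task < extra else 0)
        result.append(cpus[start:start + size])
        start += size
    return result


def distribute_cyclic(cpus, ntasks):
    """Deal CPUs to tasks one at a time, round-robin."""
    return [cpus[task::ntasks] for task in range(ntasks)]
```

With eight CPUs and two tasks, the block split gives the first task CPUs 0-3 and the second CPUs 4-7, while the cyclic split gives the first task the even-numbered CPUs and the second the odd-numbered ones.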
jobacct_gather/cgroup plugin
At present, jobacct_gather/cgroup should be considered experimental.
The jobacct_gather/cgroup plugin is an alternative to the jobacct_gather/linux plugin for the collection of accounting statistics for jobs, steps and tasks. The cgroup plugin may provide improved performance over jobacct_gather/linux. jobacct_gather/cgroup uses the cpuacct and memory subsystems. Note: the CPU and memory statistics collected by this plugin do not represent the same resources as the CPU and memory statistics collected by the jobacct_gather/linux plugin (sourced from /proc stat).
To enable this plugin, configure the following option in slurm.conf:
JobacctGatherType=jobacct_gather/cgroup
There are no specific options for this plugin in cgroup.conf, but the general options apply. See the cgroup.conf man page for details.
Organization of SLURM Cgroups
SLURM cgroups are organized as follows. A base directory (mount point) is created at /cgroup, or as configured by the CgroupMountpoint option in cgroup.conf. All cgroup hierarchies are created below this base directory. A separate hierarchy is created for each cgroup subsystem in use. The name of the root cgroup in each hierarchy is the subsystem name. A cgroup named slurm is created below the root cgroup in each hierarchy. Below each slurm cgroup, cgroups for SLURM users, jobs, steps and tasks are created dynamically as needed. The names of these cgroups consist of a prefix identifying the SLURM entity (user, job, step or task), followed by the relevant numeric id.
The following example shows the path of the task cgroup in the cpuset hierarchy for taskid#2 of stepid#0 of jobid#123 for userid#100, using the default base directory (/cgroup):
/cgroup/cpuset/slurm/uid_100/job_123/step_0/task_2
Note that this structure applies to a specific compute node. Jobs that use more than one node will have a cgroup structure on each node.
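The naming scheme described in this section can be expressed as a small helper. This is an illustrative sketch only: the function slurm_cgroup_path and its parameters are hypothetical and are not part of SLURM. It simply assembles the path from the documented prefixes (uid_, job_, step_, task_).

```python
from pathlib import PurePosixPath


def slurm_cgroup_path(subsystem, uid, job_id, step_id=None, task_id=None,
                      mountpoint="/cgroup"):
    """Build the cgroup path for a SLURM user, job, step or task on one node."""
    if task_id is not None and step_id is None:
        raise ValueError("a task cgroup requires a step id")
    # Hierarchy root is named after the subsystem, followed by the slurm cgroup.
    parts = [subsystem, "slurm", f"uid_{uid}", f"job_{job_id}"]
    if step_id is not None:
        parts.append(f"step_{step_id}")
    if task_id is not None:
        parts.append(f"task_{task_id}")
    return str(PurePosixPath(mountpoint, *parts))


print(slurm_cgroup_path("cpuset", uid=100, job_id=123, step_id=0, task_id=2))
# prints /cgroup/cpuset/slurm/uid_100/job_123/step_0/task_2
```

Passing a different mountpoint value corresponds to setting CgroupMountpoint in cgroup.conf.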
Cleanup of SLURM Cgroups
Linux provides a mechanism for the automatic removal of a cgroup when its state changes from non-empty to empty. A cgroup is empty when no processes are attached to it and it has no child cgroups. The SLURM cgroups implementation allows this mechanism to be used to automatically remove the relevant SLURM cgroups when tasks, steps and jobs terminate. To enable this automatic removal feature, follow these steps:
- If desired, configure the location of the SLURM cgroup release agent directory. This is done using the CgroupReleaseAgentDir option in cgroup.conf. The default location is /etc/slurm/cgroup.
[sulu] (slurm) etc> cat cgroup.conf | grep CgroupReleaseAgentDir
CgroupReleaseAgentDir="/etc/slurm/cgroup"
[sulu] (slurm) etc> ls -al /etc/slurm/cgroup
total 12
drwxr-xr-x 2 root root 4096 2010-04-23 14:55 .
drwxr-xr-x 4 root root 4096 2010-07-22 14:48 ..
-rwxrwxrwx 1 root root  234 2010-04-23 14:52 release_common
lrwxrwxrwx 1 root root   32 2010-04-23 11:04 release_cpuset -> /etc/slurm/cgroup/release_common
lrwxrwxrwx 1 root root   32 2010-04-23 11:03 release_freezer -> /etc/slurm/cgroup/release_common
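The emptiness condition stated at the start of this section (no attached processes and no child cgroups) can be checked from the filesystem. The following is a rough sketch, assuming a cgroup v1 layout in which child cgroups appear as subdirectories and attached processes are listed in the tasks pseudofile. The function name cgroup_is_empty is hypothetical.

```python
from pathlib import Path


def cgroup_is_empty(cgroup_dir):
    """Return True if the cgroup has no attached processes and no child cgroups."""
    cgroup = Path(cgroup_dir)
    # Any PID listed in the tasks pseudofile means a process is still attached.
    has_tasks = bool((cgroup / "tasks").read_text().split())
    # Child cgroups are assumed to appear as subdirectories.
    has_children = any(entry.is_dir() for entry in cgroup.iterdir())
    return not has_tasks and not has_children
```

A job cgroup with a remaining step_ subdirectory would therefore not count as empty, even if its own tasks file lists no processes.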
Last modified 6 June 2012