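I could not be bothered to write this up myself, so here is a reposted article on cgroup usage, which is very detailed. For an analysis of the cgroup subsystems, see the two articles below.
cgroup subsystems: net_cls and net_prio
While introducing Docker, we mentioned that LXC uses cgroups to provide resource quotas and control. This article covers cgroup usage and the related commands. Most of the content comes from:
[1]https://access.redhat.com/sit...
[2]https://www.kernel.org/doc/Do...
cgroups partition the resources of a single machine (CPU, memory, network) to prevent harmful resource contention between processes.
Terminology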
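A hierarchy combines subsystem(s) and a tree-structured set of cgroups. Unlike a cgroup, a hierarchy contains the subsystems that can be managed rather than concrete parameter values.
It follows that cgroups manage resources as a tree structure, much like processes.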
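Similarity - both are layered structures; a child process/cgroup inherits from its parent process/cgroup
Difference - processes form a single-rooted tree (rooted at pid=0), whereas cgroups taken as a whole form a forest of several trees (each rooted at a hierarchy).
A typical hierarchy mount directory looks like this: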
/cgroup/
├── blkio                        <--------------- hierarchy/root cgroup
│   ├── blkio.io_merged          <--------------- subsystem parameter
... ...
│   ├── blkio.weight
│   ├── blkio.weight_device
│   ├── cgroup.event_control
│   ├── cgroup.procs
│   ├── lxc                      <--------------- cgroup
│   │   ├── blkio.io_merged      <--------------- subsystem parameter
│   │   ├── blkio.io_queued
... ... ...
│   │   └── tasks                <--------------- task list
│   ├── notify_on_release
│   ├── release_agent
│   └── tasks
...
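Since every cgroup in a hierarchy is just a directory under the mount point, the tree can be walked with ordinary tools. The following is a minimal sketch, not part of the original article: the function name `list_cgroups` and the scratch directory are assumptions, and the scratch directory only imitates a mounted hierarchy.

```shell
#!/bin/sh
# List every cgroup (directory) below a hierarchy root, relative to that root.
# The root cgroup itself is printed as "/".
list_cgroups() {
    ( cd "$1" && find . -type d ) | sed -e 's|^\.||' -e 's|^$|/|' | sort
}

# Demo against a scratch directory that imitates /cgroup/blkio (an assumption;
# on a real system, pass the actual mount point instead).
demo=$(mktemp -d)
mkdir -p "$demo/blkio/lxc"
: > "$demo/blkio/tasks"
: > "$demo/blkio/lxc/tasks"
list_cgroups "$demo/blkio"
rm -rf "$demo"
```

Regular files such as tasks or blkio.weight are skipped because they are parameters, not cgroups.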
Subsystem list
RHEL/CentOS support the following subsystems
1. A hierarchy can have multiple subsystems (several subsystems can be attached to a hierarchy at mount time)
A single hierarchy can have one or more subsystems attached to it.
eg.
mount -t cgroup -o cpu,cpuset,memory cpu_and_mem /cgroup/cpu_and_mem
2. A subsystem that is already mounted can only be mounted again on an empty hierarchy (a hierarchy that already has a subsystem attached cannot attach a subsystem that another hierarchy has already mounted)
Any single subsystem (such as cpu) cannot be attached to more than one hierarchy if one of those hierarchies has a different subsystem attached to it already.
3. Each task can be in exactly one cgroup of a given hierarchy (its pid cannot appear in the tasks file of more than one cgroup under the same hierarchy)
Each time a new hierarchy is created on the system, all tasks on the system are initially members of the default cgroup of that hierarchy, which is known as the root cgroup. For any single hierarchy you create, each task on the system can be a member of exactly one cgroup in that hierarchy. A single task may be in multiple cgroups, as long as each of those cgroups is in a different hierarchy. As soon as a task becomes a member of a second cgroup in the same hierarchy, it is removed from the first cgroup in that hierarchy. At no time is a task ever in two different cgroups in the same hierarchy.
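This rule is visible in /proc/<PID>/cgroup, which has one line per hierarchy in the form hierarchy-ID:subsystem-list:cgroup-path. The sketch below looks up the single path a task occupies for a given subsystem. The function name `cgroup_for_subsystem` and the sample data are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Print the cgroup path a task occupies in the hierarchy that carries the
# given subsystem. Input lines follow the /proc/<PID>/cgroup layout:
#   hierarchy-ID:subsystem-list:cgroup-path
cgroup_for_subsystem() {
    awk -F: -v want="$1" '{
        list = $2
        path = $0
        sub(/^[^:]*:[^:]*:/, "", path)   # keep everything after the 2nd colon
        n = split(list, subs, ",")
        for (i = 1; i <= n; i++)
            if (subs[i] == want) { print path; exit }
    }'
}

# Sample data (hypothetical); on a real system use: cat /proc/<PID>/cgroup
sample='3:cpu,cpuacct:/group1
2:memory:/group1
1:blkio:/lxc'

printf '%s\n' "$sample" | cgroup_for_subsystem cpuacct
printf '%s\n' "$sample" | cgroup_for_subsystem blkio
```

Each hierarchy contributes exactly one line, so a subsystem can never resolve to two paths for the same task.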
4. A child process automatically inherits its parent's cgroup when it is forked, but can be moved to another cgroup afterwards as needed
Any process (task) on the system which forks itself creates a child task. A child task automatically inherits the cgroup membership of its parent but can be moved to different cgroups as needed. Once forked, the parent and child processes are completely independent.
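Inheritance on fork can be checked by comparing the cgroup membership of a shell with that of a child it starts. This is a rough sketch that assumes a Linux-style /proc; the function name `compare_with_child` is made up for this example.

```shell
#!/bin/sh
# Compare the cgroup membership of this shell with that of a freshly
# forked child. Prints "same", "different", or "unavailable".
compare_with_child() {
    if [ ! -r "/proc/$$/cgroup" ]; then
        echo unavailable   # no Linux-style /proc on this system
        return 0
    fi
    parent=$(cat "/proc/$$/cgroup")
    # The inner sh is a child process; inside it, $$ expands to the child's PID.
    child=$(sh -c 'cat "/proc/$$/cgroup"')
    if [ "$parent" = "$child" ]; then echo same; else echo different; fi
}

compare_with_child
```

A result of "different" would normally only appear if something moved the child between the fork and the read.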
5. Other
1. Mounting subsystems
Using the cgconfig service and its configuration file /etc/cgconfig.conf
- mounted automatically when the service starts
subsystem = /cgroup/hierarchy;
Command line
mount -t cgroup -o subsystems name /cgroup/name
Unmounting
umount /cgroup/name
eg. Mount the four subsystems cpuset, cpu, cpuacct, and memory on the /cgroup/cpu_and_mem directory (hierarchy)
mount {
    cpuset = /cgroup/cpu_and_mem;
    cpu = /cgroup/cpu_and_mem;
    cpuacct = /cgroup/cpu_and_mem;
    memory = /cgroup/cpu_and_mem;
}
or
mount -t cgroup -o cpuset,cpu,cpuacct,memory cpu_and_mem /cgroup/cpu_and_mem
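The configuration-file form and the command-line form carry the same information: one hierarchy directory plus a list of subsystems. As an illustration, the sketch below prints both forms from the same arguments without mounting anything. The helper names `gen_mount_block` and `gen_mount_cmd` are hypothetical.

```shell
#!/bin/sh
# Print a cgconfig.conf "mount" block that attaches every listed subsystem
# to one hierarchy directory. Nothing is mounted; the text is only printed.
gen_mount_block() {
    dir=$1
    shift
    echo "mount {"
    for subsys in "$@"; do
        printf '    %s = %s;\n' "$subsys" "$dir"
    done
    echo "}"
}

# Print the matching mount command (dry run only).
gen_mount_cmd() {
    name=$1
    dir=$2
    shift 2
    opts=$(IFS=,; echo "$*")   # join the remaining arguments with commas
    echo "mount -t cgroup -o $opts $name $dir"
}

gen_mount_block /cgroup/cpu_and_mem cpuset cpu cpuacct memory
gen_mount_cmd cpu_and_mem /cgroup/cpu_and_mem cpuset cpu cpuacct memory
```

Running the printed mount command for real requires root and an existing mount directory.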
2. Creating/deleting cgroups
Using the cgconfig service and its configuration file /etc/cgconfig.conf
- applied automatically when the service starts
group <name> {
    [<permissions>]
    <controller> {
        <param name> = <param value>;
        …
    }
    …
}
Command line
cgcreate -t uid:gid -a uid:gid -g subsystems:path
mkdir /cgroup/hierarchy/name/child_name
cgdelete subsystems:path (use -r to delete recursively)
rmdir /cgroup/hierarchy/name/child_name (when the cgconfig service is not running)
3. Permission management
Using the cgconfig service and its configuration file /etc/cgconfig.conf
- applied automatically when the service starts
perm {
    task {
        uid = <task user>;
        gid = <task group>;
    }
    admin {
        uid = <admin name>;
        gid = <admin group>;
    }
}
chown
eg.
group daemons {
    cpuset {
        cpuset.mems = 0;
        cpuset.cpus = 0;
    }
}
group daemons/sql {
    perm {
        task {
            uid = root;
            gid = sqladmin;
        }
        admin {
            uid = root;
            gid = root;
        }
    }
    cpuset {
        cpuset.mems = 0;
        cpuset.cpus = 0;
    }
}
or
~]$ mkdir -p /cgroup/red/daemons/sql
~]$ chown root:root /cgroup/red/daemons/sql/*
~]$ chown root:sqladmin /cgroup/red/daemons/sql/tasks
~]$ echo 0 > /cgroup/red/daemons/cpuset.mems
~]$ echo 0 > /cgroup/red/daemons/cpuset.cpus
~]$ echo 0 > /cgroup/red/daemons/sql/cpuset.mems
~]$ echo 0 > /cgroup/red/daemons/sql/cpuset.cpus
4. Setting cgroup parameters
cgset -r parameter=value path_to_cgroup
cgset --copy-from path_to_source_cgroup path_to_target_cgroup
echo value > path_to_cgroup/parameter
eg.
cgset -r cpuset.cpus=0-1 group1
cgset --copy-from group1/ group2/
echo 0-1 > /cgroup/cpuset/group1/cpuset.cpus
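A value such as 0-1 for cpuset.cpus is a list of CPU numbers in which ranges are written with a dash and items are separated by commas. To make the format concrete, here is a small sketch that expands such a list into individual CPU numbers. The function name `expand_cpu_list` is an assumption, and the sketch does no input validation.

```shell
#!/bin/sh
# Expand a cpuset-style list such as "0-2,5" into individual CPU numbers.
# The body runs in a subshell so the IFS change does not leak to the caller.
expand_cpu_list() (
    out=""
    IFS=,
    for part in $1; do
        case $part in
            *-*) lo=${part%-*}; hi=${part#*-} ;;
            *)   lo=$part; hi=$part ;;
        esac
        i=$lo
        while [ "$i" -le "$hi" ]; do
            out="$out${out:+ }$i"
            i=$((i + 1))
        done
    done
    echo "$out"
)

expand_cpu_list 0-1      # prints: 0 1
expand_cpu_list 0-2,5    # prints: 0 1 2 5
```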
5. Adding tasks
cgclassify -g subsystems:path_to_cgroup pidlist
echo pid > path_to_cgroup/tasks
cgexec -g subsystems:path_to_cgroup command arguments
echo 'CGROUP_DAEMON="subsystem:control_group"' >> /etc/sysconfig/
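Initialized by the cgrulesengd service, using the configuration file /etc/cgrules.conf
user<:command> subsystems control_group
where:
+ all processes of user are limited, for the listed subsystems, by the group control_group
+ <:command> is optional and applies the limit only to a specific command
+ user can be written as @group to refer to a user group rather than a user
+ * can be used to mean all
+ % means the same value as that field in the previous line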
eg.
cgclassify -g cpu,memory:group1 1701 1138
echo -e "1701\n1138" |tee -a /cgroup/cpu/group1/tasks /cgroup/memory/group1/tasks
cgexec -g cpu:group1 lynx http://www.redhat.com
sh -c "echo \$$ > /cgroup/lab1/group1/tasks && lynx http://www.redhat.com"
Restricting specific services through /etc/cgrules.conf
maria devices /usergroup/staff
maria:ftp devices /usergroup/staff/ftp
@student cpu,memory /usergroup/student/
% memory /test2/
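To show how such rules might be evaluated, here is a simplified matcher. It is only a sketch under stated assumptions: it applies the first matching line, it handles plain users, user:command, and the * wildcard, and it skips @group and % lines entirely. The function name `match_rule` and the sample rules are hypothetical, and the real cgrulesengd may behave differently in details.

```shell
#!/bin/sh
# Print the "subsystems control_group" pair of the first rule on stdin that
# matches a user (and, if the rule names one, a command).
# Usage: match_rule USER COMMAND < rules
match_rule() {
    awk -v user="$1" -v cmd="$2" '
        /^[ \t]*(#|$)/ { next }   # skip comments and blank lines
        $1 ~ /^[@%]/   { next }   # @group and % are not handled in this sketch
        {
            n = split($1, who, ":")
            if (who[1] != user && who[1] != "*") next
            if (n > 1 && who[2] != cmd) next
            print $2, $3
            exit
        }'
}

# Sample rules (hypothetical). The more specific rule is listed first
# because this sketch stops at the first match.
rules='maria:ftp   devices   /usergroup/staff/ftp
maria       devices   /usergroup/staff
*           memory    /test2/'

printf '%s\n' "$rules" | match_rule maria ftp
printf '%s\n' "$rules" | match_rule maria vim
printf '%s\n' "$rules" | match_rule bob sh
```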
6. Other
cgsnapshot generates the contents of an /etc/cgconfig.conf file from the current cgroup state
cgsnapshot [-s] [-b FILE] [-w FILE] [-f FILE] [controller]
-b, --blacklist=FILE Set the blacklist configuration file (default /etc/cgsnapshot_blacklist.conf)
-f, --file=FILE Redirect the output to output_file
-s, --silent Ignore all warnings
-t, --strict Don't show the variables which are not on the whitelist
-w, --whitelist=FILE Set the whitelist configuration file (don't used by default)
Finding which cgroup a process belongs to
ps -O cgroup
cat /proc/<PID>/cgroup
Viewing subsystem mount status
cat /proc/cgroups
lssubsys -m <subsystems>
lscgroup
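The output of /proc/cgroups has four columns: subsys_name, hierarchy, num_cgroups, and enabled. A short filter can pull out the enabled subsystems and their hierarchy IDs. The sketch below runs on sample text so it does not depend on the host; the function name `enabled_subsystems` and the sample values are made up.

```shell
#!/bin/sh
# List enabled subsystems with their hierarchy IDs from /proc/cgroups-style
# input. Columns: subsys_name hierarchy num_cgroups enabled
enabled_subsystems() {
    awk '!/^#/ && $4 == 1 { print $1, $2 }'
}

# Sample data (hypothetical); on a real system use: cat /proc/cgroups
sample='#subsys_name hierarchy num_cgroups enabled
cpuset 1 4 1
cpu 2 4 1
memory 0 1 0'

printf '%s\n' "$sample" | enabled_subsystems
```

Subsystems that report the same non-zero hierarchy ID are attached to the same hierarchy.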
Viewing cgroup parameter values
cgget -r parameter list_of_cgroups
cgget -g <controllers>:<path>
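Because every parameter is a file inside the cgroup's directory, values can also be read without cgget. This is a minimal sketch that prints the first line of each parameter file. The function name `dump_params` is an assumption, and the demo uses a scratch directory rather than a real cgroup.

```shell
#!/bin/sh
# Print "name: value" for every readable regular file in a cgroup directory.
# Only the first line of each file is shown.
dump_params() {
    for f in "$1"/*; do
        [ -f "$f" ] && [ -r "$f" ] || continue
        printf '%s: %s\n' "${f##*/}" "$(head -n 1 "$f" 2>/dev/null)"
    done
}

# Demo against a scratch directory standing in for /cgroup/cpuset/group1.
demo=$(mktemp -d)
echo 0-1 > "$demo/cpuset.cpus"
echo 0 > "$demo/cpuset.mems"
dump_params "$demo"
rm -rf "$demo"
```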
More
common
device_types:node_numbers milliseconds
device_types:node_numbers sector_count
(CONFIG_DEBUG_BLK_CGROUP=y)
(CONFIG_DEBUG_BLK_CGROUP=y, unit: ns)
(CONFIG_DEBUG_BLK_CGROUP=y, unit: ns)
(CONFIG_DEBUG_BLK_CGROUP=y) - device_types:node_numbers number
device_types:node_numbers operation number
device_types:node_numbers operation bytes
device_types:node_numbers operation time
device_types:node_numbers operation time
number operation
number operation
Proportional weight division policy - allocates block I/O resources proportionally
I/O throttling (Upper limit) policy - sets an upper limit on I/O operations
device_types:node_numbers bytes_per_second
blkio.throttle.write_bps_device - device_types:node_numbers bytes_per_second
device_types:node_numbers operations_per_second
blkio.throttle.write_iops_device - device_types:node_numbers operations_per_second
device_types:node_numbers operation operations_per_second
blkio.throttle.io_service_bytes - device_types:node_numbers operation bytes_per_second
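A throttle entry of the form device_types:node_numbers bytes_per_second is the device's major:minor pair followed by a plain byte count. The sketch below builds such a line from a limit given in MiB per second. The function name `throttle_entry` is hypothetical, and 8:0 is used only as a sample device number.

```shell
#!/bin/sh
# Build a "major:minor bytes_per_second" line for a bps throttle file.
# $1 = major number, $2 = minor number, $3 = limit in MiB per second
throttle_entry() {
    printf '%s:%s %s\n' "$1" "$2" $(( $3 * 1024 * 1024 ))
}

throttle_entry 8 0 10    # prints: 8:0 10485760
```

The printed line is what would be written to a file such as blkio.throttle.write_bps_device in the target cgroup.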
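CFS (Completely Fair Scheduler) policy - maximum CPU resource limit
RT (Real-Time scheduler) policy - minimum CPU resource limit
Used together, the two parameters specify that the tasks in a cgroup are certain to run for cpu.rt_runtime_us (microseconds) within every cpu.rt_period_us (microseconds)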
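The share of CPU time this describes is simply cpu.rt_runtime_us divided by cpu.rt_period_us. As a quick worked example, the sketch below turns the pair into a whole-number percentage. The function name `rt_share_percent` and the sample values are assumptions.

```shell
#!/bin/sh
# Percentage of each period covered by the real-time runtime.
# $1 = cpu.rt_runtime_us, $2 = cpu.rt_period_us (integer division)
rt_share_percent() {
    echo $(( $1 * 100 / $2 ))
}

rt_share_percent 200000 1000000    # prints: 20
```

With a period of 1000000 microseconds and a runtime of 200000 microseconds, the cgroup's real-time tasks get 0.2 seconds out of every second.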
cpuset.sched_relax_domain_level - optional - the policy used by cpuset.sched_load_balance
Device blacklist/whitelist
memory.stat - reports the status of the cgroup's limits
<network_interface> <priority>