Kubernetes Scheduling: Resource Quotas

Date: 2019-11-11

Tags: kubernetes, scheduling, resource quotas

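When several users or teams share a Kubernetes cluster with a fixed number of nodes, one team or user may consume more than its fair share of resources. Resource quotas are the tool administrators use to address this.

A resource quota, defined by a ResourceQuota object, constrains aggregate resource consumption in a namespace. It can limit how many objects of each type may be created in the namespace, and it can limit the total amount of compute resources consumed there (as mentioned earlier in this series, compute resources include CPU, memory, disk space and so on).

Resource quotas work roughly as follows:

Different teams work in different namespaces. Kubernetes does not currently enforce this; it is entirely voluntary, although the Kubernetes team plans to make it mandatory through ACL authorization.
The administrator creates one ResourceQuota for each namespace.
Users create resources (pods, services and so on) in a namespace, and the quota system tracks usage to make sure it does not exceed the amounts defined in the ResourceQuota.
If creating or updating a resource violates a quota constraint, the request fails with HTTP status code 403 FORBIDDEN and a message explaining which constraint was violated.
If quota is enabled in a namespace for compute resources such as CPU and memory, users must specify requests or limits for those resources; otherwise the quota system may reject pod creation.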
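
The flow in the list above can be modelled in a few lines. This is a toy model, not the Kubernetes admission code: the function name `admit`, the dictionary layout, and the 201 success status are assumptions made for illustration. Only the 403 rejection with an explanatory message comes from the text.

```python
from typing import Dict, Tuple

def admit(hard: Dict[str, int], used: Dict[str, int],
          requested: Dict[str, int]) -> Tuple[int, str]:
    """Return (http_status, message) for a create request in one namespace.

    Only resources named in `hard` are constrained; anything else passes.
    """
    for name, limit in hard.items():
        delta = requested.get(name, 0)
        current = used.get(name, 0)
        if delta > 0 and current + delta > limit:
            # Reject and explain which constraint would be violated.
            return 403, (f"exceeded quota: requested {name}={delta}, "
                         f"used {name}={current}, limited {name}={limit}")
    # Charge the request against the quota only after every check passed.
    for name in hard:
        used[name] = used.get(name, 0) + requested.get(name, 0)
    return 201, "created"  # 201 is an assumption of this sketch

hard = {"pods": 2, "requests.cpu": 1000}  # CPU in millicores
used: Dict[str, int] = {}
print(admit(hard, used, {"pods": 1, "requests.cpu": 600}))
print(admit(hard, used, {"pods": 1, "requests.cpu": 600}))
```

The second call is rejected because 600m + 600m exceeds the 1000m CPU cap, even though the pod count is still within its limit.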

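Examples of policies that can be built with namespaces and quotas:

In a cluster with 32 GiB of memory and 16 CPU cores, let team A use 20 GiB and 10 cores, let team B use 10 GiB and 4 cores, and hold the remaining 2 GiB and 2 cores in reserve for future allocation.
Limit the testing namespace to 1 core and 1 GiB of memory, and let the production namespace use all remaining resources.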
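
As a quick sanity check on the first policy, the allocations can be summed and compared with the cluster capacity. The names `team-a`, `team-b`, and `reserve` are hypothetical labels for this sketch.

```python
# Capacity and allocations taken from the example policy above.
capacity = {"memory_gib": 32, "cpu_cores": 16}
allocations = {
    "team-a": {"memory_gib": 20, "cpu_cores": 10},
    "team-b": {"memory_gib": 10, "cpu_cores": 4},
    "reserve": {"memory_gib": 2, "cpu_cores": 2},
}

def total(resource: str) -> int:
    """Sum one resource across every allocation."""
    return sum(a[resource] for a in allocations.values())

for resource, cap in capacity.items():
    # A plan whose total exceeds capacity would lead to contention.
    assert total(resource) <= cap, f"{resource} is overcommitted"
    print(resource, total(resource), "of", cap)
```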

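When the cluster's capacity is smaller than the sum of the quotas across all namespaces, there may be contention for resources. Kubernetes handles this on a first-come, first-served basis.

Neither contention nor a change to a quota affects resources that have already been created.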

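Enabling Resource Quota

Resource quota support is enabled by default in many Kubernetes distributions. It is enabled when ResourceQuota is one of the values of the apiserver flag --enable-admission-plugins=.

A resource quota takes effect in a namespace when that namespace contains a ResourceQuota object.

Compute Resource Quota

You can limit the total amount of compute resources that can be requested in a namespace.

Kubernetes supports the following resource types: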

Resource Name	Description
cpu	Across all pods in a non-terminal state, the sum of CPU requests cannot exceed this value.
limits.cpu	Across all pods in a non-terminal state, the sum of CPU limits cannot exceed this value.
limits.memory	Across all pods in a non-terminal state, the sum of memory limits cannot exceed this value.
memory	Across all pods in a non-terminal state, the sum of memory requests cannot exceed this value.
requests.cpu	Across all pods in a non-terminal state, the sum of CPU requests cannot exceed this value.
requests.memory	Across all pods in a non-terminal state, the sum of memory requests cannot exceed this value.
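
To make the phrase "across all pods in a non-terminal state" concrete, the sketch below sums requests and limits over a list of pods and skips those whose phase is Failed or Succeeded. The pod dictionaries are a simplified stand-in for the real Pod object, and the two parsers are assumptions that handle only the quantity forms used in this article (such as "500m", "2", and "10Gi").

```python
def parse_cpu(q: str) -> int:
    """Convert a CPU quantity such as "500m" or "2" to millicores."""
    return int(q[:-1]) if q.endswith("m") else int(float(q) * 1000)

_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}

def parse_memory(q: str) -> int:
    """Convert a memory quantity such as "10Gi" to bytes (binary suffixes only)."""
    for suffix, factor in _BINARY.items():
        if q.endswith(suffix):
            return int(q[:-2]) * factor
    return int(q)

TERMINAL = {"Failed", "Succeeded"}

def namespace_usage(pods):
    """Sum CPU (millicores) and memory (bytes) over non-terminal pods."""
    usage = {"requests.cpu": 0, "requests.memory": 0,
             "limits.cpu": 0, "limits.memory": 0}
    for pod in pods:
        if pod["phase"] in TERMINAL:
            continue  # terminal pods are not charged against the quota
        for c in pod["containers"]:
            for kind in ("requests", "limits"):
                res = c.get(kind, {})
                usage[f"{kind}.cpu"] += parse_cpu(res.get("cpu", "0"))
                usage[f"{kind}.memory"] += parse_memory(res.get("memory", "0"))
    return usage

pods = [
    {"phase": "Running", "containers": [
        {"requests": {"cpu": "500m", "memory": "10Gi"},
         "limits": {"cpu": "500m", "memory": "10Gi"}}]},
    {"phase": "Succeeded", "containers": [{"requests": {"cpu": "2"}}]},
]
usage = namespace_usage(pods)
print(usage["requests.cpu"], usage["requests.memory"])
```

Only the running pod is counted, so the finished pod's request for 2 cores does not appear in the totals.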

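Resource Quota for Extended Resources

In addition to the resources listed above, Kubernetes 1.10 added quota support for extended resources.

Storage Resource Quota

You can limit the total amount of storage that can be requested in a namespace.

You can also limit consumption of storage resources based on the associated storage class.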

Resource Name	Description
requests.storage	Across all persistent volume claims, the sum of storage requests cannot exceed this value.
persistentvolumeclaims	The total number of persistent volume claims that can exist in the namespace.
<storage-class-name>.storageclass.storage.k8s.io/requests.storage	Across all persistent volume claims associated with the storage-class-name, the sum of storage requests cannot exceed this value.
<storage-class-name>.storageclass.storage.k8s.io/persistentvolumeclaims	Across all persistent volume claims associated with the storage-class-name, the total number of persistent volume claims that can exist in the namespace.

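For example, if an operator wants to apply quota to storage in the gold storage class separately from the bronze storage class, the operator can define the quota as follows: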

gold.storageclass.storage.k8s.io/requests.storage: 500Gi
bronze.storageclass.storage.k8s.io/requests.storage: 100Gi
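
The per-class keys above can be read as "sum the storage requests of every claim that uses this class". The sketch below models that accounting. The claim dictionaries and the helper names are hypothetical, and sizes are plain integers in Gi to keep the example short.

```python
from collections import defaultdict

SUFFIX = ".storageclass.storage.k8s.io/requests.storage"

def storage_usage_gi(claims):
    """Sum requested storage (in Gi) per quota key for a list of claims."""
    usage = defaultdict(int)
    for claim in claims:
        usage["requests.storage"] += claim["gi"]  # namespace-wide total
        if claim.get("storage_class"):
            usage[claim["storage_class"] + SUFFIX] += claim["gi"]
    return dict(usage)

def violations(hard, usage):
    """Return the quota keys whose usage exceeds the hard limit."""
    return sorted(k for k, limit in hard.items() if usage.get(k, 0) > limit)

hard = {"gold" + SUFFIX: 500, "bronze" + SUFFIX: 100}
claims = [
    {"name": "db", "storage_class": "gold", "gi": 400},
    {"name": "logs", "storage_class": "bronze", "gi": 120},
]
usage = storage_usage_gi(claims)
print(violations(hard, usage))
```

Here the bronze class is over its 100Gi cap while gold is still within 500Gi, so only the bronze key is reported.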

In release 1.8, quota support for local ephemeral storage was added as an alpha feature.

Resource Name	Description
requests.ephemeral-storage	Across all pods in the namespace, the sum of local ephemeral storage requests cannot exceed this value.
limits.ephemeral-storage	Across all pods in the namespace, the sum of local ephemeral storage limits cannot exceed this value.

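Object Count Quota

Release 1.9 added quota support for all standard namespaced resource types, using the following syntax:

count/<resource>.<group>

Here are some examples of resources users may want to put under object count quota: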

count/persistentvolumeclaims
count/services
count/secrets
count/configmaps
count/replicationcontrollers
count/deployments.apps
count/replicasets.apps
count/statefulsets.apps
count/jobs.batch
count/cronjobs.batch
count/deployments.extensions
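
The keys in this list follow one pattern: resources in the core API group have no group suffix, while others append the group name. A small sketch of that rule follows. The helpers `count_key` and `parse_hard` are hypothetical names, and `parse_hard` only mimics the comma-separated key=value form that appears with kubectl create quota later in this article.

```python
def count_key(resource: str, group: str = "") -> str:
    """Build an object-count quota key; the core API group has an empty name."""
    return f"count/{resource}.{group}" if group else f"count/{resource}"

def parse_hard(spec: str) -> dict:
    """Parse a comma-separated "key=value" list into a dict of integer limits."""
    return {k: int(v) for k, v in (item.split("=", 1) for item in spec.split(","))}

print(count_key("secrets"))              # count/secrets
print(count_key("deployments", "apps"))  # count/deployments.apps
print(parse_hard("count/pods=3,count/secrets=4"))
```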

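With count/* quotas, an object is charged against the quota as long as it exists in server storage. This helps protect against exhaustion of storage on the server. For example, because secrets can be large, you may want to limit how many of them are stored; too many secrets can even prevent the server from starting. You may also want to limit the number of jobs, so that a poorly designed cron job cannot create so many jobs that it causes a denial of service.

Quota on the following resource types is supported: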

Resource Name	Description
configmaps	The total number of config maps that can exist in the namespace.
persistentvolumeclaims	The total number of persistent volume claims that can exist in the namespace.
pods	The total number of pods in a non-terminal state that can exist in the namespace. A pod is in a terminal state if .status.phase in (Failed, Succeeded) is true.
replicationcontrollers	The total number of replication controllers that can exist in the namespace.
resourcequotas	The total number of resource quotas that can exist in the namespace.
services	The total number of services that can exist in the namespace.
services.loadbalancers	The total number of services of type load balancer that can exist in the namespace.
services.nodeports	The total number of services of type node port that can exist in the namespace.
secrets	The total number of secrets that can exist in the namespace.

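For example, the pods quota caps the total number of pods in a non-terminal state in a namespace. This prevents a user from creating so many small pods that the cluster runs out of pod IPs.

Quota Scopes

Each quota can have an associated set of scopes. A quota measures usage for a resource only if the resource matches the intersection of the listed scopes.

When a scope is added to a quota, the quota is limited to the resources that the scope supports. Specifying a resource outside the supported set results in a validation error.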

Scope	Description
Terminating	Match pods where .spec.activeDeadlineSeconds >= 0
NotTerminating	Match pods where .spec.activeDeadlineSeconds is nil
BestEffort	Match pods that have best effort quality of service.
NotBestEffort	Match pods that do not have best effort quality of service.

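The BestEffort scope restricts a quota to tracking only the pods resource.

The Terminating, NotTerminating, and NotBestEffort scopes restrict a quota to tracking the following resources: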

cpu
limits.cpu
limits.memory
memory
pods
requests.cpu
requests.memory
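
The scope table and the "intersection" rule can be expressed as predicates over a pod. In this sketch the pod is a simplified dictionary rather than the real Pod object, and `is_best_effort` is a simplified stand-in for the QoS classification that treats a pod as best effort when no container sets any request or limit.

```python
def is_best_effort(pod) -> bool:
    """Simplified QoS check: no container sets any request or limit."""
    return all(not c.get("requests") and not c.get("limits")
               for c in pod["containers"])

SCOPES = {
    "Terminating": lambda p: p.get("activeDeadlineSeconds") is not None
                             and p["activeDeadlineSeconds"] >= 0,
    "NotTerminating": lambda p: p.get("activeDeadlineSeconds") is None,
    "BestEffort": is_best_effort,
    "NotBestEffort": lambda p: not is_best_effort(p),
}

def quota_tracks(scopes, pod) -> bool:
    """A quota measures a pod only if the pod matches every listed scope."""
    return all(SCOPES[s](pod) for s in scopes)

batch_pod = {"activeDeadlineSeconds": 600,
             "containers": [{"requests": {"cpu": "100m"}}]}
idle_pod = {"containers": [{}]}

print(quota_tracks(["Terminating", "NotBestEffort"], batch_pod))  # True
print(quota_tracks(["BestEffort"], batch_pod))                    # False
print(quota_tracks(["NotTerminating", "BestEffort"], idle_pod))   # True
```

A quota with no scopes matches every pod in this model, because `all` over an empty list is true.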

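Resource Quota Per PriorityClass

This feature is beta in release 1.12.

Pods can be created at a specific priority. You can control a pod's consumption of system resources based on its priority by using the scopeSelector field in the quota spec.

A quota is matched and consumed only if the scopeSelector in the quota spec selects the pod.

Before using quotas per PriorityClass, you need to enable the ResourceQuotaScopeSelectors feature gate.

The following example creates quota objects and matches them with pods at specific priorities.

Pods in the cluster have one of three priority classes: low, medium, or high.
One quota object is created for each priority class.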

apiVersion: v1
kind: List
items:
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pods-high
  spec:
    hard:
      cpu: "1000"
      memory: 200Gi
      pods: "10"
    scopeSelector:
      matchExpressions:
      - operator : In
        scopeName: PriorityClass
        values: ["high"]
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pods-medium
  spec:
    hard:
      cpu: "10"
      memory: 20Gi
      pods: "10"
    scopeSelector:
      matchExpressions:
      - operator : In
        scopeName: PriorityClass
        values: ["medium"]
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pods-low
  spec:
    hard:
      cpu: "5"
      memory: 10Gi
      pods: "10"
    scopeSelector:
      matchExpressions:
      - operator : In
        scopeName: PriorityClass
        values: ["low"]

Apply the YAML above with kubectl create:

kubectl create -f ./quota.yml
resourcequota/pods-high created
resourcequota/pods-medium created
resourcequota/pods-low created

Inspect the quotas with kubectl describe quota:

kubectl describe quota
Name:       pods-high
Namespace:  default
Resource    Used  Hard
--------    ----  ----
cpu         0     1k
memory      0     200Gi
pods        0     10


Name:       pods-low
Namespace:  default
Resource    Used  Hard
--------    ----  ----
cpu         0     5
memory      0     10Gi
pods        0     10


Name:       pods-medium
Namespace:  default
Resource    Used  Hard
--------    ----  ----
cpu         0     10
memory      0     20Gi
pods        0     10

Create a pod with the high priority class. Save the following to high-priority-pod.yml:

apiVersion: v1
kind: Pod
metadata:
  name: high-priority
spec:
  containers:
  - name: high-priority
    image: ubuntu
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo hello; sleep 10;done"]
    resources:
      requests:
        memory: "10Gi"
        cpu: "500m"
      limits:
        memory: "10Gi"
        cpu: "500m"
  priorityClassName: high

Apply it with kubectl create:

kubectl create -f ./high-priority-pod.yml

Now run kubectl describe quota again. Only the pods-high quota shows any usage:

Name:       pods-high
Namespace:  default
Resource    Used  Hard
--------    ----  ----
cpu         500m  1k
memory      10Gi  200Gi
pods        1     10


Name:       pods-low
Namespace:  default
Resource    Used  Hard
--------    ----  ----
cpu         0     5
memory      0     10Gi
pods        0     10


Name:       pods-medium
Namespace:  default
Resource    Used  Hard
--------    ----  ----
cpu         0     10
memory      0     20Gi
pods        0     10

scopeSelector supports the following values in the operator field:

In
NotIn
Exists
DoesNotExist

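Requests vs. Limits

When allocating compute resources, each container may specify a request and a limit for CPU or memory. A quota can be configured to constrain either value.

In other words, each entry in a quota constrains either requests or limits. A single ResourceQuota can still contain entries of both kinds, as the compute-resources example in the next section does with requests.cpu and limits.cpu.

If the quota specifies requests.cpu or requests.memory, every container it applies to must explicitly specify a request for those resources. If the quota specifies limits.cpu or limits.memory, every container it applies to must explicitly specify a limit for those resources.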
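
The rule in the previous paragraph can be sketched as a check that lists what a pod is missing under a given quota. The function name and the pod dictionary are hypothetical, and the sketch ignores any defaulting that the cluster might apply before the quota is evaluated.

```python
def missing_fields(hard, pod):
    """List the container fields a pod must set explicitly under this quota."""
    missing = []
    for key in hard:
        kind, _, resource = key.partition(".")
        # Only requests.* and limits.* entries for cpu/memory impose this rule.
        if kind not in ("requests", "limits") or resource not in ("cpu", "memory"):
            continue
        for c in pod["containers"]:
            if resource not in c.get(kind, {}):
                missing.append(f"{c['name']}: {key}")
    return missing

hard = {"requests.cpu": "1", "limits.memory": "2Gi", "pods": "4"}
pod = {"containers": [{"name": "app", "requests": {"cpu": "500m"}}]}
print(missing_fields(hard, pod))  # ['app: limits.memory']
```

The container sets a CPU request, which satisfies requests.cpu, but it has no memory limit, so limits.memory is reported.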

Viewing and Setting Quotas

kubectl supports creating, updating, and viewing quotas:

kubectl create namespace myspace

cat <<EOF > compute-resources.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources
spec:
  hard:
    pods: "4"
    requests.cpu: "1"
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 2Gi
    requests.nvidia.com/gpu: 4
EOF

kubectl create -f ./compute-resources.yaml --namespace=myspace

cat <<EOF > object-counts.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-counts
spec:
  hard:
    configmaps: "10"
    persistentvolumeclaims: "4"
    replicationcontrollers: "20"
    secrets: "10"
    services: "10"
    services.loadbalancers: "2"
EOF

kubectl create -f ./object-counts.yaml --namespace=myspace

kubectl get quota --namespace=myspace

NAME                    AGE
compute-resources       30s
object-counts           32s

kubectl describe quota compute-resources --namespace=myspace

Name:                    compute-resources
Namespace:               myspace
Resource                 Used  Hard
--------                 ----  ----
limits.cpu               0     2
limits.memory            0     2Gi
pods                     0     4
requests.cpu             0     1
requests.memory          0     1Gi
requests.nvidia.com/gpu  0     4

kubectl describe quota object-counts --namespace=myspace

Name:                   object-counts
Namespace:              myspace
Resource                Used    Hard
--------                ----    ----
configmaps              0       10
persistentvolumeclaims  0       4
replicationcontrollers  0       20
secrets                 1       10
services                0       10
services.loadbalancers  0       2

kubectl also supports object count quota for all standard namespaced resources through the count/<resource>.<group> syntax:

kubectl create namespace myspace

kubectl create quota test --hard=count/deployments.extensions=2,count/replicasets.extensions=4,count/pods=3,count/secrets=4 --namespace=myspace

kubectl run nginx --image=nginx --replicas=2 --namespace=myspace

kubectl describe quota --namespace=myspace

Name:                         test
Namespace:                    myspace
Resource                      Used  Hard
--------                      ----  ----
count/deployments.extensions  1     2
count/pods                    2     3
count/replicasets.extensions  1     4
count/secrets                 1     4

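Quota and Cluster Capacity

ResourceQuotas are independent of cluster capacity and are expressed in absolute units. If you add nodes to the cluster, the namespaces do not automatically gain the ability to consume more resources.

Sometimes more complex policies are needed, such as:

Dividing all cluster resources proportionally among several teams.
Letting each tenant grow its resource usage as needed, with an overall cap to prevent resources from being exhausted.
Detecting demand from a namespace, adding nodes, and increasing its quota.

Such policies can be implemented with ResourceQuotas as building blocks, by writing a controller that watches quota usage and adjusts each namespace's quota according to other signals.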
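
As one illustration of the first policy, a controller could recompute each team's CPU quota whenever capacity changes. The sketch below shows only the arithmetic such a controller might use. The team names and weights are hypothetical, and nothing here talks to a cluster.

```python
def proportional_quotas(capacity_millicores: int, weights: dict) -> dict:
    """Split capacity by weight using integer math.

    Any millicores lost to rounding down are handed out one each,
    starting with the largest weights.
    """
    total = sum(weights.values())
    shares = {team: capacity_millicores * w // total
              for team, w in weights.items()}
    leftover = capacity_millicores - sum(shares.values())
    for team in sorted(weights, key=weights.get, reverse=True)[:leftover]:
        shares[team] += 1
    return shares

weights = {"team-a": 5, "team-b": 2, "team-c": 1}
print(proportional_quotas(16000, weights))  # 16 cores
print(proportional_quotas(20000, weights))  # after adding 4 cores of capacity
```

Because the shares are derived from capacity, adding nodes raises every team's computed quota, which is exactly what a static ResourceQuota does not do on its own.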

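Limiting Priority Class Consumption by Default

Sometimes pods at a particular priority, for example cluster-services, should be allowed in a namespace if and only if a matching quota object exists.

With this mechanism, operators can restrict certain high-priority classes to a limited number of namespaces, so that not every namespace can consume them by default.

To enforce this, pass the path of the following configuration file to the kube-apiserver flag --admission-control-config-file: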

apiVersion: apiserver.k8s.io/v1alpha1
kind: AdmissionConfiguration
plugins:
- name: "ResourceQuota"
  configuration:
    apiVersion: resourcequota.admission.k8s.io/v1beta1
    kind: Configuration
    limitedResources:
    - resource: pods
      matchScopes:
      - scopeName: PriorityClass 
        operator: In
        values: ["cluster-services"]

Now, cluster-services pods are allowed only in namespaces that contain a quota object with a matching scopeSelector, for example:

scopeSelector:
  matchExpressions:
  - scopeName: PriorityClass
    operator: In
    values: ["cluster-services"]
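
The effect of the limitedResources configuration can be summarized as: a pod in a limited priority class is admitted only when the namespace has a quota that covers it. The sketch below models just that existence check with simplified dictionaries. It handles only the In operator and does not model the quota's hard limits.

```python
def pod_allowed(pod, namespace_quotas, limited_classes=("cluster-services",)):
    """Admit a pod of a limited priority class only if a covering quota exists."""
    name = pod.get("priorityClassName")
    if name not in limited_classes:
        return True  # not a limited class, so no covering quota is required
    for quota in namespace_quotas:
        exprs = quota.get("scopeSelector", {}).get("matchExpressions", [])
        for expr in exprs:
            if (expr["scopeName"] == "PriorityClass"
                    and expr["operator"] == "In"
                    and name in expr.get("values", [])):
                return True
    return False

covering = {"scopeSelector": {"matchExpressions": [
    {"scopeName": "PriorityClass", "operator": "In",
     "values": ["cluster-services"]}]}}
pod = {"priorityClassName": "cluster-services"}

print(pod_allowed(pod, []))          # False: no quota in the namespace
print(pod_allowed(pod, [covering]))  # True: a matching quota exists
```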