A Kubernetes eviction adventure

1. Overview

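  When Kubernetes eviction checks for DiskPressure, what it measures is the filesystem that holds the kubelet root-dir. The kubelet's default root-dir is /var/lib/kubelet, and it can be changed with the --root-dir flag. Source: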

    kubernetes/cmd/kubelet/app/options/options.go

   

    const defaultRootDir = "/var/lib/kubelet"

    fs.StringVar(&f.RootDirectory, "root-dir", f.RootDirectory, "Directory path for managing kubelet files (volume mounts,etc).")

    kubernetes/pkg/kubelet/eviction/helpers.go

  

    // diskUsage converts used bytes into a resource quantity.
    func diskUsage(fsStats *statsapi.FsStats) *resource.Quantity {
        if fsStats == nil || fsStats.UsedBytes == nil {
            return &resource.Quantity{Format: resource.BinarySI}
        }
        usage := int64(*fsStats.UsedBytes)
        return resource.NewQuantity(usage, resource.BinarySI)
    }

    // rankDiskPressureFunc returns a rankFunc that measures the specified fs stats.
    func rankDiskPressureFunc(fsStatsToMeasure []fsStatsType, diskResource v1.ResourceName) rankFunc {
        return func(pods []*v1.Pod, stats statsFunc) {
            orderedBy(exceedDiskRequests(stats, fsStatsToMeasure, diskResource), priority, disk(stats, fsStatsToMeasure, diskResource)).Sort(pods)
        }
    }

    // Excerpt: the nodefs.available signal is built from summary.Node.Fs.
    if nodeFs := summary.Node.Fs; nodeFs != nil {
        if nodeFs.AvailableBytes != nil && nodeFs.CapacityBytes != nil {
            result[evictionapi.SignalNodeFsAvailable] = signalObservation{
                available: resource.NewQuantity(int64(*nodeFs.AvailableBytes), resource.BinarySI),
                capacity:  resource.NewQuantity(int64(*nodeFs.CapacityBytes), resource.BinarySI),
                time:      nodeFs.Time,
            }
        }
    }

    type NodeStats struct {
        // Reference to the measured Node.
        NodeName string `json:"nodeName"`
        // Stats of system daemons tracked as raw containers.
        // The system containers are named according to the SystemContainer* constants.
        // +optional
        // +patchMergeKey=name
        // +patchStrategy=merge
        SystemContainers []ContainerStats `json:"systemContainers,omitempty" patchStrategy:"merge" patchMergeKey:"name"`
        // The time at which data collection for the node-scoped (i.e. aggregate) stats was (re)started.
        StartTime metav1.Time `json:"startTime"`
        // Stats pertaining to CPU resources.
        // +optional
        CPU *CPUStats `json:"cpu,omitempty"`
        // Stats pertaining to memory (RAM) resources.
        // +optional
        Memory *MemoryStats `json:"memory,omitempty"`
        // Stats pertaining to network resources.
        // +optional
        Network *NetworkStats `json:"network,omitempty"`
        // Stats pertaining to total usage of filesystem resources on the rootfs used by node k8s components.
        // NodeFs.Used is the total bytes used on the filesystem.
        // +optional
        Fs *FsStats `json:"fs,omitempty"`
        // Stats about the underlying container runtime.
        // +optional
        Runtime *RuntimeStats `json:"runtime,omitempty"`
        // Stats about the rlimit of system.
        // +optional
        Rlimit *RlimitStats `json:"rlimit,omitempty"`
    }
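
To see by hand roughly what the nodefs numbers look like on a node, one option is to query the filesystem that holds the root-dir directly. The following is only a minimal sketch for a manual check on Linux: it uses Go's syscall.Statfs, which is not how the kubelet itself collects these stats, and the helper name availablePercent is made up for this example.

```go
package main

import (
	"fmt"
	"syscall"
)

// availablePercent returns available/capacity as a percentage.
// It returns 0 when capacity is 0 to avoid dividing by zero.
// (Hypothetical helper, not part of the kubelet source.)
func availablePercent(availableBytes, capacityBytes uint64) float64 {
	if capacityBytes == 0 {
		return 0
	}
	return float64(availableBytes) * 100 / float64(capacityBytes)
}

func main() {
	// Default kubelet root-dir; change this if the kubelet runs
	// with a different --root-dir.
	dir := "/var/lib/kubelet"

	var st syscall.Statfs_t
	if err := syscall.Statfs(dir, &st); err != nil {
		fmt.Printf("statfs %s: %v\n", dir, err)
		return
	}

	blockSize := uint64(st.Bsize)
	capacity := uint64(st.Blocks) * blockSize
	available := uint64(st.Bavail) * blockSize // space available to unprivileged users

	fmt.Printf("%s: capacity=%d bytes, available=%d bytes (%.1f%%)\n",
		dir, capacity, available, availablePercent(available, capacity))
}
```

Run it on the node itself; the percentage it prints is the figure to compare against the nodefs.available eviction threshold.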

 

2. The incident

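   This happened a few months ago. Someone changed the fluentd pattern. fluentd is deployed as a DaemonSet and mounts a hostPath, /var/log, and its own logs are written to syslog. As a result, every log line that did not match the pattern was written to /var/log/syslog, more than 7 GB in one hour. Disk usage then went straight to 90%, and the eviction policy we had configured in the kubelet was as follows: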

  

    evictionHard:
      imagefs.available: 15%
      memory.available: 100Mi
      nodefs.available: 10%
      nodefs.inodesFree: 5%

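Once usage of the disk that holds the kubelet root-dir reaches 90% (that is, nodefs.available drops to the 10% threshold), the kubelet starts evicting pods. fluentd itself reported no error; the pattern simply did not match, so the logs were sent to syslog. When using fluentd, be sure to configure the log output path and the log level properly.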
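
The relationship between disk usage and the nodefs.available: 10% threshold can be sketched as follows. This is an illustration only, assuming the signal is compared as "available strictly less than 10% of capacity"; the function name thresholdMet and the 100 GiB disk size are hypothetical and not taken from the incident.

```go
package main

import "fmt"

// thresholdMet reports whether available space has dropped below
// thresholdPercent of capacity, i.e. the hard-eviction condition
// for a percentage-based nodefs.available threshold.
// (Hypothetical helper, not the kubelet's own function.)
func thresholdMet(availableBytes, capacityBytes uint64, thresholdPercent float64) bool {
	thresholdBytes := float64(capacityBytes) * thresholdPercent / 100
	return float64(availableBytes) < thresholdBytes
}

func main() {
	const gib = 1 << 30
	capacity := uint64(100 * gib) // hypothetical disk size

	for _, usedPercent := range []uint64{80, 89, 91} {
		available := capacity - capacity*usedPercent/100
		fmt.Printf("used %d%% -> eviction threshold met: %v\n",
			usedPercent, thresholdMet(available, capacity, 10))
	}
}
```

Under that assumption, a disk that is a little over 90% full already satisfies the condition, regardless of which process filled it.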

 

3. Aftermath

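We reached the conclusion above by analyzing the source code and restored service as an emergency measure. (The alert threshold for the system disk had not been reduced by the eviction threshold set in the kubelet.) Afterwards we re-planned the monitoring thresholds and set attributes on the production nodes so that different workloads are deployed on different nodes.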
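
The point about the alert threshold comes down to simple arithmetic: with nodefs.available: 10%, eviction begins at about 90% disk usage, so a disk usage alert only helps if it fires below that. A small sketch of the calculation, where the function names and the 10-point safety margin are assumptions made for illustration:

```go
package main

import "fmt"

// evictionUsedPercent converts a nodefs.available percentage threshold
// into the disk usage percentage at which eviction would begin.
func evictionUsedPercent(evictionAvailablePercent float64) float64 {
	return 100 - evictionAvailablePercent
}

// alertUsedPercent returns a disk usage alert threshold that sits
// marginPercent below the eviction point. marginPercent is a
// hypothetical safety margin, not a value from the incident.
func alertUsedPercent(evictionAvailablePercent, marginPercent float64) float64 {
	return evictionUsedPercent(evictionAvailablePercent) - marginPercent
}

// alertFiresFirst reports whether an alert set at alertUsed percent
// usage triggers before the kubelet starts evicting.
func alertFiresFirst(alertUsed, evictionAvailablePercent float64) bool {
	return alertUsed < evictionUsedPercent(evictionAvailablePercent)
}

func main() {
	const evictionAvailable = 10 // nodefs.available: 10%

	fmt.Printf("eviction starts at about %.0f%% used\n", evictionUsedPercent(evictionAvailable))
	fmt.Printf("alert with a 10-point margin: %.0f%% used\n", alertUsedPercent(evictionAvailable, 10))
	fmt.Printf("alert at 90%% fires before eviction: %v\n", alertFiresFirst(90, evictionAvailable))
}
```

An alert set at or above the eviction point gives no warning at all, which matches what happened here.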
