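In production, the hosts in a resource pool often accumulate leftover docker image data: as business systems are upgraded, old images need to be cleaned up. This post walks through where docker stores image data on a Linux system, so that the cleanup can be targeted.
On a 3.10 kernel, docker manages its storage with aufs.
The following command lists every image that has been pulled:
cat /var/lib/docker/repositories-aufs | python -m json.tool
Its output contains the same number of images as docker images shows: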
cat /var/lib/docker/repositories-aufs | python -m json.tool
{
    "Repositories": {
        "172.30.30.241:5000/centos1": {
            "latest": "e099197b794f459b777cc82ba53f2ecdcfb52c0a3245a9b010ca239b50fd72ad"
        },
        "centos": {
            "6": "b9aeeaeb5e17b5414e5caa9a6b2f99e9ccef50561bdfe137cd05956961f1cec6",
            "latest": "fd44297e2ddb050ec4fa9752b7a4e3a8439061991886e2091e7c1f007c906d75",
            "new": "8390535f8e613861a3715cf1af4a082ac80108c1d098944def5aa1391207e33a",
            "new2": "e099197b794f459b777cc82ba53f2ecdcfb52c0a3245a9b010ca239b50fd72ad"
        },
        "hello-world": {
            "latest": "91c95931e552b11604fea91c2f537284149ec32fff0f700a4769cfd31d7696ae"
        },
        "quay.io/coreos/etcd": {
            "v2.0.11": "c02fd8670851ce85ace68db5cff8694a3ed3656bedd9fa8054de8aff2f39e631"
        },
        "registry": {
            "latest": "204704ce31375bcf4afecf672563b4881bbef0d59135c68d273235bb7254fb4b"
        },
        "ubuntu": {
            "14.04": "07f8e8c5e66084bef8f848877857537ffe1c47edd01a93af27e7161672ad0e95"
        }
    }
}
The image IDs above are globally unique and identical to the IDs on the image hub.
The IMAGE ID column printed by docker images is simply the first 12 characters of these IDs:
docker images
REPOSITORY                   TAG       IMAGE ID       CREATED       VIRTUAL SIZE
quay.io/coreos/etcd          v2.0.11   c02fd8670851   11 days ago   12.83 MB
172.30.30.241:5000/centos1   latest    e099197b794f   12 days ago   306.1 MB
centos                       new2      e099197b794f   12 days ago   306.1 MB
centos                       new       8390535f8e61   12 days ago   306.1 MB
registry                     latest    204704ce3137   13 days ago   413.9 MB
ubuntu                       14.04     07f8e8c5e660   3 weeks ago   188.3 MB
centos                       6         b9aeeaeb5e17   5 weeks ago   202.6 MB
centos                       latest    fd44297e2ddb   5 weeks ago   215.7 MB
hello-world                  latest    91c95931e552   5 weeks ago   910 B
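The relationship between the full IDs in repositories-aufs and the short IDs printed by docker images can be checked with a few lines of Python. This is only a sketch: the function name `short_image_ids` is made up for illustration, the 12-character length is taken from the listing in this post, and the sample is a subset of the JSON printed earlier.

```python
import json


def short_image_ids(repositories_json, length=12):
    """Map 'repo:tag' to the truncated image ID (hypothetical helper).

    Assumes the {"Repositories": {repo: {tag: id}}} layout shown in this post.
    """
    data = json.loads(repositories_json)
    result = {}
    for repo, tags in data["Repositories"].items():
        for tag, image_id in tags.items():
            result[f"{repo}:{tag}"] = image_id[:length]
    return result


# A subset of the repositories-aufs content shown in this post.
sample = """
{
  "Repositories": {
    "172.30.30.241:5000/centos1": {
      "latest": "e099197b794f459b777cc82ba53f2ecdcfb52c0a3245a9b010ca239b50fd72ad"
    },
    "centos": {
      "6": "b9aeeaeb5e17b5414e5caa9a6b2f99e9ccef50561bdfe137cd05956961f1cec6",
      "new2": "e099197b794f459b777cc82ba53f2ecdcfb52c0a3245a9b010ca239b50fd72ad"
    }
  }
}
"""

ids = short_image_ids(sample)
for name in sorted(ids):
    print(name, ids[name])
```

Two tags that point at the same full ID (here centos:new2 and 172.30.30.241:5000/centos1:latest) end up with the same short ID, which is why they share a row value in the IMAGE ID column.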
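docker images -tree shows the layer hierarchy of the images. A docker image is built up layer by layer, and the -tree option shows exactly how the layers are stacked.
-tree lists many more images than the output above, including many image IDs that have not appeared before,
but every tagged layer in the tree has the same ID as the one reported by docker images.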
docker images -tree
Warning: '-tree' is deprecated, it will be removed soon. See usage.
├─8093db4276d5 Virtual Size: 0 B
│ └─f9c3a06edd7a Virtual Size: 6.642 MB
│   └─546a4b0d3153 Virtual Size: 12.83 MB
│     └─9caa77989e25 Virtual Size: 12.83 MB
│       └─c02fd8670851 Virtual Size: 12.83 MB Tags: quay.io/coreos/etcd:v2.0.11
├─e9e06b06e14c Virtual Size: 188.1 MB
│ └─a82efea989f9 Virtual Size: 188.3 MB
│   └─37bea4ee0c81 Virtual Size: 188.3 MB
│     └─07f8e8c5e660 Virtual Size: 188.3 MB Tags: ubuntu:14.04
│       └─1f4ab7282e19 Virtual Size: 375.1 MB
│         └─0e4483abe66b Virtual Size: 377.5 MB
│           └─c6153b5d8f1f Virtual Size: 377.5 MB
│             └─2bc4611f2ed7 Virtual Size: 389.1 MB
│               └─30887473610f Virtual Size: 413.9 MB
│                 └─3f8e22c413b1 Virtual Size: 413.9 MB
│                   └─22b1c756fa19 Virtual Size: 413.9 MB
│                     └─90607d8d09d1 Virtual Size: 413.9 MB
│                       └─4f4a5acb19eb Virtual Size: 413.9 MB
│                         └─204704ce3137 Virtual Size: 413.9 MB Tags: registry:latest
├─f1b10cd84249 Virtual Size: 0 B
│ └─b9aeeaeb5e17 Virtual Size: 202.6 MB Tags: centos:6
│   └─8390535f8e61 Virtual Size: 306.1 MB Tags: centos:new
│     └─0570b3aa38fb Virtual Size: 306.1 MB
│       └─e099197b794f Virtual Size: 306.1 MB Tags: 172.30.30.241:5000/centos1:latest, centos:new2
├─6941bfcbbfca Virtual Size: 0 B
│ └─41459f052977 Virtual Size: 215.7 MB
│   └─fd44297e2ddb Virtual Size: 215.7 MB Tags: centos:latest
└─a8219747be10 Virtual Size: 910 B
  └─91c95931e552 Virtual Size: 910 B Tags: hello-world:latest
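Now look at the graph directory:
/var/lib/docker/graph
The number of image IDs in the graph directory matches the number shown by docker images -tree.
Each image ID has its own directory, and each directory contains two files:
json layersize
The json file describes the image metadata. layersize holds a single value, the size of this layer of the image, which may be 0.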
ls /var/lib/docker/graph/
0570b3aa38fbdd1defba6929282656a39cfabcf70dc39e68848139d649d8921b 41459f052977938b824dd011e1f2bec2cb4d133dfc7e1aa0e90f7c5d337ca9c4 a82efea989f94b1d9fac76e26e37b0bbde11047a3afcaa47064949dfa3b3209b
07f8e8c5e66084bef8f848877857537ffe1c47edd01a93af27e7161672ad0e95 9caa77989e25e788a5a75faff4e77b011e68d2fd5975a0bd20f5d14f61154bc0 fd44297e2ddb050ec4fa9752b7a4e3a8439061991886e2091e7c1f007c906d75
3f8e22c413b1783145e785a4729c4d5f98f9baca025b74d73774ed438ac82ba2 a8219747be10611d65b7c693f48e7222c0bf54b5df8467d3f99003611afa1fd8 _tmp
/var/lib/docker/graph/0570b3aa38fbdd1defba6929282656a39cfabcf70dc39e68848139d649d8921b# ls
json layersize
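The two files per layer are enough to rebuild the Virtual Size column of the tree output. The sketch below rests on two assumptions that the post itself does not state: that each json file records its parent layer under a "parent" key, and that layersize is a plain byte count. The function names are made up for illustration.

```python
import json
import os


def load_layers(graph_dir):
    """Return {layer_id: (parent_id_or_None, layer_size)} (hypothetical helper).

    Assumes every layer directory holds the two files described above and that
    the json metadata names its parent layer under the "parent" key.
    """
    layers = {}
    for name in os.listdir(graph_dir):
        meta_path = os.path.join(graph_dir, name, "json")
        size_path = os.path.join(graph_dir, name, "layersize")
        if not (os.path.isfile(meta_path) and os.path.isfile(size_path)):
            continue  # skips _tmp and anything else that is not a layer
        with open(meta_path) as f:
            meta = json.load(f)
        with open(size_path) as f:
            size = int(f.read().strip() or 0)
        layers[name] = (meta.get("parent") or None, size)
    return layers


def virtual_size(layers, layer_id):
    """Sum the sizes of a layer and all of its ancestors."""
    total = 0
    current = layer_id
    while current in layers:  # stops at the base layer or at a missing parent
        parent, size = layers[current]
        total += size
        current = parent
    return total
```

On a host laid out as in this post, virtual_size(load_layers("/var/lib/docker/graph"), image_id) should, under those assumptions, come out close to the Virtual Size that docker images -tree prints for the same ID. Reading the directory normally requires root.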
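/var/lib/docker/aufs/diff
Every entry in this directory is named with an image ID and corresponds to an entry in the graph directory.
Following the image tree, the directory of a layer-0 image ID may contain no data; moving up the layers, you find filesystem directories holding the files produced by our docker operations.
Some simple tests show the following.
When an image is pulled, a new image ID directory appears under /var/lib/docker/aufs/diff.
If a container is then created from that image, a directory named with the container ID appears, together with a directory that has the same ID plus an -init suffix.
/var/lib/docker/aufs/mnt/ holds the merged view of the stacked diff layers.
When we work inside a container, the data we create is saved in the directory named with that container's ID, and this directory is not deleted when the container exits. Every run from an image creates two new directories for the new container ID. docker ps -a lists containers in the exited state; once they are removed, the two directories disappear. Committing a modified container produces yet another new directory.
In the structure shown by docker images -tree, everything from layer 0 down to the last layer or tag is an image ID layer pulled directly from the hub.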
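The naming rules observed above (image ID, container ID, container ID plus -init) can be turned into a small classifier for the entries of /var/lib/docker/aufs/diff. This is a sketch under the assumption that the caller supplies the full, untruncated image and container IDs, for example collected from docker images -a and docker ps -a with --no-trunc; the function name and the category labels are made up for illustration.

```python
def classify_diff_entries(entries, image_ids, container_ids):
    """Sort aufs/diff directory names into categories (hypothetical helper).

    entries       -- directory names found under aufs/diff
    image_ids     -- full IDs of all known image layers
    container_ids -- full IDs of all known containers
    """
    image_ids = set(image_ids)
    container_ids = set(container_ids)
    suffix = "-init"
    result = {"image": [], "container": [], "container-init": [], "unknown": []}
    for name in sorted(entries):
        if name in image_ids:
            result["image"].append(name)
        elif name in container_ids:
            result["container"].append(name)
        elif name.endswith(suffix) and name[: -len(suffix)] in container_ids:
            result["container-init"].append(name)
        else:
            # Matches neither a known image layer nor a known container.
            result["unknown"].append(name)
    return result


if __name__ == "__main__":
    # Made-up short names stand in for real 64-character IDs.
    report = classify_diff_entries(
        ["img1", "ctr1", "ctr1-init", "stale"],
        image_ids=["img1"],
        container_ids=["ctr1"],
    )
    for kind, names in report.items():
        print(kind, names)
```

Anything that lands in the "unknown" bucket is a candidate leftover, but it is worth inspecting by hand before deleting it.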
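When a container is created from an image that declares VOLUME, or is started with -v, a directory with a random ID is created under /var/lib/docker/volumes and that random ID is added to the container's metadata.
When the container starts, /var/lib/docker/volumes/*/ is bound to the specified path. A volume is not tied to the container's running state; it is simply a directory that gets mounted.
In newer versions of docker, volume data is stored with the vfs driver: /var/lib/docker/volumes/ holds only metadata, and the actual data is stored in /var/lib/docker/vfs/dir/<volume id>.
Whenever docker creates a data volume, it creates a directory with a random ID under /var/lib/docker/vfs/dir/* to represent that volume. If the volume is not shared with the host, data written to the volume ends up in this directory. The metadata is kept in /var/lib/docker/volumes/*.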
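Summary:
Whenever a new image is pulled or a container is created, a directory named with the corresponding ID is created under /var/lib/docker/graph/* to hold the metadata, and another under /var/lib/docker/aufs/diff/* to hold the data.
When the container is deleted or the image is removed, the corresponding directories are removed as well.
Whenever docker creates a data volume, it creates a directory with a random ID under /var/lib/docker/vfs/dir/* to represent that volume. If the volume is not shared with the host, data written to the volume ends up in this directory. The metadata is kept in /var/lib/docker/volumes/*.
docker rm -v containerid: when -v is given, the volume's data is deleted if no other container is associated with the volume.
In some cases, an error or a latent bug in docker rm -v can leave storage behind, and this has to be cleaned up by hand.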
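1 For image or container data: the IDs shown by docker images -a and docker ps -a are all the valid IDs. Any ID under /var/lib/docker/aufs/diff/ that is not in this set is a stale entry. This case is actually fairly rare.
2 For data volumes: running docker inspect on every container ID yields the full set of valid volumes. Any ID under /var/lib/docker/vfs/dir/ that is not in this set is a stale entry. This case is more common, because containers may be removed without -v, and older docker versions appear to have a bug that prevents the volume data from being deleted.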
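The volume check in item 2 boils down to a set difference: the directory names on disk minus the volume IDs that some container still references. The sketch below assumes the caller has already collected the host-side volume paths from docker inspect and the directory names from /var/lib/docker/vfs/dir; the function names are made up for illustration, and the 64-character hexadecimal ID pattern is borrowed from the cleanup scripts in this post.

```python
import os
import re

# Volume directories are named with 64 hexadecimal characters, per the scripts.
VOLUME_ID = re.compile(r"[0-9a-f]{64}")


def volume_ids_in_use(volume_paths):
    """Extract volume IDs from host-side volume paths (hypothetical helper)."""
    ids = set()
    for path in volume_paths:
        name = os.path.basename(path.rstrip("/"))
        if VOLUME_ID.fullmatch(name):
            ids.add(name)  # bind mounts such as /data/app are ignored
    return ids


def orphaned_volumes(dir_entries, volume_paths):
    """Return the volume directories that no container references."""
    in_use = volume_ids_in_use(volume_paths)
    return sorted(
        entry
        for entry in dir_entries
        if VOLUME_ID.fullmatch(entry) and entry not in in_use
    )


if __name__ == "__main__":
    used = "a" * 64   # made-up IDs for illustration
    stale = "b" * 64
    paths = ["/var/lib/docker/vfs/dir/" + used, "/data/app"]
    print(orphaned_volumes([used, stale, "_tmp"], paths))
```

Like the --dry-run mode of the scripts, this only reports candidates; nothing is deleted.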
Script 2 is taken from: https://github.com/chadoe/docker-cleanup-volumes/blob/master/docker-cleanup-volumes.sh
The corresponding scripts are:
cat docker_images_clean.sh
#! /bin/bash

set -euo pipefail

#usage: sudo ./docker-cleanup-volumes.sh [--dry-run]

docker_bin="$(which docker.io 2> /dev/null || which docker 2> /dev/null)"

# Default dir
dockerdir=/var/lib/docker

# Look for an alternate docker directory with -g/--graph option
dockerpid=$(ps ax | grep "$docker_bin" | grep -v grep | awk '{print $1; exit}') || :
if [[ -n "$dockerpid" && $dockerpid -gt 0 ]]; then
    next_arg_is_dockerdir=false
    while read -d $'\0' arg
    do
        if [[ $arg =~ ^--graph=(.+) ]]; then
            dockerdir=${BASH_REMATCH[1]}
            break
        elif [ $arg = '-g' ]; then
            next_arg_is_dockerdir=true
        elif [ $next_arg_is_dockerdir = true ]; then
            dockerdir=$arg
            break
        fi
    done < /proc/$dockerpid/cmdline
fi

dockerdir=$(readlink -f "$dockerdir")

volumesdir=${dockerdir}/volumes
vfsdir=${dockerdir}/vfs/dir
allvolumes=()
dryrun=false
verbose=false

function log_verbose() {
    if [ "${verbose}" = true ]; then
        echo "$1"
    fi;
}

function delete_volumes() {
    local targetdir=$1
    echo
    if [[ ! -d "${targetdir}" || ! "$(ls -A "${targetdir}")" ]]; then
        echo "Directory ${targetdir} does not exist or is empty, skipping."
        return
    fi
    echo "Delete unused volume directories from $targetdir"
    local dir
    while read -d $'\0' dir
    do
        dir=$(basename "$dir")
        if [[ -d "${targetdir}/${dir}/_data" || "${dir}" =~ [0-9a-f]{64} ]]; then
            if [ ${#allvolumes[@]} -gt 0 ] && [[ ${allvolumes[@]} =~ "${dir}" ]]; then
                echo "In use ${dir}"
            else
                if [ "${dryrun}" = false ]; then
                    echo "Deleting ${dir}"
                    rm -rf "${targetdir}/${dir}"
                else
                    echo "Would have deleted ${dir}"
                fi
            fi
        else
            echo "Not a volume ${dir}"
        fi
    done < <(find "${targetdir}" -mindepth 1 -maxdepth 1 -type d -print0 2>/dev/null)
}

if [ $UID != 0 ]; then
    echo "You need to be root to use this script."
    exit 1
fi

if [ -z "$docker_bin" ] ; then
    echo "Please install docker. You can install docker by running \"wget -qO- https://get.docker.io/ | sh\"."
    exit 1
fi

while [[ $# -gt 0 ]]
do
    key="$1"

    case $key in
        -n|--dry-run)
            dryrun=true
        ;;
        -v|--verbose)
            verbose=true
        ;;
        *)
            echo "Cleanup docker volumes: remove unused volumes."
            echo "Usage: ${0##*/} [--dry-run] [--verbose]"
            echo "   -n, --dry-run: dry run: display what would get removed."
            echo "   -v, --verbose: verbose output."
            exit 1
        ;;
    esac
    shift
done

# Make sure that we can talk to docker daemon. If we cannot, we fail here.
${docker_bin} version >/dev/null

container_ids=$(${docker_bin} ps -a -q --no-trunc)

#All volumes from all containers
SAVEIFS=$IFS
IFS=$(echo -en "\n\b")
for container in $container_ids; do
    #add container id to list of volumes, don't think these
    #ever exists in the volumesdir but just to be safe
    #(appended as a new array element, not concatenated onto element 0)
    allvolumes+=("${container}")
    #add all volumes from this container to the list of volumes
    log_verbose "Inspecting container ${container}"
    for volpath in $(
        ${docker_bin} inspect --format='{{range $key, $val := .}}{{if eq $key "Volumes"}}{{range $vol, $path := .}}{{$path}}{{"\n"}}{{end}}{{end}}{{if eq $key "Mounts"}}{{range $mount := $val}}{{$mount.Source}}{{"\n"}}{{end}}{{end}}{{end}}' ${container} \
    ); do
        log_verbose "Processing volumepath ${volpath}"
        #try to get volume id from the volume path
        vid=$(echo "${volpath}" | sed 's|.*/\(.*\)/_data$|\1|;s|.*/\([0-9a-f]\{64\}\)$|\1|')
        # check for either a 64 character vid or then end of a volumepath containing _data:
        if [[ "${vid}" =~ ^[0-9a-f]{64}$ || (${volpath} =~ .*/_data$ && ! "${vid}" =~ "/") ]]; then
            log_verbose "Found volume ${vid}"
            allvolumes+=("${vid}")
        else
            #check if it's a bindmount, these have a config.json file in the ${volumesdir} but no files in ${vfsdir} (docker 1.6.2 and below)
            for bmv in $(find "${volumesdir}" -name config.json -print | xargs grep -l "\"IsBindMount\":true" | xargs grep -l "\"Path\":\"${volpath}\""); do
                bmv="$(basename "$(dirname "${bmv}")")"
                log_verbose "Found bindmount ${bmv}"
                allvolumes+=("${bmv}")
                #there should be only one config for the bindmount, delete any duplicate for the same bindmount.
                break
            done
        fi
    done
done
IFS=$SAVEIFS

delete_volumes "${volumesdir}"
delete_volumes "${vfsdir}"
cat docker_volumes_clean.sh
#! /bin/bash

set -eo pipefail

#usage: sudo ./docker-cleanup-volumes.sh [--dry-run]

dockerdir=/var/lib/docker
volumesdir=${dockerdir}/volumes
vfsdir=${dockerdir}/vfs/dir
allvolumes=()
dryrun=false

function delete_volumes() {
    targetdir=$1
    echo
    if [[ ! -d ${targetdir} ]]; then
        echo "Directory ${targetdir} does not exist, skipping."
        return
    fi
    echo "Delete unused volume directories from $targetdir"
    for dir in $(ls -d ${targetdir}/* 2>/dev/null)
    do
        dir=$(basename $dir)
        if [[ "${dir}" =~ [0-9a-f]{64} ]]; then
            if [[ ${allvolumes[@]} =~ "${dir}" ]]; then
                echo In use ${dir}
            else
                if [ "${dryrun}" = false ]; then
                    echo Deleting ${dir}
                    rm -rf ${targetdir}/${dir}
                else
                    echo Would have deleted ${dir}
                fi
            fi
        else
            echo Not a volume ${dir}
        fi
    done
}

if [ $UID != 0 ]; then
    echo "You need to be root to use this script."
    exit 1
fi

docker_bin=$(which docker.io || which docker)
if [ -z "$docker_bin" ] ; then
    echo "Please install docker. You can install docker by running \"wget -qO- https://get.docker.io/ | sh\"."
    exit 1
fi

if [ "$1" = "--dry-run" ]; then
    dryrun=true
elif [ -n "$1" ]; then
    echo "Cleanup docker volumes: remove unused volumes."
    echo "Usage: ${0##*/} [--dry-run]"
    echo "   --dry-run: dry run: display what would get removed."
    exit 1
fi

# Make sure that we can talk to docker daemon. If we cannot, we fail here.
docker info >/dev/null

#All volumes from all containers
for container in `${docker_bin} ps -a -q --no-trunc`; do
    #add container id to list of volumes, don't think these
    #ever exists in the volumesdir but just to be safe
    #(appended as a new array element, not concatenated onto element 0)
    allvolumes+=("${container}")
    #add all volumes from this container to the list of volumes
    for vid in `${docker_bin} inspect --format='{{range $vol, $path := .Volumes}}{{$path}}{{"\n"}}{{end}}' ${container}`; do
        if [[ ${vid} == ${vfsdir}* && "${vid##*/}" =~ [0-9a-f]{64} ]]; then
            allvolumes+=("${vid##*/}")
        else
            #check if it's a bindmount, these have a config.json file in the ${volumesdir} but no files in ${vfsdir}
            for bmv in `grep --include config.json -Rl "\"IsBindMount\":true" ${volumesdir} | xargs grep -l "\"Path\":\"${vid}\""`; do
                bmv="$(basename "$(dirname "${bmv}")")"
                allvolumes+=("${bmv}")
                #there should be only one config for the bindmount, delete any duplicate for the same bindmount.
                break
            done
        fi
    done
done

delete_volumes ${volumesdir}
delete_volumes ${vfsdir}
References
http://www.csdn.net/article/2014-11-18/2822693
http://stackoverflow.com/questions/24353387/how-docker-container-volumes-work-even-when-they-arent-running
http://www.infoq.com/cn/articles/docker-source-code-analysis-part4
https://github.com/docker/docker/issues/3925
https://github.com/chadoe/docker-cleanup-volumes/blob/master/docker-cleanup-volumes.sh