Using GlusterFS Storage with JupyterHub for K8s

There are two ways to use GlusterFS (https://www.gluster.org/) in Kubernetes: an Endpoints object (external storage) or heketi (a GlusterFS service running inside k8s). This article shows how to provide GlusterFS storage to JupyterHub for K8s through an Endpoints object. To keep things simple, JupyterHub is installed with the default helm chart. After installing it as described in 《快速设置JupyterHub for K8s》 (Quick Setup of JupyterHub for K8s), a PVC named hub-db-dir appears in the JupyterHub installation namespace; below, a GlusterFS volume is used to back that PVC.
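You can confirm that claim exists before rewiring it (jhub is the installation namespace used throughout this article):

kubectl get pvc -n jhub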

1. Create the Endpoints for the GlusterFS volume

Save the following to the file 0a-glusterfs-gvzr00-endpoint-jupyter.yaml:

apiVersion: v1
kind: Endpoints
metadata:
  name: glusterfs-gvzr00
  namespace: jhub
subsets:
- addresses:
  - ip: 10.1.1.193
  - ip: 10.1.1.205
  - ip: 10.1.1.112
  ports:
  - port: 10000
    protocol: TCP
  • The addresses are the access addresses of the peer nodes of your own GlusterFS replicated volume (here, a three-node replica volume provides redundant storage). The port is required by the API but is not actually used for the GlusterFS mount.

Create the Service. Save the following to the file 0b-glusterfs-gvzr00-service-jupyter.yaml (the Service has no selector; it exists only so that the hand-written Endpoints object of the same name persists):

apiVersion: v1
kind: Service
metadata:
  name: glusterfs-gvzr00
  namespace: jhub
spec:
  ports:
  - port: 10000
    protocol: TCP
    targetPort: 10000
  sessionAffinity: None
  type: ClusterIP
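These manifests assume the replicated volume gvzr00 already exists on the GlusterFS side. A sketch of how such a volume might be created and inspected, run on one of the GlusterFS nodes (the brick paths are assumptions, not from this article):

# Create and start a 3-way replica volume, then confirm that the
# bricks match the endpoint addresses above.
gluster volume create gvzr00 replica 3 \
  10.1.1.193:/data/brick1/gvzr00 \
  10.1.1.205:/data/brick1/gvzr00 \
  10.1.1.112:/data/brick1/gvzr00
gluster volume start gvzr00
gluster volume info gvzr00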

2. Create the PV and PVC for JupyterHub's hub-db-dir

Create the PV and PVC for the JupyterHub hub service itself, used to store system data.

2.1 Create the PV

Save the following to the file 1a-glusterfs-gvzr00-pv-jupyter-hub.yaml (PersistentVolumes are cluster-scoped, so no namespace is set):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: hub-db-dir
spec:
  capacity:
    storage: 8Gi
  accessModes:
    - ReadWriteMany
  glusterfs:
    endpoints: "glusterfs-gvzr00"
    path: "gvzr00/jupyterhub/hub-db-dir"
    readOnly: false
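The path above mounts the subdirectory jupyterhub/hub-db-dir of the gvzr00 volume, and such subdirectories are not created automatically. A minimal sketch for pre-creating the subdirectories used in this article from any host with the GlusterFS client, assuming your GlusterFS version supports subdirectory mounts (the mount point is an arbitrary choice):

# Temporarily mount the volume and create the subdirectories.
mkdir -p /mnt/gvzr00
mount -t glusterfs 10.1.1.193:/gvzr00 /mnt/gvzr00
mkdir -p /mnt/gvzr00/jupyterhub/hub-db-dir
mkdir -p /mnt/gvzr00/jupyterhub/claim-supermap
umount /mnt/gvzr00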

2.2 Create the PVC

First delete the PVC that the helm chart created:

kubectl delete pvc/hub-db-dir -n jhub
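If the hub pod is still running and mounting this claim, the deletion can hang in Terminating because of PVC protection; scaling the hub down first avoids that (the deployment is named hub in the default chart):

kubectl scale deploy/hub -n jhub --replicas=0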

Save the following to the file 1b-glusterfs-gvzr00-pvc-jupyter-hub.yaml:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: hub-db-dir
  namespace: jhub
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 8Gi
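As written, the claim relies on the control plane matching it to the hub-db-dir PV by capacity and access mode. If your cluster has a default StorageClass, the claim may be dynamically provisioned instead; adding the following two fields under its spec pins it to the static PV (an addition, not part of the original manifest):

  storageClassName: ""     # opt out of dynamic provisioning
  volumeName: hub-db-dir   # bind explicitly to the PV above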

3. Create the PV and PVC for the user supermap

Each user gets a PV and PVC of their own, used to store that user's notebook server data. By default KubeSpawner names each user's claim claim-<username>, so for the user supermap the claim is claim-supermap.

3.1 Create the PV

Save the following to the file 2a-glusterfs-gvzr00-pv-jupyter-supermap.yaml:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: claim-supermap
spec:
  capacity:
    storage: 16Gi
  accessModes:
    - ReadWriteMany
  glusterfs:
    endpoints: "glusterfs-gvzr00"
    path: "gvzr00/jupyterhub/claim-supermap"
    readOnly: false

3.2 Create the PVC

Save the following to the file 2b-glusterfs-gvzr00-pvc-jupyter-supermap.yaml:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: claim-supermap
  namespace: jhub
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 16Gi
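Every additional user needs the same pair of manifests. A small sketch that clones the supermap pair for a hypothetical user alice (the claim name, and the Gluster subdirectory path that contains it, are both rewritten by the same substitution):

# Generate PV/PVC manifests for the user "alice" from the supermap ones.
for f in 2a-glusterfs-gvzr00-pv-jupyter-supermap.yaml \
         2b-glusterfs-gvzr00-pvc-jupyter-supermap.yaml; do
  sed 's/claim-supermap/claim-alice/' "$f" > "${f%supermap.yaml}alice.yaml"
done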

4. Apply the configuration

Adjust the files above to match your own cluster addresses and storage capacities.

4.1 Create the PVs and PVCs

Save the following to the file apply.sh:

echo "Create endpoint and svc, glusterfs-gvzr00 ..."
kubectl apply -f 0a-glusterfs-gvzr00-endpoint-jupyter.yaml
kubectl apply -f 0b-glusterfs-gvzr00-service-jupyter.yaml

echo "Create pv and pvc, hub-db-dir ..."
kubectl apply -f 1a-glusterfs-gvzr00-pv-jupyter-hub.yaml
kubectl apply -f 1b-glusterfs-gvzr00-pvc-jupyter-hub.yaml

echo "Create pv and pvc, claim--supermap ..."
kubectl apply -f 2a-glusterfs-gvzr00-pv-jupyter-supermap.yaml
kubectl apply -f 2b-glusterfs-gvzr00-pvc-jupyter-supermap.yaml

echo "Finished."
echo ""

Then run apply.sh.
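After it runs, the Endpoints and Service should exist and both claims should show a Bound status:

kubectl get ep/glusterfs-gvzr00 svc/glusterfs-gvzr00 -n jhub
kubectl get pvc -n jhub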

4.2 Delete the PVs and PVCs

Save the following to the file delete.sh:

#!/bin/bash
echo "Delete pv and pvc, hub-db-dir ..."
kubectl delete pvc/hub-db-dir -n jhub
kubectl delete pv/hub-db-dir

echo "Delete pv and pvc, claim-supermap ..."
kubectl delete pvc/claim-supermap -n jhub
kubectl delete pv/claim-supermap

echo "Delete endpoint and svc, glusterfs-gvzr00 ..."
kubectl delete svc/glusterfs-gvzr00 -n jhub
kubectl delete ep/glusterfs-gvzr00 -n jhub

echo "Finished."
echo ""

To remove everything, run delete.sh.

4.3 Check the PVs and PVCs

Via the Dashboard, or with the following commands (PVs are cluster-scoped, so no namespace is needed for them):

kubectl get pv

kubectl get pvc -n jhub
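If a claim stays Pending, describing it shows the binding events (a troubleshooting step, not from the original article):

kubectl describe pvc hub-db-dir -n jhub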
  • Note: if JupyterHub errors at runtime after upgrading to a recent Python 3 image, the notebook server cannot start, and the pod log shows a "NoneType" message, the cause is a kubespawner sort key that compares event last_timestamp values that can be None. The following hotfix patches spawner.py inside the hub pod so that missing timestamps sort as 0.0:

kubectl patch deploy -n jhub hub --type json \
--patch '[{"op": "replace", "path": "/spec/template/spec/containers/0/command", "value": ["bash", "-c", "\nmkdir -p ~/hotfix\ncp -r /usr/local/lib/python3.6/dist-packages/kubespawner ~/hotfix\nls -R ~/hotfix\npatch ~/hotfix/kubespawner/spawner.py << EOT\n72c72\n<             key=lambda x: x.last_timestamp,\n---\n>             key=lambda x: x.last_timestamp and x.last_timestamp.timestamp() or 0.,\nEOT\n\nPYTHONPATH=$HOME/hotfix jupyterhub --config /srv/jupyterhub_config.py --upgrade-db\n"]}]'

Then access the JupyterHub service again; it should work normally.
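A quick way to confirm the hub came back up after the patch:

kubectl get pods -n jhub
kubectl logs deploy/hub -n jhub --tail=20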

5. Further references