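NVIDIA's Linux support has improved a lot recently: Docker and Kubernetes can now use GPUs directly. The latest NVIDIA graphics driver is 440.31, the driver packaged in the Ubuntu 18.04 repositories has reached version 430, and CUDA is at version 10.1.
Using a GPU in Docker used to require installing nvidia-docker2 (the method is described below); this is no longer necessary.
Containers in Kubernetes can now use GPUs directly as well. For example: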
#### Test nvidia-smi with the latest official CUDA image
$ docker run --gpus all nvidia/cuda:9.0-base nvidia-smi
# Start a GPU enabled container on two GPUs
$ docker run --gpus 2 nvidia/cuda:9.0-base nvidia-smi
# Starting a GPU enabled container on specific GPUs
$ docker run --gpus '"device=1,2"' nvidia/cuda:9.0-base nvidia-smi
$ docker run --gpus '"device=UUID-ABCDEF,1"' nvidia/cuda:9.0-base nvidia-smi
# Specifying a capability (graphics, compute, ...) for my container
# Note this is rarely if ever used this way
$ docker run --gpus all,capabilities=utility nvidia/cuda:9.0-base nvidia-smi
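The value passed to --gpus in the commands above is either all, a count, or a device list wrapped in an extra pair of double quotes. As a minimal sketch of that quoting (gpus_arg is a hypothetical helper name, not part of Docker), the value could be generated like this:

```shell
# Hypothetical helper: build the value for docker's --gpus flag.
# With no arguments it prints "all"; otherwise it joins the given device
# indices or UUIDs with commas and wraps them in literal double quotes,
# which is the form used in the device examples above.
gpus_arg() (
  if [ "$#" -eq 0 ]; then
    printf 'all\n'
  else
    IFS=,                       # "$*" joins arguments with the first char of IFS
    printf '"device=%s"\n' "$*"
  fi
)

# Example use (needs a GPU host, so it is left commented out):
#   docker run --gpus "$(gpus_arg 1 2)" nvidia/cuda:9.0-base nvidia-smi
```

The function body runs in a subshell, so the temporary IFS change does not leak into the calling shell.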
Problem:
Download the driver directly:
wget -c http://us.download.nvidia.com/XFree86/Linux-x86_64/440.31/NVIDIA-Linux-x86_64-440.31.run
If an NVIDIA driver was installed previously, it must be uninstalled first before the new one is installed. Reference:
As follows:
sudo apt-get --purge remove 'nvidia-*'
# sudo ./NVIDIA-Linux-x86_64-410.57.run --uninstall
sudo update-initramfs -u
sudo reboot now
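After the reboot, it may be worth confirming that no old driver is still active before running the new installer. A small sketch, assuming nvidia-smi is only on the PATH when a driver is installed (check_nvidia_driver is a made-up helper name):

```shell
# Hypothetical helper: report the installed NVIDIA driver version, if any.
check_nvidia_driver() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    # Prints one driver version per GPU, e.g. after installing the new driver.
    nvidia-smi --query-gpu=driver_version --format=csv,noheader
  else
    echo "nvidia-smi not found: no NVIDIA driver appears to be installed"
  fi
}

# Usage:
#   check_nvidia_driver
```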
On Ubuntu, run the following to install CUDA:
wget -c https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget -c http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-1-local-10.1.243-418.87.00/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda
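Once the packages are installed, a quick sanity check is to ask nvcc for its release number. The sketch below assumes the toolkit ends up under /usr/local/cuda (the usual location, but an assumption here), and cuda_release is a hypothetical helper name:

```shell
# Hypothetical helper: print the CUDA release reported by nvcc.
# $1 is the CUDA install prefix; /usr/local/cuda is an assumed default.
cuda_release() {
  cuda_home="${1:-/usr/local/cuda}"
  if [ -x "$cuda_home/bin/nvcc" ]; then
    # nvcc prints a line containing "release X.Y,"; keep only X.Y.
    "$cuda_home/bin/nvcc" --version | sed -n 's/.*release \([0-9.]*\),.*/\1/p'
  else
    echo "nvcc not found under $cuda_home" >&2
    return 1
  fi
}

# Usage:
#   cuda_release                      # default prefix
#   cuda_release /usr/local/cuda-10.1 # explicit prefix
```

If the function fails, the toolkit may be installed under a different prefix, or its bin directory may simply not exist yet.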
With Docker (the runtime must be specified):
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
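The older --runtime=nvidia option still works (it requires nvidia-docker2), but the latest Docker release (19.03 and later) uses the --gpus flag instead, which does not require nvidia-docker2.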
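For scripts that have to run on both old and new hosts, the choice between the two options can be made from the Docker version. This is only a sketch: gpu_flag_for is a hypothetical helper, and it relies on the --gpus flag having been introduced in Docker 19.03.

```shell
# Hypothetical helper: pick the GPU option for a given Docker version string.
# Versions 19.03 and later get --gpus; older ones fall back to --runtime=nvidia.
gpu_flag_for() {
  version="$1"
  major="${version%%.*}"     # "19.03.5" -> "19"
  rest="${version#*.}"       # "19.03.5" -> "03.5"
  minor="${rest%%.*}"        # "03.5"    -> "03"
  minor="${minor#0}"         # drop one leading zero: "03" -> "3"
  if [ "$major" -gt 19 ] || { [ "$major" -eq 19 ] && [ "${minor:-0}" -ge 3 ]; }; then
    echo "--gpus all"
  else
    echo "--runtime=nvidia"
  fi
}

# Example use (needs a running Docker daemon, so it is left commented out):
#   ver=$(docker version --format '{{.Server.Version}}')
#   docker run $(gpu_flag_for "$ver") --rm nvidia/cuda nvidia-smi
```

The command substitution in the docker run line is deliberately unquoted so that "--gpus all" is split into two arguments.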
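Running apt update on Ubuntu 18.04 produces the following error message:
"Failed to fetch https://nvidia.github.io/nvidia-container-runtime/ubuntu18.04/amd64/InRelease The following signatures couldn't be verified because the public key is not available: NO_PUBKEY xxx"
The public key from the earlier version has probably expired. The fix: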
DIST=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo apt-key add -
curl -s -L https://nvidia.github.io/libnvidia-container/$DIST/libnvidia-container.list | \
  sudo tee /etc/apt/sources.list.d/libnvidia-container.list
sudo apt-get update
After that, updates work normally again.
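The DIST variable in the fix above is what selects the right repository list: it sources /etc/os-release and concatenates the ID and VERSION_ID fields, which is where the ubuntu18.04 in the failing URL comes from. A minimal sketch of that step in isolation (dist_id is a hypothetical helper name; the optional file argument exists only to make it easy to try against a sample file):

```shell
# Hypothetical helper: print "<ID><VERSION_ID>" from an os-release style file.
# The body runs in a subshell so the sourced variables do not leak out.
dist_id() (
  . "${1:-/etc/os-release}" && printf '%s%s\n' "$ID" "$VERSION_ID"
)

# Usage:
#   dist_id    # reads /etc/os-release on the current machine
#   echo "https://nvidia.github.io/libnvidia-container/$(dist_id)/libnvidia-container.list"
```

Printing the resulting URL before piping it into tee is a cheap way to check that the distribution string is one NVIDIA actually publishes a list for.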
See NVIDIA's project page (https://github.com/NVIDIA/nvidia-docker).
As follows:
docker run --gpus all nvidia/cuda:9.0-base nvidia-smi
Install nvidia-docker2 (note that the commands below install the nvidia-container-toolkit package):
# Add the package repositories
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
$ sudo systemctl restart docker
For other operating systems, see: