How to Configure NVIDIA GPUs with Docker on Ubuntu: A Comprehensive Guide for AI Deep Learning CUDA Solution

Welcome to our in-depth guide on configuring NVIDIA GPUs with Docker on Ubuntu. This post is tailored for developers, data scientists, and IT professionals who are looking to leverage the power of NVIDIA's GPU acceleration within Docker containers. 

Whether you're working on machine learning projects, scientific computations, or any other GPU-intensive task, this guide will walk you through the process step by step.

This guide works for any NVIDIA GPU that has a supported Linux driver, such as the GTX, RTX and Tesla series. The Tesla series is recommended, as those cards have ECC memory and are more tailored to AI applications.

Normally it is not possible to share a GPU with multiple containers

Typically one physical GPU can be used by only one machine, whether that is a physical host or a VM that gets exclusive use of it. NVIDIA has made tools that solve this problem by essentially creating a layer between Docker and the NVIDIA driver.

The older nvidia-docker2 package from NVIDIA allows any number of Docker containers to use the underlying GPU(s) on the host, but it has been deprecated in favor of the newer NVIDIA Container Toolkit (the nvidia-container-toolkit package, called "nvidia-toolkit" in this guide).

nvidia-toolkit is what you want, unless you are stuck on an older distro that can't use it.

nvidia-toolkit official install guide: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

How To Install nvidia-toolkit on Mint Ubuntu Debian

Step 1 - Add the NVIDIA repo to Linux

First we add the gpg key for the repo:

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

Add the repo into our sources.list.d

curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' |  
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
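
To see what the sed expression in that command does before writing anything to /etc/apt, it can be run against a single sample line. This is only a sketch: add_signed_by is a hypothetical helper name, and SAMPLE_LINE is an assumed example of what the upstream list file contains, not a copy of it.

```shell
#!/bin/sh
# Keyring path used in Step 1 of this guide.
KEYRING=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

# add_signed_by (hypothetical helper): applies the same sed expression as
# the command above, reading repo lines on stdin and writing to stdout.
add_signed_by() {
    sed "s#deb https://#deb [signed-by=${KEYRING}] https://#g"
}

# Assumed example of an upstream entry; the real file may differ.
SAMPLE_LINE='deb https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /'

# Print the rewritten line so the inserted [signed-by=...] option is visible.
printf '%s\n' "$SAMPLE_LINE" | add_signed_by
```

The printed line should be the sample entry with the [signed-by=...] option inserted right after "deb", which tells apt to trust this repo only with the key added above.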

Update apt so it can see the required packages from the nvidia repo

sudo apt update

Step 2 - Install nvidia toolkit

sudo apt-get install -y nvidia-container-toolkit

Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  libnvidia-container-tools libnvidia-container1 nvidia-container-toolkit-base
The following NEW packages will be installed:
  libnvidia-container-tools libnvidia-container1 nvidia-container-toolkit nvidia-container-toolkit-base
0 upgraded, 4 newly installed, 0 to remove and 690 not upgraded.
Need to get 4,194 kB of archives.
After this operation, 16.6 MB of additional disk space will be used.
Get:1 https://nvidia.github.io/libnvidia-container/stable/deb/amd64  libnvidia-container1 1.14.3-1 [923 kB]
Get:2 https://nvidia.github.io/libnvidia-container/stable/deb/amd64  libnvidia-container-tools 1.14.3-1 [19.3 kB]
Get:3 https://nvidia.github.io/libnvidia-container/stable/deb/amd64  nvidia-container-toolkit-base 1.14.3-1 [2,336 kB]
Get:4 https://nvidia.github.io/libnvidia-container/stable/deb/amd64  nvidia-container-toolkit 1.14.3-1 [917 kB]
Fetched 4,194 kB in 1s (3,150 kB/s)           
Selecting previously unselected package libnvidia-container1:amd64.
(Reading database ... 467373 files and directories currently installed.)
Preparing to unpack .../libnvidia-container1_1.14.3-1_amd64.deb ...
Unpacking libnvidia-container1:amd64 (1.14.3-1) ...
Selecting previously unselected package libnvidia-container-tools.
Preparing to unpack .../libnvidia-container-tools_1.14.3-1_amd64.deb ...
Unpacking libnvidia-container-tools (1.14.3-1) ...
Selecting previously unselected package nvidia-container-toolkit-base.
Preparing to unpack .../nvidia-container-toolkit-base_1.14.3-1_amd64.deb ...
Unpacking nvidia-container-toolkit-base (1.14.3-1) ...
Selecting previously unselected package nvidia-container-toolkit.
Preparing to unpack .../nvidia-container-toolkit_1.14.3-1_amd64.deb ...
Unpacking nvidia-container-toolkit (1.14.3-1) ...
Setting up nvidia-container-toolkit-base (1.14.3-1) ...
Setting up libnvidia-container1:amd64 (1.14.3-1) ...
Setting up libnvidia-container-tools (1.14.3-1) ...
Setting up nvidia-container-toolkit (1.14.3-1) ...
Processing triggers for libc-bin (2.31-0ubuntu9.2) ...
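
Once the install finishes, it may be worth confirming that the toolkit binaries are actually on PATH before configuring Docker. The following is a minimal sketch: missing_tools is a hypothetical helper, and the binary names passed to it are assumptions about what the packages listed above install (nvidia-ctk is the one used in the next step).

```shell
#!/bin/sh
# missing_tools (hypothetical helper): print each named command that is
# not found on PATH, one per line. Prints nothing if all are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Binary names are assumptions about what the toolkit packages provide.
missing=$(missing_tools nvidia-ctk nvidia-container-cli nvidia-container-runtime)
if [ -n "$missing" ]; then
    printf 'Not found on PATH:\n%s\n' "$missing"
else
    echo "All toolkit binaries found"
fi
```

If anything is reported as missing, re-check the apt output for errors before going further.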

Step 3 - Docker Configuration

sudo nvidia-ctk runtime configure --runtime=docker


INFO[0000] Loading config from /etc/docker/daemon.json  
INFO[0000] Wrote updated config to /etc/docker/daemon.json
INFO[0000] It is recommended that docker daemon be restarted.

Restart the Docker daemon so it picks up the new runtime:

sudo systemctl restart docker
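
The configure command writes to /etc/docker/daemon.json, so a quick look at that file can confirm the runtime was registered. The sketch below assumes the file ends up with a "runtimes" section containing an "nvidia" key; has_nvidia_runtime is a hypothetical helper and does a plain text match, not a real JSON parse.

```shell
#!/bin/sh
# has_nvidia_runtime (hypothetical helper): succeed if the given Docker
# daemon config mentions both a "runtimes" section and an "nvidia" key.
# This is a crude text match, so unusual formatting could fool it.
has_nvidia_runtime() {
    config=$1
    [ -r "$config" ] || return 1
    grep -q '"runtimes"' "$config" && grep -q '"nvidia"[[:space:]]*:' "$config"
}

if has_nvidia_runtime /etc/docker/daemon.json; then
    echo "nvidia runtime is registered"
else
    echo "nvidia runtime not found in /etc/docker/daemon.json"
fi
```

A text match keeps the check free of extra dependencies; a JSON-aware tool would be more robust if one is available on the host.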

Step 4 - Test

docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.154                Driver Version: 390.154                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...      Off  | 00000000:04:00.0 N/A |              N/A |
| N/A   41C    P0    N/A /  N/A |    0MiB /   16160MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+
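
If a table like the one above appears, the container can see the GPU. For scripted checks, the driver version can be pulled out of the header line. This is a rough sketch rather than an official method: driver_version is a hypothetical helper that parses nvidia-smi's table output from stdin, and SAMPLE is the header line copied from the output above.

```shell
#!/bin/sh
# driver_version (hypothetical helper): read nvidia-smi table output on
# stdin and print the first "Driver Version" value found, if any.
driver_version() {
    sed -n 's/.*Driver Version: *\([0-9.]*\).*/\1/p' | head -n 1
}

# Header line copied from the sample output in this guide.
SAMPLE='| NVIDIA-SMI 390.154                Driver Version: 390.154                   |'

# Print the version parsed from the sample line.
printf '%s\n' "$SAMPLE" | driver_version
```

On a live system the same function could be fed from the test command itself, for example by piping docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi into driver_version.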

 

