Install drivers on a cloud server with GPU
This guide shows, by example, how to install drivers on a cloud server created from the pre-built Ubuntu 24.04 LTS 64-bit image.
For the NVIDIA® GPU on a cloud server to work reliably, you need to install drivers.
If you created a cloud server from a pre-built image that is GPU-optimized, the drivers are already installed, and no additional installation is required. GPU-optimized pre-built images include:
- Ubuntu 24.04 LTS 64-bit GPU Driver 535;
- Ubuntu 24.04 LTS 64-bit GPU Driver 535 Docker;
- Ubuntu 24.04 LTS 64-bit GPU Driver 580;
- Ubuntu 24.04 LTS 64-bit GPU Driver 580 Docker;
- Ubuntu 22.04 LTS 64-bit GPU Driver 535;
- Ubuntu 22.04 LTS 64-bit GPU Driver 535 Docker;
- Ubuntu 22.04 LTS 64-bit GPU Driver 580;
- Ubuntu 22.04 LTS 64-bit GPU Driver 580 Docker;
- Data Science VM (Ubuntu 22.04 LTS 64-bit);
- Data Analytics VM (Ubuntu 22.04 LTS 64-bit).
Install drivers
1. Install the ubuntu-drivers-common package:

       sudo apt install -y ubuntu-drivers-common alsa-utils

2. View the recommended driver version:

       sudo ubuntu-drivers devices

   The response lists the available driver versions. The recommended version is marked as recommended. Copy the recommended version.

   Example for an NVIDIA® Tesla T4 GPU with the recommended version nvidia-driver-550:

       == /sys/devices/pci0000:00/0000:00:06.0 ==
       modalias : pci:v000010DEd00001EB8sv000010DEsd000012A2bc03sc02i00
       vendor   : NVIDIA Corporation
       model    : TU104GL [Tesla T4]
       manual_install: True
       driver   : nvidia-driver-450-server - distro non-free
       driver   : nvidia-driver-535-server - distro non-free
       driver   : nvidia-driver-470-server - distro non-free
       driver   : nvidia-driver-470 - distro non-free
       driver   : nvidia-driver-550 - third-party non-free recommended
       driver   : nvidia-driver-418-server - distro non-free
       driver   : xserver-xorg-video-nouveau - distro free builtin

3. Optional: verify that the selected driver version is higher than the minimum compatible version for the cloud server's GPU architecture:

       sudo apt-cache search 'nvidia-driver-*'

   The response lists the compatible driver versions. You can view the GPU architecture in the Graphics Processors (GPU) guide, and the correspondence between driver versions and architectures in the CUDA Compatibility documentation by NVIDIA®.

4. If the GPU architecture is Pascal (for example, an NVIDIA® GTX 1080), add the NVIDIA® Personal Package Archive (PPA) repository to the cloud server:

       sudo add-apt-repository ppa:graphics-drivers/ppa -y

5. Install the kernel headers:

       sudo apt update
       for kernel in $(linux-version list); do sudo apt install -y "linux-headers-<kernel-version>"; done

   Replace <kernel-version> with the kernel version. You can view the list of kernel versions using the apt-cache search linux-image command.

6. Install the driver:

       sudo apt install -y <driver_version>

   Replace <driver_version> with the driver version you copied in step 2.

   Example of installing the recommended version nvidia-driver-550 for an NVIDIA® Tesla T4 GPU:

       sudo apt install -y nvidia-driver-550

7. Verify that the driver is installed and working:

       nvidia-smi

   The response shows the NVIDIA-SMI and driver versions. Example output:

       +-----------------------------------------------------------------------------------------+
       | NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
       |-----------------------------------------+------------------------+----------------------+
       | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
       | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
       |                                         |                        |               MIG M. |
       |=========================================+========================+======================|
       |   0  Tesla T4                       Off |   00000000:00:06.0 Off |                    0 |
       | N/A   41C    P8             10W /   70W |       0MiB /  15360MiB |      0%      Default |
       |                                         |                        |                  N/A |
       +-----------------------------------------+------------------------+----------------------+

       +-----------------------------------------------------------------------------------------+
       | Processes:                                                                              |
       |  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
       |        ID   ID                                                               Usage      |
       |=========================================================================================|
       |  No running processes found                                                             |
       +-----------------------------------------------------------------------------------------+

8. Open the configuration file for the unattended-upgrades package, which handles security updates:

       sudo nano /etc/apt/apt.conf.d/50unattended-upgrades

9. Disable automatic updates of the NVIDIA® and kernel packages. To do this, add the following block to the file:

       Unattended-Upgrade::Package-Blacklist {
           "linux-";
           "nvidia-";
       };

10. Save your changes and exit the nano text editor: press Ctrl+X, then Y, then Enter.

11. Optional: pin the kernel version to disable kernel updates. Updating the kernel version may cause errors in GPU driver operation.
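Finding the recommended driver and installing it can also be scripted. The following is a minimal sketch, not part of the official procedure: the helper name recommended_driver is hypothetical, and the sketch assumes that the output of ubuntu-drivers devices keeps the line layout shown in the Tesla T4 example. To stay runnable without a GPU, it parses a sample copied from that example. The commands for a real server are left as comments.

```shell
#!/bin/sh
# Hypothetical helper: read `ubuntu-drivers devices` output on stdin and
# print the package name from the line marked "recommended".
recommended_driver() {
    awk '$1 == "driver" && $NF == "recommended" { print $3; exit }'
}

# Sample lines copied from the Tesla T4 example in this guide.
sample='driver   : nvidia-driver-535-server - distro non-free
driver   : nvidia-driver-550 - third-party non-free recommended
driver   : xserver-xorg-video-nouveau - distro free builtin'

driver=$(printf '%s\n' "$sample" | recommended_driver)
echo "Recommended driver: ${driver}"

# On a real server (not executed in this sketch):
#   driver=$(sudo ubuntu-drivers devices | recommended_driver)
#   [ -n "$driver" ] && sudo apt install -y "$driver"
```

The empty-string check in the commented usage keeps apt from running when no line is marked as recommended.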
Pin the kernel version
In pre-built images with pre-installed drivers, except for Data Analytics VM (Ubuntu 22.04 LTS 64-bit) and Data Science VM (Ubuntu 22.04 LTS 64-bit), the kernel version is already pinned.
During installation, the driver's kernel modules are compiled against the headers of the current kernel version. If the kernel version changes, the modules no longer match the kernel and the GPU driver stops working. In that case, the following error may appear in the output of the nvidia-smi command:
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
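When this error appears, one way to check whether a kernel change is the cause is to compare the running kernel with the kernels for which DKMS has built the NVIDIA module. The sketch below is illustrative only. The helper names nvidia_built_for and check_kernel are hypothetical, and the sample line assumes a dkms status format that may differ between DKMS versions.

```shell
#!/bin/sh
# Hypothetical helper: succeed if `dkms status` output on stdin reports an
# installed nvidia module for the kernel version passed as $1.
nvidia_built_for() {
    grep '^nvidia' | grep -F ", $1," | grep -q 'installed'
}

# Illustrative `dkms status` line; the exact format is an assumption.
sample_status='nvidia/550.54.15, 6.8.0-31-generic, x86_64: installed'

# Report whether the sample lists a module for the given kernel version.
check_kernel() {
    if printf '%s\n' "$sample_status" | nvidia_built_for "$1"; then
        echo "module present for $1"
    else
        echo "module missing for $1"
    fi
}

check_kernel "6.8.0-31-generic"
check_kernel "6.8.0-45-generic"

# On a real server (not executed in this sketch):
#   dkms status | nvidia_built_for "$(uname -r)" || echo "module missing"
```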
To disable kernel updates, pin the kernel version in the settings of the apt package manager. You can still update the kernel later by removing the pin first.
1. Open the CLI.

2. Create the pin-linux-kernel-nvidia-dkms file in the /etc/apt/preferences.d directory to pin the versions of the linux-headers and linux-image packages:

       cat <<EOF > /etc/apt/preferences.d/pin-linux-kernel-nvidia-dkms
       Package: linux-image-*
       Pin: version *
       Pin-Priority: -1

       Package: linux-headers-*
       Pin: version *
       Pin-Priority: -1
       EOF
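To confirm that the pin took effect, you can inspect what apt would install for a kernel package. The sketch below assumes that apt-cache policy prints a "Candidate:" line and reports "(none)" when every version has a negative priority. The helper name policy_candidate, the package linux-image-generic, and the sample text are illustrative, not taken from this guide.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the "Candidate:" line from
# `apt-cache policy <package>` output read on stdin.
policy_candidate() {
    awk '$1 == "Candidate:" { print $2; exit }'
}

# Illustrative output for a pinned package (assumed, not captured from a server).
sample_policy='linux-image-generic:
  Installed: 6.8.0-31.31
  Candidate: (none)
  Version table:'

candidate=$(printf '%s\n' "$sample_policy" | policy_candidate)
if [ "$candidate" = "(none)" ]; then
    echo "pinned: apt has no install candidate"
else
    echo "not pinned: apt would install $candidate"
fi

# On a real server (not executed in this sketch):
#   apt-cache policy linux-image-generic | policy_candidate
```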
Update the kernel version after pinning
While the kernel version is pinned, apt does not update it. To receive security updates, performance improvements, and new features, remove the kernel version pinning file and then update the kernel.
1. Open the CLI.

2. Delete the file you created to pin the kernel version:

       rm /etc/apt/preferences.d/pin-linux-kernel-nvidia-dkms

3. Update the kernel version:

       apt install linux-image-<kernel-version>

   Replace <kernel-version> with the kernel version. You can view the list of kernel versions using the apt-cache search linux-image command.

4. Install the kernel headers:

       apt install linux-headers-$(uname -r)

   After the kernel headers are installed, the dkms utility starts and automatically recompiles the NVIDIA modules for the new kernel version.
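Note that uname -r reports the kernel that is currently running, so the headers command targets the new kernel only after the server has booted into it. The sketch below shows one way to compare the running kernel with the newest installed one. The helper name newest_kernel and the sample versions are hypothetical, and the sketch assumes GNU sort with the -V (version sort) option.

```shell
#!/bin/sh
# Hypothetical helper: print the highest version from a list of kernel
# versions read on stdin, one per line (uses GNU sort -V).
newest_kernel() {
    sort -V | tail -n 1
}

# Illustrative versions; on a server they would come from `linux-version list`.
sample_kernels='6.8.0-31-generic
6.8.0-45-generic
6.8.0-40-generic'

running='6.8.0-31-generic'   # stands in for $(uname -r)
newest=$(printf '%s\n' "$sample_kernels" | newest_kernel)

if [ "$running" != "$newest" ]; then
    echo "reboot needed; headers package: linux-headers-$newest"
else
    echo "already running $newest"
fi

# On a real server (not executed in this sketch):
#   running=$(uname -r)
#   newest=$(linux-version list | newest_kernel)
```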