Run a GPU-accelerated application in a Docker container on a cloud server

Docker containers can be used on GPU cloud servers to flexibly manage GPU-accelerated applications without additional environment customization.

A containerized environment allows you to:

  • consume resources optimally — you can run multiple applications with different environment requirements on one server;
  • avoid CUDA Toolkit versioning conflicts between your applications.

Servercore has Docker-ready images available to run GPU-accelerated applications in containerized environments:

  • Ubuntu 24.04 LTS 64-bit GPU Driver 535 Docker;
  • Ubuntu 24.04 LTS 64-bit GPU Driver 580 Docker;
  • Ubuntu 22.04 LTS 64-bit GPU Driver 535 Docker;
  • Ubuntu 22.04 LTS 64-bit GPU Driver 580 Docker.

Cloud server requirements

The cloud server must have:

  • a configuration with GPUs;
  • a source image with GPU drivers and Docker pre-installed (one of the images listed above);
  • a network or local disk larger than 40 GB.

Run a GPU-accelerated application in a Docker container on a server

  1. Run the pytorch-cuda sample in a Docker container.

  2. Create your own Docker image with CUDA.

1. Run the pytorch-cuda sample in a Docker container

Run PyTorch inside a GPU-enabled Docker container.

  1. Open the CLI.

  2. Make sure the GPU on the server is working correctly:

    nvidia-smi

    The response lists the NVIDIA-SMI version, the driver version, and the highest CUDA version compatible with the current driver (this CUDA version is not necessarily installed on the system). For example:

    +-----------------------------------------------------------------------------------------+
    | NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
    |-----------------------------------------+------------------------+----------------------+
    | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
    |                                         |                        |               MIG M. |
    |=========================================+========================+======================|
    |   0  Tesla T4                       Off |   00000000:00:06.0 Off |                    0 |
    | N/A   41C    P8              10W /  70W |        0MiB / 15360MiB |      0%      Default |
    |                                         |                        |                  N/A |
    +-----------------------------------------+------------------------+----------------------+

    +-----------------------------------------------------------------------------------------+
    | Processes:                                                                              |
    |  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
    |        ID   ID                                                               Usage      |
    |=========================================================================================|
    |  No running processes found                                                             |
    +-----------------------------------------------------------------------------------------+
  3. Start the container from the NVIDIA Container Registry:

    sudo docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:<pytorch_version>-py3 bash

    Specify <pytorch_version> — the version of the PyTorch container image.

  4. Start the Python interpreter inside the container (run python3) and make sure the CUDA Toolkit is installed and the GPU is available for computation:

    import torch

    print("CUDA Available: ", torch.cuda.is_available())
    print("Number of GPUs: ", torch.cuda.device_count())

    Example output:

    CUDA Available:  True
    Number of GPUs: 1
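    Optionally, run a small computation on the GPU to confirm it is usable and not only detected. A minimal sketch to paste into the same Python session (the matrix size is arbitrary):

    import torch

    # Allocate two random matrices directly on the first GPU and multiply them.
    device = torch.device("cuda:0")
    a = torch.randn(1024, 1024, device=device)
    b = torch.randn(1024, 1024, device=device)
    c = a @ b

    print("Result device:", c.device)   # expected: cuda:0
    print("Checksum:", c.sum().item())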
  5. Exit the Python interpreter (for example, with exit()) and make sure the container has CUDA Runtime 12.1 installed, which the PyTorch version in the container requires:

    conda list | grep cud

    Example output:

    libcudnn9-cuda-12         9.1.1.17                 0    nvidia
    cuda-cudart               12.1.105                 0    nvidia
    cuda-cupti                12.1.105                 0    nvidia
    cuda-libraries            12.1.0                   0    nvidia
    cuda-nvrtc                12.1.105                 0    nvidia
    cuda-nvtx                 12.1.105                 0    nvidia
    cuda-opencl               12.3.101                 0    nvidia
    cuda-runtime              12.1.0                   0    nvidia

    The CUDA Runtime does not need to be installed on the server's host OS; the container provides its own copy.
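    If you prefer to check this from Python, PyTorch can also report the CUDA and cuDNN versions it was built against. A minimal sketch to run in the container's python3 interpreter:

    import torch

    # CUDA runtime version this PyTorch build was compiled against (should match the conda listing, e.g. 12.1).
    print("Built with CUDA:", torch.version.cuda)

    # cuDNN version bundled with this PyTorch build.
    print("cuDNN version:", torch.backends.cudnn.version())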

2. Create your own Docker image with CUDA

  1. Start a container from the prebuilt NVIDIA CUDA image:

    docker run --gpus all -it --rm nvcr.io/nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04

    Compatible versions of the CUDA Toolkit, CUDA Runtime, and libcudnn are pre-installed in the container:

    cuda-cudart-12-8                12.8.90-1       amd64    CUDA Runtime native Libraries
    cuda-nvcc-12-8                  12.8.93-1       amd64    CUDA nvcc
    cuda-toolkit-config-common      12.8.90-1       all      Common config package for CUDA Toolkit.
    libcudnn9-cuda-12               9.8.0.87-1      amd64    cuDNN runtime libraries for CUDA 12.8
  2. Install Python 3, pip, and TensorFlow:

    apt update && apt -y install python3 python3-pip
    python3 -m pip config set global.break-system-packages true
    python3 -m pip install tensorflow
  3. Make sure the GPU is available in the Docker container:

    python3 -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000]))); print('Available GPUs:', tf.config.list_physical_devices('GPU'))"

    Example output:

    I0000 00:00:1743408862.613883     910 gpu_device.cc:2019] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 4287 MB memory:  -> device: 0, name: NVIDIA RTX A2000, pci bus id: 0000:00:06.0, compute capability: 8.6
    tf.Tensor(-1418.5072, shape=(), dtype=float32)
    Available GPUs: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
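    To also confirm that operations actually execute on the GPU rather than falling back to the CPU, you can use a sketch like the one below (paste it into python3; set_log_device_placement only makes TensorFlow log the device each operation runs on):

    import tensorflow as tf

    # Log which device each operation is placed on.
    tf.debugging.set_log_device_placement(True)

    # Pin a small matrix multiplication to the first GPU.
    with tf.device("/GPU:0"):
        a = tf.random.normal([1000, 1000])
        b = tf.random.normal([1000, 1000])
        c = tf.matmul(a, b)

    print("Result shape:", c.shape)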
  4. Detach from the container shell without stopping the container: press Ctrl + P and then Ctrl + Q.

  5. Verify that the container is running:

    docker ps -a

    Example output:

    CONTAINER ID   IMAGE                                                 COMMAND                  CREATED          STATUS          PORTS     NAMES
    20d557a37bdd   nvcr.io/nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04    "/opt/nvidia/nvidia_…"   24 minutes ago   Up 24 minutes             nifty_shtern

    From the CONTAINER ID column, copy the ID of the container you started in step 1.

  6. Create an image:

    docker commit <container_id> <image_tag>

    Specify:

    • <container_id> — the ID of the container you copied in step 5;
    • <image_tag> — the name and optional tag to assign to the new image.

    If the image was created successfully, its hash is printed. Example output:

    sha256:a7ff970295e5dd37ef441fcf0462752715c95cece2729ddcc774a8aaa0773bce
  7. Create and run your own container from the image:

    docker run --gpus all --rm -it <image_tag> bash

    Specify <image_tag> — the image tag you created in step 6.

    The --gpus all flag gives the container access to the server's GPUs; --rm removes the container automatically after you exit its bash shell.
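    To double-check that the committed image still contains everything it needs, you can repeat the earlier check inside this new container; the sketch below assumes the TensorFlow installation from step 2 was committed into the image. Start python3 and run:

    import tensorflow as tf

    # List the GPUs visible to TensorFlow inside the container created from the committed image.
    gpus = tf.config.list_physical_devices("GPU")
    print("Available GPUs:", gpus)

    # Fail loudly if no GPU is visible, e.g. when --gpus all was omitted from docker run.
    assert gpus, "No GPU detected inside the container"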