Create a cloud server with GPU

You can add GPUs (graphics processing units) to a cloud server, either when creating the server or on an existing server.

GPUs are used as dedicated PCI devices inside the cloud server.
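Because each GPU is passed through as a dedicated PCI device, it appears in the guest OS like any other PCI device. A quick way to confirm the GPU is visible — a sketch assuming a Linux guest with the standard pciutils package installed:

```shell
# List PCI devices and filter for NVIDIA GPUs (requires pciutils)
lspci -nn | grep -i nvidia

# A typical match is a "3D controller" or "VGA compatible controller"
# entry with the NVIDIA vendor ID [10de:...]. If nothing is printed,
# the GPU has not been attached to the server.
```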

GPUs are available in fixed GPU Line configurations and in arbitrary configurations.

GPU Line and arbitrary GPU configurations can be used with a local or network boot disk. For cloud servers with a local disk, only the NVIDIA® A100 40Gb or NVIDIA® A30 in the ru-7a pool segment can be used.

If you need a server with a set of preconfigured tools and libraries for machine learning and data analysis, use the AI Marketplace.

Create a server with GPU

Use the instructions to Create a cloud server.

Select:

  • source — a ready-made GPU-optimized image: check the GPU optimized images checkbox to filter GPU-optimized OS images. These images contain the drivers needed to work with GPUs. If you choose a different source, you must install the drivers on the server yourself to ensure stable operation of NVIDIA® GPUs;
  • configuration — a fixed GPU Line configuration or an arbitrary configuration with at least 2 vCPUs.

Add a GPU to an existing cloud server

If the cloud server has an arbitrary configuration, GPUs can be added to it.

For cloud servers with local disk, only NVIDIA® A100 40Gb or NVIDIA® A30 can be added in the ru-7a pool segment.

  1. In the Control panel, on the top menu, click Products and select Cloud Servers.
  2. Open the server page → Configuration tab.
  3. Click Change Configuration.
  4. Ensure that an arbitrary configuration is selected in the Change Configuration block.
  5. Click Add GPU. If the server has 1 vCPU, the value will automatically change to 2 vCPUs.
  6. Select the GPU type.
  7. Specify the number of GPUs.
  8. Click Save and reboot.
  9. If the server was not created from a ready-made GPU-optimized image, install the drivers on the server yourself to ensure stable operation of NVIDIA® GPUs.
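Step 9 can look like the following on an Ubuntu server. This is a hedged sketch, not the only supported method; package names and driver branch numbers vary by distribution and GPU model:

```shell
# Detect the GPU and the recommended driver (Ubuntu's ubuntu-drivers tool)
sudo apt update
ubuntu-drivers devices

# Install the recommended proprietary driver automatically...
sudo ubuntu-drivers autoinstall
# ...or pin a specific branch explicitly (the 535 series is an example only):
# sudo apt install -y nvidia-driver-535

# Reboot so the kernel modules load, then verify:
sudo reboot
# after reboot, nvidia-smi should list the GPU, driver version, and CUDA version
nvidia-smi
```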

Available GPUs

| GPU | Memory | CUDA cores | Tensor cores |
| --- | --- | --- | --- |
| NVIDIA® A100 40Gb (also with NVLink, on request) | 40 GB HBM2 | 6912 | 432 |
| NVIDIA® A100 80Gb | 80 GB HBM2 | 6912 | 432 |
| NVIDIA® Tesla T4 | 16 GB GDDR6 | 2560 | 320 |
| NVIDIA® A30 | 24 GB HBM2 | 3804 | 224 |
| NVIDIA® A2 (updated analog of Tesla T4) | 16 GB GDDR6 | 1280 | 40 |
| NVIDIA® GTX 1080 | 8 GB GDDR5X | 2560 | — |
| NVIDIA® RTX 2080 Ti | 11 GB GDDR6 | 4352 | 544 |
| NVIDIA® RTX 4090 | 24 GB GDDR6X | 16384 | 512 |
| NVIDIA® RTX 6000 Ada (analog of L40) | 48 GB GDDR6X | 18176 | 568 |
| NVIDIA® A2000 (analog of RTX 3060) | 6 GB GDDR6 | 3328 | 104 |
| NVIDIA® A5000 (analog of RTX 3080) | 24 GB GDDR6 | 8192 | 256 |

You can view the current list of GPUs in the control panel: in the top menu, click Products → Cloud Servers, then click Create Server.

To see GPU availability in the regions, see the GPU availability matrix for cloud servers.

NVIDIA® A100 40Gb

Offers maximum performance for AI, HPC and data processing. Suitable for deep learning, scientific research and data analytics.

Based on the Ampere® architecture, with 40 GB of HBM2 memory and up to 1.5 TB/s of memory bandwidth. See NVIDIA® documentation for detailed specifications.

Fixed GPU Line configurations are available from 1 to 8 GPUs × 40 GB, with vCPUs from 6 to 48, RAM from 87 to 700 GB.

Arbitrary configurations are available with 1 to 8 GPUs × 40 GB, vCPUs from 2 to 32, and RAM from 512 MB to 256 GB.

You can combine two NVIDIA® A100 40Gb GPUs using NVLink technology.

NVLink provides a faster GPU-to-GPU interconnect than PCIe. GPUs interconnected with NVLink can use more memory together and improve server performance on complex computations, such as training large language models.

NVLink works with NVIDIA® A100 40Gb GPUs. See detailed NVIDIA® A100 40Gb specifications and a description of NVLink technology in the NVIDIA® documentation.
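On a server with NVLink-connected A100 GPUs, you can confirm that the link is active with nvidia-smi — a sketch using standard nvidia-smi subcommands:

```shell
# Show NVLink state and per-link bandwidth for each GPU
nvidia-smi nvlink --status

# Show the GPU interconnect topology: NVLink-connected pairs are marked NV#,
# while PCIe-only paths are marked PHB/PXB/PIX
nvidia-smi topo -m
```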

NVIDIA® A100 40Gb NVLink is available upon request — create a ticket.

NVIDIA® A100 80Gb

Offers maximum performance for AI, HPC, and data processing, as well as large memory capacity for compute-intensive tasks. Suitable for deep learning, scientific research, and data analytics.

Based on the Ampere® architecture, with 80 GB of HBM2 memory and up to 1.5 TB/s of memory bandwidth. See detailed specifications in NVIDIA® documentation.

Fixed GPU Line configurations are available from 1 to 8 GPUs × 80 GB, with vCPUs from 12 to 96, RAM from 128 to 1000 GB, and local disk from 128 GB to 6.88 TB.

NVIDIA® Tesla T4

Suitable for Machine Learning and Deep Learning, inference, graphics and video rendering. Works with most AI frameworks and is compatible with all types of neural networks.

Based on the Turing® architecture, with 16GB GDDR6 memory and up to 300GB/s of bandwidth. See detailed specifications in NVIDIA® documentation.

In fixed GPU Line configurations, 1 to 4 GPUs × 16 GB are available, with vCPUs from 4 to 24, RAM from 32 to 320 GB.

Arbitrary configurations are available with 1 to 4 GPUs × 16 GB, vCPUs from 2 to 32, and RAM from 512 MB to 256 GB.

NVIDIA® A30

Suitable for AI-inference, HPC, language processing, conversational artificial intelligence, recommender systems.

Based on Ampere® architecture, with 24GB HBM2 memory and up to 933GB/s of bandwidth. See detailed specifications in NVIDIA® documentation.

In fixed GPU Line configurations, 1 to 2 GPUs × 24 GB are available, with vCPUs from 16 to 48, RAM from 64 to 320 GB.

Arbitrary configurations are available with 1 to 2 GPUs × 24 GB, vCPUs from 2 to 32, and RAM from 512 MB to 256 GB.

NVIDIA® A2

An entry-level GPU. Suitable for simple inference, video and graphics, Edge AI (edge computing), Edge video, mobile cloud gaming.

Based on the Ampere® architecture, with 16GB GDDR6 memory and up to 200GB/s bandwidth. See NVIDIA® documentation for detailed specifications.

In fixed GPU Line configurations, 1 to 4 GPUs × 16 GB are available, with vCPUs from 12 to 48, RAM from 32 to 320 GB.

Arbitrary configurations are available with 1 to 4 GPUs × 16 GB, vCPUs from 2 to 32, and RAM from 512 MB to 256 GB.

NVIDIA® GTX 1080

A high-performance, energy-efficient GPU built with FinFET technology and GDDR5X memory. Dynamic load balancing divides tasks so resources don't sit idle. Maximizes performance for display, VR, ultra-high-resolution settings, and data processing.

Based on Pascal® architecture, with 8GB GDDR5X memory and up to 320GB/s of bandwidth. See detailed specifications in NVIDIA® documentation.

In fixed GPU Line configurations, 1 to 8 GPUs × 8 GB are available, with vCPUs from 8 to 28, RAM from 24 to 96 GB.

Arbitrary configurations are available with 1 to 8 GPUs × 8 GB, vCPUs from 2 to 32, and RAM from 512 MB to 256 GB.

NVIDIA® RTX 2080 Ti

High-performance GPU for demanding graphics tasks. Suitable for high-resolution video processing, 3D modeling, rendering, and photo processing. Also suitable for training neural networks, performing complex AI computations, and processing large amounts of data.

Based on the Turing® architecture, with 11GB GDDR6 memory and up to 616GB/s of bandwidth. See detailed specifications in NVIDIA® documentation.

In fixed GPU Line configurations, 1 to 4 GPUs × 11 GB are available, with vCPUs from 2 to 48, RAM from 32 to 320 GB.

Arbitrary configurations are available with 1 to 4 GPUs × 11 GB, vCPUs from 2 to 32, and RAM from 512 MB to 256 GB.

NVIDIA® RTX 4090

The highest-performing GPU in the GeForce series. Suitable for professional design and 3D modeling, video, rendering, ML tasks (model training and inference), LLMs, and scientific and engineering computing (e.g., climate modeling or bioinformatics).

Based on Ada Lovelace® architecture, with 24GB GDDR6X memory and up to 1008GB/s of bandwidth. See detailed specifications in NVIDIA® documentation.

In fixed GPU Line configurations, 1 to 4 GPUs × 24 GB are available, with vCPUs from 4 to 64, RAM from 16 to 356 GB.

Arbitrary configurations are available with 1 to 4 GPUs × 24 GB, vCPUs from 2 to 32, and RAM from 4 to 256 GB.

NVIDIA® RTX 6000 Ada

Professional GPU for computing and graphics power. Suitable for ML tasks, rendering, scientific computing and high-performance visualization.

Based on Ada Lovelace® architecture, with 48GB GDDR6X memory and up to 960GB/s of bandwidth. See detailed specifications in NVIDIA® documentation.

In fixed GPU Line configurations, 1 to 4 GPUs × 48 GB are available, with vCPUs from 12 to 96, RAM from 64 to 450 GB, and local disk from 64 GB to 2 TB.

NVIDIA® A2000

Power-efficient GPU for compact workstations. Suitable for AI, graphics and video rendering.

Based on Ampere® architecture, with 6GB GDDR6 memory and up to 288GB/s of bandwidth. See NVIDIA® documentation for detailed specifications.

In fixed GPU Line configurations, 1 to 4 GPUs × 6 GB are available, with vCPUs from 6 to 24, RAM from 16 to 320 GB.

Arbitrary configurations are available with 1 to 4 GPUs × 6 GB, vCPUs from 2 to 32, and RAM from 512 MB to 256 GB.

NVIDIA® A5000

A versatile GPU, suitable for any task within its performance limits.

Based on Ampere® architecture, with 24GB GDDR6 memory and up to 768GB/s of bandwidth. See detailed specifications in NVIDIA® documentation.

In fixed GPU Line configurations, 1 to 2 GPUs × 24 GB are available, with vCPUs from 8 to 48, RAM from 32 to 320 GB.

Arbitrary configurations are available with 1 to 2 GPUs × 24 GB, vCPUs from 2 to 32, and RAM from 512 MB to 256 GB.