CloudFerro GPU spring
Spring is in full swing: the leaves are unfolding, and CloudFerro is installing new GPUs.
Not so long ago, H100 94GB cards were installed in the WAW4-1 cloud region. They are available as flavors with 1, 2 or 4 cards per instance in passthrough mode.
This week, L40S 48GB cards were installed and made available to customers in this region. They are offered as shared virtualized flavors with 1/8, 1/4, 1/2 or a full card per instance.
This region is also awaiting the installation of H200 141GB cards, which will soon be available to our users. Like the H100 94GB, they will be offered in passthrough mode with 1, 2 or 4 cards per instance, plus an additional flavor with 8 cards per instance.
This expansion will extend the existing CloudFerro GPU infrastructure, which already lets users easily deploy GPU-demanding workloads such as AI (both Large Language Models and deep learning) and image processing.
If we compare the core statistics of these GPUs, such as memory size, memory bandwidth and overall TFLOPS (tera floating-point operations per second) performance, the H100 and H200 look superior. However, if we take a closer look at other parameters, such as the number of CUDA (Compute Unified Device Architecture) cores, we can see that the other cards are a very good choice for many tasks. For example, L40S cards are optimized for 3D workloads and have more CUDA cores than even the most advanced H200 cards.
How to prepare and run cloud GPU resources?
Here is a summary of the documentation describing how to prepare and run cloud resources using GPUs. Most documents refer to A6000 and/or L40S cards, but the same steps apply to H100 and H200 cards.
We provide GPUs in two modes: virtualized (vGPU) or passthrough.
- vGPU – the instance gets a share of a card through virtualization software. A specific driver version must be used; CloudFerro provides it within a dedicated image. To create such instances, use flavors whose names start with “vm.”
- Passthrough – the instance gets one or more whole cards as PCIe devices, with direct and exclusive access for the instance lifetime. To create such instances, use flavors whose names start with “gpu.”
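As a quick sketch, the naming convention above can be used to filter the flavor list with the OpenStack CLI (the flavor names in the output vary per region; an OpenStack RC file must be sourced first):

```shell
# List only GPU-capable flavors: "vm.*" (vGPU) and "gpu.*" (passthrough).
openstack flavor list -c Name -f value | grep -E '^(vm|gpu)\.'
```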
To create a virtual machine instance with a virtualized GPU, please follow this documentation:
How To Create a New Linux VM With NVIDIA Virtual GPU in the OpenStack Dashboard Horizon
The current documentation release mentions A6000 flavors in the WAW3-1 cloud region, but it applies to all regions where flavors named “vm.*” are selectable.
To create a virtual machine instance with a passthrough GPU, please follow this documentation:
How to create a Linux VM and access it from Linux command line on CREODIAS.
As mentioned before, please use the flavors named “gpu.*”.
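For command-line users, creating a passthrough instance boils down to something like the sketch below. The flavor, image, network and key names are placeholder assumptions – substitute the values from your own project:

```shell
# Create a VM with a passthrough GPU flavor ("gpu.*").
# All names below are placeholders; list the real ones with
# "openstack flavor list", "openstack image list", and so on.
openstack server create \
  --flavor gpu.h100.1 \
  --image "Ubuntu 22.04 LTS" \
  --network my-network \
  --key-name my-keypair \
  my-gpu-instance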
Once the instance is created, the GPU will not be usable immediately. To take advantage of it, install NVIDIA drivers according to this documentation:
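On an Ubuntu instance, the installation typically looks like the sketch below. The driver package version is an assumption – use the one specified in the documentation for your card (and note that vGPU flavors need the CloudFerro-provided driver image instead of a standard driver):

```shell
# Install the NVIDIA driver (the version 535 is a placeholder).
sudo apt update
sudo apt install -y nvidia-driver-535
sudo reboot
# After the reboot, verify that the card is visible:
nvidia-smi
```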
GPU instances in Kubernetes
GPU instances may also be used in a Kubernetes cluster created with OpenStack Magnum.
For this scenario, please follow this document: Deploying vGPU workloads on CREODIAS Kubernetes
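Once the cluster's GPU support is set up per that document, a workload typically requests a GPU through the `nvidia.com/gpu` resource exposed by the NVIDIA device plugin. A minimal sketch (the container image tag is an illustrative assumption):

```yaml
# Minimal pod requesting one GPU (or vGPU slice).
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda-check
      image: nvidia/cuda:12.2.0-base-ubuntu22.04   # illustrative image tag
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1   # one GPU (or vGPU slice) per pod
```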
Common data processing cases
We also provide a set of dedicated documents for common data processing cases:
How to start using GPUs at CREODIAS?
If you already have an account at CREODIAS, you can just follow the above tips to deploy your GPU workloads.
If you are not a CREODIAS user yet, please start by registering – see this document: Registration and Setting up an Account. Full welcome documentation can be found here: Welcome to CREODIAS Documentation.
Author: Mateusz Ślaski, Sales Support Engineer at CloudFerro