Storage
CREODIAS users can choose between storage options that differ in price, IO performance, access speed and data resilience, and pick the configuration that best suits their project. Worth noting are the new, powerful configurations with local storage: compared to the previously available solutions, they offer more than 10x better IO performance and 10x lower latency.
In this article you will learn about the features and performance of the different types of storage.
From the service point of view, CREODIAS offers the following options:
- Volume storage (SSD or HDD network storage), which can also be used to boot a VM if configured to do so;
- VM related storage (SSD network storage), which is provided only together with a VM and is used as its default system disk. This storage is physically identical to SSD Volume Storage;
- Object Storage, which combines storage space and metadata with a special protocol to access the data;
- Local NVMe storage for DS servers (each DS server receives two identical, very fast NVMe PCIe drives);
- Local ephemeral storage for HMD virtual machines, in which a physical, very fast NVMe drive is attached to a VM.
From the physical point of view these storage options correspond to three storage media types:
- Network HDD Ceph storage - a cheap, reliable, very resilient and very large storage pool. It is available both as block (volumes) and object (S3) storage. The two storage types have different points of access, with different costs and performance.
- Network SSD Ceph storage - a fast, reliable and resilient storage. It is the default storage medium for VMs. VM related storage and SSD volume storage are stored on this type of media.
- Local compute storage (usually NVMe) - this storage is located on a very fast disk inside the server that hosts your VM. This means that when the compute server encounters a hardware malfunction, the storage medium becomes inaccessible or, on very rare occasions, you may experience data loss. The NVMe drive that hosts data for the HMD configuration is a single, very reliable, high-performance drive with MTBF>2M h and up to 400k IOPS.
The two drives in a DS server are usually NVMe SSD drives that are presented to the client OS through a passthrough mechanism. We encourage our users to configure the drives as a software RAID1 array (mdadm) to add some data protection against hardware malfunctions.
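As an illustration of the RAID1 recommendation, the following minimal sketch builds the mdadm command line for mirroring two drives. The device names (/dev/nvme0n1, /dev/nvme1n1, /dev/md0) are assumptions for the example, not values taken from CREODIAS; check the actual names on your DS server before running anything. The sketch only prints the command and does not execute it.

```python
import shlex


def mdadm_raid1_command(md_device, members):
    """Build the mdadm argv that creates a two-drive RAID1 (mirror) array."""
    if len(members) != 2:
        raise ValueError("this sketch expects exactly two member drives")
    return [
        "mdadm", "--create", md_device,
        "--level=1",                       # RAID1: every block is mirrored
        f"--raid-devices={len(members)}",  # number of active drives in the array
        *members,
    ]


# Hypothetical device names; verify them (for example with lsblk) first.
cmd = mdadm_raid1_command("/dev/md0", ["/dev/nvme0n1", "/dev/nvme1n1"])
print(shlex.join(cmd))
```

Creating the array destroys existing data on both drives, so it should be done before the drives are put to use.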
Network drives are more reliable by a few orders of magnitude because they are built from hundreds of storage servers and thousands of discs. The obvious downside of any network storage is a need to transport the data over the network. As a result, in comparison with local solutions, it takes more time to get the response from the network storage medium.
In scenarios where data can be read or written in many queues, network storage has the substantial advantage of hundreds of individual drives that can be accessed in parallel. IO and bandwidth performance therefore scale up considerably with the number of queues. That is why network storage is ideal for parallel operations.
Local storage depends on the performance of the physical media and cannot rely on thousands of drives to boost performance. In the HMD and DS solutions we use very fast local NVMe drives. For this reason, those configurations are ideal in scenarios that need very low latency and very high IO.
Here are some example results from tests we performed on a VM on CREODIAS.
Read
Network HDD Storage single queue IOPS performance 4k blocks - 1120 IOPS
Network HDD Storage multi queue IOPS performance 4k blocks - 44000 IOPS
Network HDD Storage maximum bandwidth on 4M blocks - 2169 MiB/s
Network SSD Storage single queue IOPS performance 4k blocks - 1500 IOPS
Network SSD Storage multi queue IOPS performance 4k blocks - 47000 IOPS
Network SSD Storage maximum bandwidth on 4M blocks - 3269 MiB/s
Local HMD storage single queue IOPS performance 4k blocks - 34500 IOPS
Local HMD storage multi queue IOPS performance 4k blocks - 337000 IOPS
Local HMD storage maximum bandwidth on 4M blocks - 2963 MiB/s
Write
Network HDD Storage single queue IOPS performance 4k blocks - 96 IOPS
Network HDD Storage multi queue IOPS performance 4k blocks - 2948 IOPS
Network HDD Storage maximum bandwidth on 4M blocks - 260 MiB/s
Network SSD Storage single queue IOPS performance 4k blocks - 650 IOPS
Network SSD Storage multi queue IOPS performance 4k blocks - 6006 IOPS
Network SSD Storage maximum bandwidth on 4M blocks - 550 MiB/s
Local HMD storage single queue IOPS performance 4k blocks - 25000 IOPS
Local HMD storage multi queue IOPS performance 4k blocks - 270000 IOPS
Local HMD storage maximum bandwidth on 4M blocks - 1371 MiB/s
All the above tests were carried out on an 8 vCPU HMD VM. Multi-queue performance is very CPU dependent, as the number of vCPUs corresponds to the maximum number of storage and network operations; with a larger VM we would obtain better results for the network storage. For big blocks and high queue depth, the vCPU and network may be the limiting factor of IOPS/bandwidth performance, not the storage medium itself.
It is important to know that Ceph storage is designed in a way that practically eliminates the risk of data loss. In this case, natural disasters or human errors are more probable by a few orders of magnitude than any hardware failure leading to data corruption.
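To put the figures above in perspective, the following sketch computes how much faster local HMD storage is than network SSD storage and derives an approximate per-request latency from the single queue results. The latency estimate assumes that "single queue" means one outstanding request at a time, so that latency is roughly 1/IOPS; this is an assumption about the test setup, not something stated in the results.

```python
# 4k-block IOPS figures copied from the test results listed above.
RESULTS = {
    "read": {
        "Network HDD": {"single": 1120, "multi": 44000},
        "Network SSD": {"single": 1500, "multi": 47000},
        "Local HMD": {"single": 34500, "multi": 337000},
    },
    "write": {
        "Network HDD": {"single": 96, "multi": 2948},
        "Network SSD": {"single": 650, "multi": 6006},
        "Local HMD": {"single": 25000, "multi": 270000},
    },
}


def speedup(op, queue, fast="Local HMD", slow="Network SSD"):
    """Ratio of IOPS between two storage types for one operation and queue mode."""
    return RESULTS[op][fast][queue] / RESULTS[op][slow][queue]


def approx_latency_ms(iops):
    """Approximate latency in ms, assuming one outstanding request at a time."""
    return 1000.0 / iops


for op in ("read", "write"):
    for queue in ("single", "multi"):
        print(f"{op:5s} {queue:6s}: local HMD is {speedup(op, queue):.1f}x network SSD")

for name, figures in RESULTS["read"].items():
    print(f"{name}: ~{approx_latency_ms(figures['single']):.3f} ms per 4k read")
```

Under that assumption, the single queue numbers are mostly a measure of latency, while the multi queue numbers show how well each storage type handles parallel requests.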
Find out more in the recorded webinar "How to choose the right computing resources for your project on CREODIAS".
Cloud resource prices, in particular storage prices, are available in our price list.
VM related storage
Description
VM related storage is a fast SSD network storage connected to individual Virtual Machines. It is directly available to the VM without the need for mounting or connecting network shares. The quantity of VM related storage reserved for a VM depends on the VM Flavor selected.
Performance
VM related storage is fast – it is based on performant Solid State Drives.
Usage
VM storage can be used for fast, temporary or permanent data storage within a VM.
Provisioning
VMs come with VM storage included. The quantity of VM storage depends on the VM Flavor.
Limitations
VM storage is closely associated with a given VM which has exclusive access to this type of storage. Once the VM is terminated, its VM storage disappears.
VM ephemeral NVMe in HMD line
Description
VM ephemeral local NVMe storage is used only in HMD VMs (“D” stands for local disk), in which part of a physical, very fast NVMe drive is attached to one VM. The NVMe drive that hosts data for the HMD configuration is a single, very reliable, high-performance drive with MTBF>2M h and up to 400k IOPS. We designed it as ephemeral storage to expressly underline the danger of losing the data, and we encourage clients to back up the data stored on such a disk or to treat it as a cache only. Like VM related storage, it is a fast solid-state storage connected to individual Virtual Machines and is directly available to the VM without the need for mounting or connecting network shares. The quantity of storage reserved for a VM depends on the VM Flavor selected.
Performance
This storage can deliver up to 35k IOPS for read and 25k IOPS for write in a single queue. Multi-queue performance can reach about 350k IOPS for read and 250k IOPS for write. The bandwidth is about 3000 MiB/s for read and 1371 MiB/s for write.
Usage
This type of storage should be used in all scenarios that require very high performance. HMDs are designed for applications such as data processing requiring fast local cache, control nodes for Kubernetes clusters with fast storage for etcd, very fast data entry, handling events from IoT devices, very fast saving of calculation results, hosting nonrelational databases.
Provisioning
HMD VMs come with VM storage included. The quantity of VM storage depends on the VM Flavor.
Limitations
VM storage is closely associated with a given VM, which has exclusive access to this type of storage. Once the VM is terminated, its VM storage disappears. If the instance or the compute server on which it is running experiences a failure, is deleted, goes into an error state or needs to be moved to a different server, all the data may be lost. The data may be lost in any of the following events:
- Physical hard disk failure
- Server (hosting the instance) failure
- Instance terminations
- Instance failure or migration
- Server (hosting the instance) reboot
Therefore, do not rely on the instance for storing valuable, long-term data.
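One way to follow the advice to treat ephemeral storage as a cache only is sketched below: every write goes to persistent storage first, and the local NVMe directory only holds copies that can be rebuilt at any time. The directory layout and the function names are hypothetical; the demo at the end uses temporary directories in place of a real NVMe mount point and a real attached volume.

```python
import shutil
import tempfile
from pathlib import Path


def durable_write(name, data, cache_dir, persistent_dir):
    """Write to persistent storage first, then refresh the fast local copy."""
    for base in (persistent_dir, cache_dir):
        path = Path(base) / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)


def cached_read(name, cache_dir, persistent_dir):
    """Read from the local cache, repopulating it from persistent storage if needed."""
    cache_path = Path(cache_dir) / name
    if not cache_path.exists():
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(Path(persistent_dir) / name, cache_path)
    return cache_path.read_bytes()


if __name__ == "__main__":
    # Temporary directories stand in for the NVMe disk and a persistent volume.
    root = Path(tempfile.mkdtemp())
    nvme, volume = root / "nvme", root / "volume"
    durable_write("results/run1.bin", b"payload", nvme, volume)
    shutil.rmtree(nvme)  # simulate losing the ephemeral disk
    print(cached_read("results/run1.bin", nvme, volume))
```

With this pattern, losing the ephemeral disk costs only the time needed to refill the cache, not the data itself.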
Volume storage
Description
This type of storage consists of network Volume Storage that can be attached to VMs as block devices to dynamically extend their storage capabilities. Volumes are independent of VMs and can easily be moved from one VM to another. Users may take snapshots of volumes to be able to revert to their ‘frozen’ state later. Volume size is limited only by the available storage space, and volumes can be resized without unmounting. Volume-based VMs can easily be migrated between servers. Volume Storage can be encrypted if a User requests such an option. It is also possible to make a live backup copy of a volume.
Performance
Volume Storage is implemented as a distributed, redundant, highly available storage cluster with separate HDD and SSD tiers. The SSD tier provides high performance both in terms of transfer bandwidth and IOPS. The HDD tier provides cost-effective high capacity magnetic storage scalable to hundreds of Terabytes and beyond. Storage pools can be local to the computing resources or can be placed in remote locations (Warsaw WAW-2 or Frankfurt).
Usage
Volume Storage can be used as high capacity, high availability, scalable long term storage independent of VMs. SSD volumes should be selected for applications that require high performance in random access operations, such as databases and transactional systems.
HDD volumes are best used for high capacity file or media storage applications that are less demanding on random-access performance.
Both SSD and HDD volumes may be used as base/root storage for VMs.
Provisioning
Volume Storage can be provisioned from the Cloud Dashboard. Users can select the storage tier (SSD or HDD) and set up volume attributes such as name, description and size. They can also select whether the Volume Storage should be encrypted or not. Volumes can be created empty, as a copy of another volume or containing a bootable operating system image. Volume Storage can also be purchased in Fixed Term mode for longer periods of time.
Once a volume has been created, it can be attached to a running VM. If the volume contains an OS image, a Virtual Machine can be booted directly from it.
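Besides the Cloud Dashboard, the same create-and-attach steps can usually be scripted with the OpenStack command line client. The sketch below only builds the command lines; the volume type names ("ssd", "hdd"), the volume name and the server name are assumptions for illustration, so check the type names actually available in your project before using them.

```python
import shlex


def volume_create_command(name, size_gb, volume_type, description=None):
    """Build the argv for creating an empty volume of a given size and tier."""
    cmd = ["openstack", "volume", "create",
           "--size", str(size_gb),   # size in GB
           "--type", volume_type]    # storage tier; type names are an assumption
    if description:
        cmd += ["--description", description]
    return cmd + [name]


def volume_attach_command(server, volume):
    """Build the argv for attaching an existing volume to a running VM."""
    return ["openstack", "server", "add", "volume", server, volume]


# Hypothetical names used only for this example.
print(shlex.join(volume_create_command("data-01", 100, "ssd")))
print(shlex.join(volume_attach_command("my-vm", "data-01")))
```

After attaching, the volume appears in the VM as a new block device that still has to be partitioned, formatted and mounted by the operating system.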
Billing
Volume Storage is billed per GB of available storage space, per hour or per month (or longer period). It can be bought either in Per Usage mode or for Fixed Terms.
Object storage
Description
Object Storage is a scalable storage for objects/files with an HTTP REST interface. All object/file operations are performed via the REST API. Objects/files can be organized into buckets, which act like standard file directories. Users can also define access policies for buckets and objects/files. The API is compatible with Amazon S3, so existing AWS S3 tools can be used to manage objects/files and buckets on the storage. Users can also manage objects/files easily via the Cloud Dashboard. The storage is accessible from the public Internet and from VMs.
Usage
Object Storage can be used when communication between different Projects or Domains is required or when data is to be made available for the outside world via Internet.
Provisioning
Like Volume Storage.
Limitations
- It is highly advisable to put no more than 1 million (1 000 000) objects into one bucket (container). A larger number of objects makes listing them very inefficient.
- A single file should be no larger than 5 GB when using S3FS.
We suggest creating many buckets with a small number of objects each, instead of a small number of buckets with many objects.
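One way to spread a large data set over many buckets is to derive the bucket from a hash of the object key, as in the sketch below. The bucket naming scheme and the "mydata" prefix are hypothetical; only the limit of 1 000 000 objects per bucket comes from the recommendation above.

```python
import hashlib

MAX_OBJECTS_PER_BUCKET = 1_000_000  # advisory limit from the text above


def buckets_needed(total_objects, per_bucket=MAX_OBJECTS_PER_BUCKET):
    """Smallest number of buckets that keeps each one at or under the limit."""
    return max(1, -(-total_objects // per_bucket))  # ceiling division


def bucket_for_key(key, bucket_count, prefix="mydata"):
    """Map an object key to one of bucket_count buckets, deterministically."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:8], "big") % bucket_count
    return f"{prefix}-{shard:04d}"


count = buckets_needed(2_500_000)
print(count)
print(bucket_for_key("tiles/2024/01/scene-000123.tif", count))
```

Hashing spreads keys only approximately evenly, so in practice it is safer to plan for more buckets than the strict minimum. The same function must be used when reading, so that an object is looked up in the bucket it was written to.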
Backup Service
The CREODIAS Platform infrastructure consists of several Data Centers: the main T-Mobile Piekna DC and separate offsite locations, totally independent from the main DC, allowing secure offsite disk-based backups. All available locations are fully secure Tier III+ Data Centers connected by dedicated redundant Nx10 Gbps WAN connections.
Several services are available to perform backup functionalities:
- Volume Storage in the remote DC - The storage volume service offers HDD volumes located in the backup DC. Such volumes can be used by customers to run a custom backup mechanism of their choice.
- Backup of Volume Storage - Backups of User’s persistent volume data are performed using the OpenStack Cinder Backup module with the Ceph Backup driver.
The functionality offered includes:
- Full and incremental backups of selected volumes;
- User-scriptable filesystem and application quiescing for Linux and Windows guests to guarantee consistent backups at the filesystem and application level (guarantees consistent database backups);
- Snapshot-based backups of live-mounted volumes.
Backups can be launched manually from the Cloud Dashboard Volumes tab or programmatically via the REST API or OpenStack command line.
There is also a scheduler to allow for automated periodic backup schemes and backup rotation. Tenants are billed for the backup storage space used according to the Price List.
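For users who prefer to script backups themselves rather than use the scheduler, the sketch below shows the two pieces involved: building an OpenStack command line call that launches a backup, and a simple "keep the newest N" rotation rule. The volume and backup names are hypothetical, and the rotation rule is an illustration only, not a description of how the CREODIAS scheduler works.

```python
from datetime import datetime


def backup_create_command(volume, name, incremental=False, force=False):
    """Build the argv for launching a volume backup with the OpenStack client."""
    cmd = ["openstack", "volume", "backup", "create", "--name", name]
    if incremental:
        cmd.append("--incremental")  # copy only what changed since the last backup
    if force:
        cmd.append("--force")        # allow backing up a volume that is in use
    return cmd + [volume]


def backups_to_delete(backups, keep_last):
    """Return IDs of backups to remove, keeping the keep_last newest ones.

    backups is a list of (backup_id, created_at) tuples.
    """
    if keep_last < 0:
        raise ValueError("keep_last must not be negative")
    newest_first = sorted(backups, key=lambda item: item[1], reverse=True)
    return [backup_id for backup_id, _ in newest_first[keep_last:]]


# Hypothetical backup IDs and dates.
existing = [
    ("bk-1", datetime(2024, 1, 1)),
    ("bk-2", datetime(2024, 1, 2)),
    ("bk-3", datetime(2024, 1, 3)),
]
print(backup_create_command("data-01", "nightly", incremental=True, force=True))
print(backups_to_delete(existing, keep_last=2))
```

Note that incremental backups build on earlier ones, so a rotation scheme may need to remove the newest increments of a chain first or start a fresh full backup periodically.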
Requirements and limitations
The backup system allows for uninterrupted functioning of the User’s VMs, operating systems and applications.
A guest system agent will be preinstalled in every VM in order to allow for filesystem and application quiescing during backup. Standard quiescing consists of flushing all buffered data to disk before performing the snapshot necessary for data backup, and pausing the VM's write activities for the duration of the snapshot. Users may define custom quiescing activities to ensure backup consistency for their applications (e.g. databases). This may cause the VM being backed up to freeze for a few seconds.
Thanks to the use of incremental backups, only storage blocks that have changed since the previous backup need to be copied. Blocks are compressed before being stored. Together, this allows for efficient use of WAN backup bandwidth and storage space.
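The idea behind incremental, compressed backups can be illustrated with a small sketch: the volume is split into fixed-size blocks, each block is fingerprinted, and only blocks whose fingerprint differs from the previous backup are compressed and stored. This is a conceptual illustration only; the block size, the use of SHA-256 and zlib, and the function names are assumptions and do not describe the actual Ceph backup driver.

```python
import hashlib
import zlib

BLOCK_SIZE = 4 * 1024 * 1024  # hypothetical block size for this sketch


def block_digests(data, block_size=BLOCK_SIZE):
    """Fingerprint every fixed-size block of the data."""
    return [
        hashlib.sha256(data[start:start + block_size]).hexdigest()
        for start in range(0, len(data), block_size)
    ]


def incremental_backup(data, previous_digests, block_size=BLOCK_SIZE):
    """Return (changed_blocks, digests); changed blocks are zlib-compressed."""
    digests = block_digests(data, block_size)
    changed = {}
    for index, digest in enumerate(digests):
        is_new = index >= len(previous_digests)
        if is_new or previous_digests[index] != digest:
            start = index * block_size
            changed[index] = zlib.compress(data[start:start + block_size])
    return changed, digests


# Tiny demo with 4-byte blocks: only the middle block changes.
first = b"aaaabbbbcccc"
_, first_digests = incremental_backup(first, [], block_size=4)
second = b"aaaaXXXXcccc"
changed, _ = incremental_backup(second, first_digests, block_size=4)
print(sorted(changed))
```

In the demo, the first run stores all three blocks (a full backup), while the second run stores only the one block that changed.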
Billing
The Remote Storage and Backup as a Service services are billed according to GBytes used.
Internet Import-Export
Description
Simple network services (HTTP, FTP, SFTP etc.) run directly from the User's VMs are the simplest and most common mechanism for importing and exporting data between the User's environment and the external world. Additionally, the Object Storage with its REST interface (accessible from the Internet) can be used for this purpose.
Billing
The amount of data transferred to/from the Internet in all the above cases is measured and billed according to the appropriate Price List.