Sample Deep Learning Workflow Using WAW3-1 vGPU and EO DATA on Creodias
This article assumes that you have access to CloudFerro WAW3-1 infrastructure, which has Kubernetes support built-in (OpenStack Magnum module).
If your CREODIAS account has access only to CF2 infrastructure, please contact support to get access to WAW3-1.
This is a demonstration of using Deep Learning tools (TensorFlow / Keras) for performing a custom image classification on vGPU in WAW3-1.
The use of (v)GPU will speed up the calculations involved in Deep Learning. In this example, it will cut down processing time from a couple of hours to a couple of minutes.
You can also install and use the model from this article within a Docker environment. For installation, see Install TensorFlow on Docker Running on Creodias WAW3-1 vGPU Virtual Machine; for usage, see Sample Deep Learning Workflow Using TensorFlow Running on Docker on Creodias WAW3-1 vGPU Virtual Machine.
The task you are going to apply TensorFlow to is recognizing which images are cropped and which are not. In the two sets of images shown below, the images on the left are not cropped, while those on the right are cropped. The model you are going to develop in this article will be able to reach (more or less) the same conclusions as humans would for the same set of pictures.

Satellite image processing is a discipline of its own, utilizing various techniques. This article concentrates on absolute basics just to demonstrate the concept and a possible workflow when using Deep Learning on a vGPU machine.
Also, the model developed here is just an example. Using it in production would require extensive further testing. The model is not deterministic and will produce different results with each training.
What We Are Going To Do
Give a thorough explanation of the Python code used in this process
Download data for testing and training
Copy the data and Python code to a GPU-enabled virtual machine
Install the dependencies required by the script (numpy and pandas)
Train and run the model against the data you downloaded from this article
Analyze the results
Benchmark the model on flavors vm.a6000.1 and vm.a6000.8 and show that on the latter, the process is up to five times faster
The Prerequisites
No. 1 Account
You need a Creodias hosting account with access to the Horizon interface: https://horizon.cloudferro.com/auth/login/?next=/.
No. 2 Create a new Linux VM with NVIDIA virtual GPU
You need to have a vGPU-enabled machine on the WAW3-1 cloud: How To Create a New Linux VM With NVIDIA Virtual GPU in the OpenStack Dashboard Horizon on Creodias
No. 3 Add a floating IP address to your VM
How to Add or Remove Floating IP’s to your VM on Creodias.
You will now be able to use that floating IP address for the examples in this article.
No. 4 TensorFlow installed
You also have to have TensorFlow installed. For instructions on how to do it, please refer to the following article: Install TensorFlow on WAW3-1 vGPU enabled VM on Creodias. This article is a direct continuation of that article.
No. 5 Ubuntu 20.04 on your local computer
This article assumes that your local computer is running on Ubuntu 20.04 LTS. You can, however, run this model from any other operating system provided you use the relevant commands for file operations, SSH access and so on.
Code Explanation
This section contains a detailed explanation of this process and the Python code used for it. For the practical steps allowing you to replicate this workflow, please start with the section Practical Workflow (see below). You will not have to copy the Python code yourself; a file containing it is available for download for your convenience.
Step 1: Data Preparation
Data Preparation is the fundamental (and usually most time-consuming) step in every Data Science related project. In order for our DL model to be able to learn, we will follow the typical supervised learning sequence:
Organize a sufficiently large sample of data (here: a sample of Sentinel-2 satellite images).
Tag/label this data manually (by a human). In our example, we manually separated the images into “edges” and “noedges” categories, representing cropped and non-cropped images respectively.
Put aside part of this data as the Train (+Validation) subset, which will be used to “teach” the model.
Put aside another subset as the Test set. This is a control subset that the model never sees during the learning phase; it will be used to evaluate model quality.
The downloadable .zip file found later in this article is a dataset already prepared according to these steps. It contains:
592 files in the Train/Validate set (50/50 cropped/non-cropped images)
148 files in the Test set (also 50/50 cropped/non-cropped).
Based on the folder and sub-folder names in this dataset, Keras will automatically infer the labels, so it is important to keep the folder structure as it is.
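Before running the ingestion code below, you can optionally verify the folder layout and image counts. This is a minimal sketch, not part of the downloadable script, and it assumes the archive unpacks into data/train and data/test, each with edges and noedges sub-folders, as the labels above imply:
import pathlib

# Count images per class folder; Keras infers the labels from these sub-folder names
for split in ("train", "test"):
    for label in ("edges", "noedges"):
        count = len(list(pathlib.Path(f"./data/{split}/{label}").glob("*")))
        print(split, label, count)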
The final step is performing the operations needed to turn the data into proper input for the model. TensorFlow will do a lot of this work for us. For example, using the image_dataset_from_directory function, each image is automatically labeled and converted to a tensor of numbers: height x width x depth (RGB layers).
For your specific use case, you might need to apply various additional data optimizations at this step.
import numpy as np
import tensorflow as tf
from tensorflow import keras
import pandas as pd
# DATA INGESTION
# -------------------------------------------------------------------------------------
# Ingest the Training, Validation and Test datasets from image folders.
# The labels (edges vs. noedges) are automatically inferred based on folder names.
train_ds = keras.utils.image_dataset_from_directory(
    directory='./data/train',
    labels='inferred',
    label_mode='categorical',
    validation_split=0.2,
    subset='training',
    image_size=(343, 343),
    seed=123,
    batch_size=8)
val_ds = keras.utils.image_dataset_from_directory(
    directory='./data/train',
    labels='inferred',
    label_mode='categorical',
    validation_split=0.2,
    subset='validation',
    image_size=(343, 343),
    seed=123,
    batch_size=8)
test_ds = keras.utils.image_dataset_from_directory(
    directory='./data/test',
    labels='inferred',
    label_mode='categorical',
    image_size=(343, 343),
    shuffle=False,
    batch_size=1)
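To confirm that the labels and tensor shapes are what the model expects, you can optionally inspect the datasets right after ingestion. This is a small check, not included in the downloadable script:
# Optional: verify the inferred class names and the shape of one training batch
print(train_ds.class_names)        # expected: ['edges', 'noedges']
for images, labels in train_ds.take(1):
    print(images.shape)            # (8, 343, 343, 3): batch x height x width x RGB
    print(labels.shape)            # (8, 2): batch x one-hot category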
Step 2: Defining and Training the Model
Defining an optimal model is the art and science of Data Science. What we are showing here is merely a simple sample model; you should consult other sources about creating models for real-life scenarios.
Once the model is defined, it gets compiled and its training begins. Each epoch is the next iteration of tuning the model. These epochs are complex and computationally heavy operations. Using a vGPU is fundamental for Deep Learning applications, as it enables distributing micro-tasks over hundreds and thousands of cores, thus speeding up the process immensely.
Once the model is fit, we will save it and reuse it for generating predictions.
# TRAINING
# -------------------------------------------------------------------------------------
# Build, compile and fit the Deep Learning model
model = keras.applications.Xception(
    weights=None, input_shape=(343, 343, 3), classes=2)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics='accuracy')
model.fit(train_ds, epochs=5, validation_data=val_ds)
model.save('model_save.h5')
# to reuse the model later:
#model = keras.models.load_model('model_save.h5')
Step 3: Generating Predictions for the Test Data
Once the model has been trained, generating the predictions is a simple and much faster operation. In our case we will use the model to generate predictions for the test data as outlined before.
# GENERATE PREDICTIONS on previously unseen data
# -------------------------------------------------------------------------------------
predictions_raw = model.predict(test_ds)
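As a side note (not part of the deeplearning.py script), if you only need an aggregate accuracy figure rather than per-image results, Keras can also evaluate the labeled test dataset directly:
# Optional: aggregate loss and accuracy on the test set
loss, accuracy = model.evaluate(test_ds)
print(f"Test accuracy: {accuracy:.2%}")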
Step 4: Summarizing the Results
In this step we take the actual labels (“edges” vs. “noedges”, which represent “cropped” vs. “non-cropped” images) and compare them to the labels which our model predicted.
We summarize the results in a DataFrame saved as a CSV file, which enables comparing the actual labels in the test set with the predictions provided by the model.
Example of CSV output:

# SUMMARIZE RESULTS (for convenience; alternative approaches are available)
# -------------------------------------------------------------------------------------
# initialize pandas dataframe with file paths
df = pd.DataFrame(data = {"file_path": test_ds.file_paths})
class_names = test_ds.class_names # ["edges","noedges"]
# add actual labels column
def get_actual_label(label_vector):
    for index, value in enumerate(label_vector):
        if value == 1:
            return class_names[index]
actual_label_vectors = np.concatenate([label for data, label in test_ds], axis=0).tolist() # returns array of label vectors [[0,1],[1,0],...] representing category (edges/noedges)
actual_labels = list(map(lambda alv: get_actual_label(alv), actual_label_vectors))
df["actual_label"] = actual_labels
# add predicted labels column
predictions_binary = np.argmax(predictions_raw, axis=1) # collapse each prediction vector to a 0/1 class index
predictions_labeled = list(map(lambda x: class_names[0] if x == 0 else class_names[1],list(predictions_binary)))
df["predicted_label"] = predictions_labeled
df.to_csv("results.csv", index=False)
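Optionally (this is not part of the downloadable script), the same DataFrame can also give you a quick overall accuracy figure before you open the CSV file:
# Optional: share of test images classified correctly
accuracy = (df["actual_label"] == df["predicted_label"]).mean()
print(f"Correctly classified: {accuracy:.2%}")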
Practical Workflow
This section contains the practical steps which allow you to perform this workflow yourself. It is just an example; you can design a different workflow of your own.
Please revisit the Prerequisites section before undertaking the practical steps below.
Step 1: Loading and Formatting the Data
If you are still connected to your virtual machine, disconnect from it using the following command:
exit
Open your web browser and download the needed files (data.zip with the dataset and deeplearning.py with the Python code):
Step 2: Copy the downloaded files to your virtual machine
For the purposes of this article, we will assume that your browser downloaded the above-mentioned files to the Downloads folder in your home directory. If your browser is configured differently, modify the instructions below accordingly.
Open your terminal on your local machine and make the Downloads folder your current working directory:
cd Downloads
Use the scp command to copy the files to your virtual machine (replace 64.225.129.70 with the floating IP address of your machine):
scp ./data.zip ./deeplearning.py eouser@64.225.129.70:/home/eouser/
Step 3: Install additional required packages
Reconnect to your virtual machine using SSH (replace 64.225.129.70 with the floating IP address of your machine):
ssh eouser@64.225.129.70
Reenter your Python virtual environment (the one you created while following the previous article, Install TensorFlow on WAW3-1 vGPU enabled VM on Creodias).
source tensorflowenv/bin/activate
Now, install the required libraries:
pip install numpy
pip install pandas
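Optionally, you can confirm at this point that TensorFlow detects the vGPU. This quick check is not required by the workflow; an empty list would suggest that the GPU setup from the prerequisite article is not active:
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"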
Step 4: Prepare the files
Create the folder in which you will place the files required for the deep learning process - for the purposes of this article it will be called deeplearning:
mkdir deeplearning
Move the required files to that folder:
mv deeplearning.py data.zip deeplearning
Make the deeplearning folder your current working directory:
cd deeplearning
Unzip the file containing the data:
unzip data.zip
You can now remove the original zip archive:
rm data.zip
Use the ls command to verify that your deeplearning folder contains the following files:
the data directory (the contents of the above-mentioned ZIP archive)
the text file deeplearning.py

Step 5: Execute the file with Python code
Now, execute the text file with the Python code using the following command:
python3 deeplearning.py
Once the operations have been completed, a text file results.csv (containing the results) and a file model_save.h5 (containing the newly trained model) should have appeared in your current working directory. Verify that they exist using the ls command.

Step 6: Download the results and the trained model to your local computer
In order to transfer those files to your local computer, first disconnect from your virtual machine using the following command:
exit
Now, make the folder into which you want to download those files your current working directory and make sure that it does not yet contain a file named results.csv or model_save.h5. You can now download the files using the following command (replace 64.225.129.70 with the floating IP address of your virtual machine):
scp eouser@64.225.129.70:/home/eouser/deeplearning/results.csv eouser@64.225.129.70:/home/eouser/deeplearning/model_save.h5 .
Your files should now be on your local machine:

Performance Comparison
In this article, TensorFlow is installed directly on a virtual machine in the WAW3-1 cloud. There is a parallel article with the same example running on Docker (Sample Deep Learning Workflow Using TensorFlow Running on Docker on Creodias WAW3-1 vGPU Virtual Machine), and we are now going to compare running times from these two environments, using the smallest and the biggest vGPU flavors, vm.a6000.1 and vm.a6000.8.
The table below shows how long the execution of the Python code took, measured using the time command (the “real” value). All tests were performed on the Creodias WAW3-1 cloud.
|                 | vm.a6000.1 | vm.a6000.8 |
| Docker used     | 5m50.449s  | 1m14.446s  |
| Docker not used | 5m0.276s   | 0m55.547s  |
The whole process was thus about five times faster on the vm.a6000.8 flavor than on the vm.a6000.1 flavor. There is a small penalty when using Docker, but that is expected.
This benchmark counts all phases of the execution of the Python code and not all of them may benefit from better hardware to the same degree.
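If you want to reproduce the measurement on your own flavor, the run time can be captured in the same way, for example (assuming you are in the deeplearning directory with the virtual environment active):
time python3 deeplearning.py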
What Can Be Done Next
You can also perform this workflow inside a Docker container running on the Creodias WAW3-1 cloud. If you want to do so, please see the following article: Sample Deep Learning Workflow Using TensorFlow Running on Docker on Creodias WAW3-1 vGPU Virtual Machine.
Warning
The samples in this article might be non-representative and your mileage may vary. Use this code and the entire article as a starting point to conduct your own research.