Land cover classification using remote sensing and AI/ML technology


Although the terms “land use” and “land cover” are frequently used interchangeably, each has a distinct definition. “Land cover” describes the material that covers the surface of the ground, such as vegetation, urban infrastructure, water, or bare soil. Land cover identification provides the baseline data for tasks like thematic mapping and change detection analysis. “Land use” describes the function that a piece of land serves, such as agriculture, wildlife habitat, or recreation.

Agronomists and agricultural organizations responsible for land management use accurate land cover information in a variety of applications, including hydrological modeling, environmental management plans, and effective infrastructure planning. In today’s rapidly urbanizing world, where “55% of the world’s population lives in urban areas, a proportion that is expected to increase to 68% by 2050” (World Urbanization Prospects, 2018), accurate and up-to-date land cover information becomes vital.

Up-to-date land cover information can support decision makers as they respond to the issues and challenges that hamper effective urban governance. This article covers EOfactory’s capability of using remote sensing and AI/ML technology for automatic land cover classification of satellite imagery.

Study area

In this article, two cities were selected as study areas for land cover classification: Berlin and Warsaw. Because Berlin is located in Germany and Warsaw is the capital city of Poland, the two study areas have different urban and vegetation characteristics. This lets us examine how EOfactory’s AI/ML model performs over two different areas.

Geographical overview of Warsaw, Poland

Geographical overview of Berlin, Germany

Data Sets and Pre-processing

Sentinel-2 images (with atmospheric correction applied) at 10-meter resolution were obtained for 6 cities in Germany, Poland, and the Philippines through EOfactory’s built-in integration with Sentinel Hub. We acquired 3 Sentinel-2 images from 2020 to 2022 for each of these cities: Hamburg, Berlin, Warsaw, Wroclaw, Manila, and Butuan. One challenge was obtaining cloud-free images of Manila and Butuan, since satellite imagery over this part of Asia frequently contains heavy cloud cover. To acquire an image under clear-sky conditions, it is generally recommended to look for images captured during the dry season.

Clipping and mosaicking are included in the download process when using EOfactory’s built-in Sentinel integration, so no additional steps are required once the images are downloaded. In total, 18 Sentinel-2 mosaics (3 for each of the 6 cities) were used in this process.

Some examples of Sentinel-2 mosaics of Berlin and Warsaw used for input datasets

EOfactory’s built-in color enhancement

We observed that the brightness and color contrast of an original Sentinel-2 image are not sufficient to differentiate individual objects, so the image colors need to be enhanced to obtain better contrast. EOfactory’s built-in integration enhances the image colors automatically. The figure below compares a Sentinel-2 image obtained directly from Copernicus Hub with an image obtained through EOfactory’s integration.

Comparison of color between the original Sentinel-2 image (left) and the image after applying EOfactory’s built-in color enhancement (right)
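The article does not say which enhancement EOfactory applies. As a rough illustration of how contrast can be improved, the sketch below uses a percentile-based linear stretch, a common generic technique that is assumed here and is not taken from the platform. The function names, the 2% and 98% cut-offs, and the sample values are all hypothetical.

```python
def percentile(sorted_values, pct):
    """Return the pct-th percentile (0-100) of a sorted list, with linear interpolation."""
    if not sorted_values:
        raise ValueError("no values to summarise")
    rank = (len(sorted_values) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def stretch_band(values, low_pct=2.0, high_pct=98.0):
    """Rescale one band to 0-255, clipping values outside the chosen percentiles."""
    ordered = sorted(values)
    low = percentile(ordered, low_pct)
    high = percentile(ordered, high_pct)
    if high <= low:
        # Flat band: there is no contrast to stretch.
        return [0 for _ in values]
    stretched = []
    for value in values:
        scaled = (value - low) / (high - low)
        scaled = min(1.0, max(0.0, scaled))  # clip outliers such as very bright pixels
        stretched.append(round(scaled * 255))
    return stretched


# Hypothetical values for one band; a real Sentinel-2 band holds millions of pixels.
band = [300, 320, 350, 400, 420, 450, 500, 520, 600, 4000]
print(stretch_band(band))
```

Clipping at percentiles rather than at the minimum and maximum keeps a few very bright pixels from compressing the rest of the image into a narrow, dark range. For a colour image, the same function would be applied to the red, green, and blue bands separately.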

Land Cover Classification

Training datasets

Training data is a key input for any AI/ML-powered analysis: the algorithm learns from this data and retains the information for future predictions. EOfactory has a built-in architecture to support and run AI/ML algorithms, and training data is the only source from which your AI model can gain useful information and make crucial decisions.

For this study, the classified land cover is categorized into 4 classes:

  • Built-up area
  • Vegetation
  • Water
  • Vacant land (bare earth)

For each class, an average of 800 training labels were collected across 3 years of Sentinel-2 images of Poland, Germany, and the Philippines. The training labels need to be diverse enough for the model to run on multiple images, not only on the images used as training input. We also observed that the training labels need to be spaced far enough apart to avoid confusion.
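One way to read the spacing requirement is as a minimum distance between label centres. The sketch below shows a simple greedy filter under that assumption. The label structure, the coordinates (taken to be in metres in a projected coordinate system), and the 50 m threshold are hypothetical and do not describe EOfactory's labelling tool.

```python
import math


def thin_labels(labels, min_distance_m):
    """Greedily keep labels whose centre is at least min_distance_m from every kept label."""
    kept = []
    for label in labels:
        far_enough = all(
            math.hypot(label["x"] - other["x"], label["y"] - other["y"]) >= min_distance_m
            for other in kept
        )
        if far_enough:
            kept.append(label)
    return kept


# Hypothetical label centres in metres.
labels = [
    {"cls": "water", "x": 0.0, "y": 0.0},
    {"cls": "vegetation", "x": 15.0, "y": 0.0},  # 15 m from the first label: too close
    {"cls": "built_up", "x": 100.0, "y": 0.0},
    {"cls": "vacant_land", "x": 100.0, "y": 80.0},
]
kept = thin_labels(labels, min_distance_m=50.0)
print([label["cls"] for label in kept])
```

The greedy pass depends on the order of the input, so a label that arrives early can exclude a later, better one. It is meant as a quick sanity check on label spacing and does not replace careful label selection.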

The figures below illustrate sample training labels for each class.

Example of training labels for vegetation class (left), built-up area (middle), water body (right)

It is important to have a variety of image inputs since each country has different weather, social, and economic conditions. For example, the water bodies in Germany look very dark on satellite images, while the rivers in Poland appear dark green. We also find that vegetation types differ between countries. Selecting multiple cities and countries allowed us to avoid imbalance and confusion between classes.

Result and analysis

The images below show the spatial patterns of land cover in Berlin and Warsaw in 2022, derived from EOfactory’s AI/ML classification algorithm after one week of model training. The classification accuracy was assessed by randomly generating sample points across the land cover map and comparing the land cover class beneath each point against reference data. Using the same model, we ran the classification over the Sentinel-2 Warsaw image from 2021 and compared the result against the European Space Agency (ESA) WorldCover V2 2021 data. The overall accuracy of the land cover classification is 86%.

Land cover classification map of Berlin, Germany

Land cover classification map of Warsaw, Poland
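The accuracy assessment described above can be sketched in a few lines. This is a minimal illustration and not EOfactory's implementation. It assumes the classified map and the reference map are already aligned on the same grid and use the same four class names. The function names and the tiny 3 x 3 grids are hypothetical.

```python
import random
from collections import Counter


def sample_points(n_rows, n_cols, n_points, seed=0):
    """Draw random (row, col) sample positions inside the map extent."""
    rng = random.Random(seed)
    return [(rng.randrange(n_rows), rng.randrange(n_cols)) for _ in range(n_points)]


def assess(predicted, reference, points):
    """Return overall accuracy and confusion counts keyed by (reference, predicted)."""
    if not points:
        raise ValueError("at least one sample point is required")
    confusion = Counter()
    for row, col in points:
        confusion[(reference[row][col], predicted[row][col])] += 1
    correct = sum(n for (ref, pred), n in confusion.items() if ref == pred)
    return correct / len(points), confusion


# Hypothetical 3 x 3 maps; real maps would come from the classified and reference rasters.
predicted = [
    ["built_up", "built_up", "water"],
    ["vegetation", "vegetation", "water"],
    ["vacant_land", "vegetation", "water"],
]
reference = [
    ["built_up", "built_up", "water"],
    ["vegetation", "built_up", "water"],
    ["vacant_land", "vegetation", "vegetation"],
]
points = sample_points(n_rows=3, n_cols=3, n_points=20)
accuracy, confusion = assess(predicted, reference, points)
print(f"overall accuracy: {accuracy:.0%}")
```

Keeping the confusion counts alongside the single accuracy figure shows which classes are mistaken for each other, for example built-up area predicted as vegetation.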

The figures below show the area covered by each class in the Berlin and Warsaw land cover classification results.

Land cover class distribution for Berlin, Germany (left) and Warsaw, Poland (right)
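Per-class areas like these can be derived by counting pixels, since each 10-meter Sentinel-2 pixel covers 100 square meters. The sketch below assumes the classified map is available as a grid of class names. The function names and the sample grid are hypothetical.

```python
from collections import Counter

PIXEL_SIZE_M = 10  # Sentinel-2 resolution used in this article


def class_areas_km2(class_map, pixel_size_m=PIXEL_SIZE_M):
    """Return the area of each class in square kilometres, based on pixel counts."""
    counts = Counter(cls for row in class_map for cls in row)
    pixel_area_km2 = (pixel_size_m ** 2) / 1_000_000
    return {cls: n * pixel_area_km2 for cls, n in counts.items()}


def class_shares(areas):
    """Return each class area as a fraction of the total mapped area."""
    total = sum(areas.values())
    return {cls: area / total for cls, area in areas.items()}


# Hypothetical 2 x 3 classified map.
class_map = [
    ["built_up", "built_up", "vegetation"],
    ["water", "vegetation", "vacant_land"],
]
areas = class_areas_km2(class_map)
for cls, share in class_shares(areas).items():
    print(f"{cls}: {areas[cls]:.4f} km2 ({share:.0%})")
```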

EOfactory’s on-the-fly technology

Conventional land cover digitization using a manual or semi-automated approach is time- and resource-intensive and does not scale. EOfactory allows users to build their own AI models and run them on the platform, and to automate other repeatable tasks such as downloading and pre-processing imagery. By operating in the cloud, we are able to scale up and deliver processing power and efficiency much faster than with traditional software. This technology enables faster decision-making and response to environmental issues.

Author: Ghalizha Zahra Izzaty, Skymap Global Partner Success Manager, EOfactory Trainer