
CreoDias Data Finder API


Using the HTTP API to query EO Data Finder

All queries may be executed as simple HTTP GET calls: by typing the query in the address line of a web browser, by using any HTTP client (e.g. curl or wget), or from inside a user's program. The database is free and open to anonymous access by everyone; no authorization is used. It may be accessed both from the internal network (virtual machines in CreoDias) and from outside, e.g. your home computer. Note that the actual EO data are restricted to CreoDias users; only the catalogue (EO Data Finder) is open.

For example, you may get the 10 most recent products from Poland using the command:

$ wget -O - ""

Or just type the URL in the address line of your browser:


General Rules

Queries may return their results in either JSON or XML format. To select the format, use the corresponding engine in the URL: - for JSON - for XML response

By default the responses are formatted in a compact form for machine interpretation. If you want the output formatted with spaces and newlines in human-readable form, add _pretty=true to the query (as in the examples above).

Responses as plain-text lists of products are not implemented directly yet. If you need just such a list, without the additional information provided by JSON, the solution is to filter the formatted JSON response. For example, the following filtered command returns the list of names of the 1000 most recent Sentinel-1 products from Poland.

$ wget -O - "" | grep "productIdentifier" | egrep -o '\\[^"]++' | sed 's/\\//g'
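The same filtering can be done inside a program by parsing the JSON instead of matching text. The sketch below is only an illustration: the layout of the response is not described in this guide, so the function assumes nothing beyond the productIdentifier field name and simply walks the whole structure. The sample document and its paths are hypothetical placeholders, not real catalogue output.

```python
import json


def collect_product_identifiers(node):
    """Recursively collect every string stored under a 'productIdentifier' key."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "productIdentifier" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(collect_product_identifiers(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_product_identifiers(item))
    return found


# Hypothetical, heavily shortened response used only to exercise the function.
sample_text = r'''
{"features": [
  {"properties": {"productIdentifier": "\/placeholder\/product-A"}},
  {"properties": {"productIdentifier": "\/placeholder\/product-B"}}
]}
'''
identifiers = collect_product_identifiers(json.loads(sample_text))
print("\n".join(identifiers))
```

The JSON parser decodes escaped slashes by itself, so no separate clean-up step is needed.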

Most queries (except for “natural language” ones) are case-sensitive.


Product metadata

The metadata are available for all products in the form provided by the original data publishers. For example, for Sentinel-2 products, the main metadata of each product are available in XML form as

where the path to the product and its name are taken from the productIdentifier field of a search response (obtained as above).

All product metadata are available free and anonymously.



Collections

The data are organized in so-called collections, corresponding to various satellites. A query may search data in all collections, or in one particular collection only. If only one satellite is of interest, the second approach is faster and more efficient than filtering the general query. For example, to find the 10 most recent Sentinel-2 products from Poland, the query should look like:

while if the collection field is missing in the URL, the products from all the satellites are returned:


As of today, the following collections are defined and may be used:








but in the future additional collections will be defined; in particular, Sentinel3 will be implemented soon. The current list of available collections (satellites) may be obtained as

Note that collection names differ slightly from the satellite names used in the EO Data repository. For example, the collection is named Sentinel2, while in the repository its data are located within the /eodata/Sentinel-2/.... branch of the repository tree.


Output sorting and limiting

By default, at most the 10 most recently published products are returned. You may change the limit with the maxRecords phrase, e.g. maxRecords=20 (beware of long execution times for queries returning thousands of products).



If the query is very general and the number of matching products is large, the next pages of products may be retrieved using

startIndex=nnn or startPage=nnn


You may also change the order in which the products are presented, using the sortParam phrase. For example,

sortParam=startDate

will sort the output by observation date rather than by publication date. The following orderings are implemented:

published - the date when the product got published in our repository (default)

startDate - the date when the observation was made

cloudCover - the cloudiness of the scene

each of them may be accompanied by

sortOrder=ascending or sortOrder=descending


For example, the query July 2016&sortParam=cloudCover&maxRecords=20

will return the 20 least cloudy products from Poland from July 2016, while the next query would return the next 20, slightly cloudier, products: July 2016&sortParam=cloudCover&maxRecords=20&startIndex=21

The last sub phrase, which selects the next page of results, may also be written as startPage=2
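The relation between pages and startIndex in the example above (page 2 of 20 records starts at record 21) can be written down as a small helper. This is a minimal sketch: the function names are hypothetical, and it assumes that startIndex is 1-based, as the example suggests. Only the parameter names sortParam, sortOrder, maxRecords and startIndex come from this guide.

```python
def start_index_for_page(page, max_records):
    """Return the 1-based startIndex of the first record on the given page."""
    if page < 1 or max_records < 1:
        raise ValueError("page and max_records must both be at least 1")
    return (page - 1) * max_records + 1


def paging_phrases(page, max_records, sort_param=None, sort_order=None):
    """Join the sorting and paging sub phrases with '&'."""
    start = start_index_for_page(page, max_records)
    phrases = []
    if sort_param is not None:
        phrases.append(f"sortParam={sort_param}")
    if sort_order is not None:
        phrases.append(f"sortOrder={sort_order}")
    phrases.append(f"maxRecords={max_records}")
    if start > 1:
        # The first page needs no startIndex at all.
        phrases.append(f"startIndex={start}")
    return "&".join(phrases)


second_page = paging_phrases(page=2, max_records=20, sort_param="cloudCover")
print(second_page)
```

The resulting string is meant to be appended to the rest of a query with another & separator.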


Proper query

The proper query is written as a sequence of sub phrases separated by &. The result is the conjunction of all sub phrases; a logical alternative (OR) cannot be expressed within a query. The query may be specified in two ways: as a natural language query or as a formal one. Mixing both is also possible.


An example of a natural language query (all Sentinel-2 products from Warsaw from Summer 2016): Summer 2016


An example of a formal query giving virtually the same results:


An example of a mixed query, for nearly cloudless (cloud cover of no more than 10%) products from Warsaw from Summer 2016: Summer 2016&cloudCover=[0,10]
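When queries are assembled inside a program, the sub phrases can be joined and percent-encoded in one step. The following sketch uses Python's standard urllib.parse module; the helper names are hypothetical, the wording of the natural language part is an assumption based on the description above, and leaving the brackets and comma unencoded is a choice made here for readability, not a documented requirement of the service.

```python
from urllib.parse import quote, urlencode


def format_value(value):
    """Render a scalar as-is and a (min, max) pair in the [min,max] range form."""
    if isinstance(value, (tuple, list)):
        low, high = value
        return f"[{low},{high}]"
    return str(value)


def build_query(natural=None, **formal):
    """Combine an optional natural language part with formal param=value phrases."""
    phrases = {}
    if natural:
        phrases["q"] = natural
    for name, value in formal.items():
        phrases[name] = format_value(value)
    # quote() encodes spaces as %20; "[],“ characters are kept readable.
    return urlencode(phrases, quote_via=quote, safe="[],")


mixed_query = build_query("Warsaw Summer 2016", cloudCover=(0, 10))
print(mixed_query)
```

Because every sub phrase is combined with the others by conjunction, the order in which the parameters are passed does not change the meaning of the query.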


Natural language queries

They have the form q=natural language question. The question formulated in natural language is interpreted in the simplest way, as a conjunctive set of keywords. Only known keywords are recognised; all other words are ignored. The keywords are not case-sensitive and often allow alternative spellings, e.g. you may specify the name of the city as Gdańsk (using the Polish accented letter ‘ń’) or as gdansk (in the plain Latin alphabet).


The answer reflects the conjunction of all matching keywords. The possible keywords are:

  • city names, in English spelling if one exists, like Warsaw, but in local spelling if no English one is in use, like Łódź (or Lodz)
  • country names, like Poland
  • region names, like Tirol
  • seasons, month names, days of week, years (4 digits), day of month (1 or 2 digits)
  • terrain features: land, water, forest, etc. Only land/water selection is implemented globally.


An example of a query finding Landsat-8 scenes from the town of Otwock on 28 June 2016: 28 June 2016

Note that empty responses may result both from a lack of matching records and from misspelled queries; the two causes are indistinguishable. For example, if you query for products from Otwock on 14 June in the same way, the result will be empty (no such products are available in our repository). But you cannot tell from the answer whether the empty response was caused by a lack of data from 14 June or by Otwock being too small a town to have its name recognised.


Formal queries

The queries have the form param=value or param=[minvalue,maxvalue]. Most parameters are common to all collections, but some are specific to a few of them (e.g. cloudCover applies to optical satellites, while polarisation applies to radar ones) or to just a single one.


Geography and time-frame

The common set of parameters is:

startDate, completionDate - the date limits of the observation. The time may also be specified, e.g. 2016-10-01T21:30:00Z. The format is as defined by RFC 3339.

publishedAfter, publishedBefore - the date limits when the product was published in our repository

lon, lat - geographical position, expressed in military style (EPSG:4326, as decimal fractions of degrees, positive for eastern longitude and northern latitude)

geometry - region of interest, defined as WKT string (POINT, POLYGON, etc.)

box - region of interest, defined as the rectangle with given (west,south,east,north) values

name - region of interest, given as name (city name, country name, etc., the same as in natural language queries)
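The values of these parameters are easy to get subtly wrong (timestamp layout, coordinate order), so it can help to generate them in code. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the comma-separated form of the box value is assumed from the (west,south,east,north) description above rather than confirmed by this guide.

```python
from datetime import datetime, timezone


def rfc3339_utc(moment):
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def box_phrase(west, south, east, north):
    """Build a box phrase; the comma-separated value syntax is an assumption."""
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must lie between -180 and 180 degrees")
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
    return f"box={west},{south},{east},{north}"


def point_wkt(lon, lat):
    """Return a WKT POINT; WKT lists longitude (x) before latitude (y)."""
    return f"POINT({lon} {lat})"


start = rfc3339_utc(datetime(2016, 10, 1, 21, 30, tzinfo=timezone.utc))
print(f"startDate={start}")
print(box_phrase(14.0, 49.0, 24.2, 54.9))
print(point_wkt(21.0, 52.2))
```

The coordinates used here are arbitrary sample values chosen only to show the order of the arguments.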


A region of interest may also be specified by a KML file attached to an HTTP POST query, e.g.

curl -X POST 2016 --data @file/my.kml

will find all Sentinel-2 data from Summer 2016 from the area described by the KML file my.kml. You may also use Geo-JSON formatted files instead of KML.

Note that KML and Geo-JSON queries are restricted to simply connected areas (simple polygons).
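Before sending a Geo-JSON file, a program can check that the geometry is a single polygon without holes. The sketch below relies only on the GeoJSON convention that a Polygon is a list of closed rings, the first being the outer boundary and any further ones being holes. The function name is hypothetical, it expects a bare geometry object rather than a whole feature, and it does not detect self-intersecting outlines.

```python
def is_simple_polygon(geometry):
    """Return True for a GeoJSON Polygon with one closed ring and no holes."""
    if geometry.get("type") != "Polygon":
        return False
    rings = geometry.get("coordinates", [])
    if len(rings) != 1:
        # More than one ring means the polygon has holes.
        return False
    ring = rings[0]
    # A closed ring has at least four positions, the first repeated at the end.
    return len(ring) >= 4 and ring[0] == ring[-1]


square = {
    "type": "Polygon",
    "coordinates": [[[20.0, 52.0], [21.0, 52.0], [21.0, 53.0], [20.0, 53.0], [20.0, 52.0]]],
}
with_hole = {
    "type": "Polygon",
    "coordinates": [
        square["coordinates"][0],
        [[20.4, 52.4], [20.6, 52.4], [20.6, 52.6], [20.4, 52.4]],
    ],
}
print(is_simple_polygon(square), is_simple_polygon(with_hole))
```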


Terrain features

Some terrain features may be checked in the query. The following are defined:

waterCover - percentage of the scene covered by water, e.g. waterCover=[5,100] selects only scenes in which water (e.g. sea) covers at least 5% of the area;








Only the water/land distinction is well defined for all areas, and even there smaller inland waters are sometimes still marked as "land". The accuracy and availability of the other distinctions depend on the region, and they should generally be treated as a hint only.


Volatile features

Some terrain-like feature masks are not permanent but describe a single scene only. The most commonly used such feature is cloudiness, or cloudCover, which is defined for most products coming from optical sensors. For example:

cloudCover=[0,10]

selects only those scenes that are covered by clouds by no more than 10%.

Caution: to be meaningful, the cloudiness must be provided with each product, but in many products it is missing. If the cloudiness is unknown for a scene, it is marked by a value of 0 or -1. cloudCover=0 is therefore ambiguous: it may mean either a totally cloudless sky or a cloudy scene for which cloud cover was not estimated during the original data processing.
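One cautious way to handle this ambiguity on the client side is to treat both 0 and -1 as "unknown" and keep only scenes with a positive, reliable estimate. This is a sketch of one possible policy, not a recommendation from the data provider; the function names and the flat product dictionaries are hypothetical and do not reflect the actual response layout.

```python
def trusted_cloud_cover(value):
    """Return the cloud cover as a float, or None when it may be a placeholder."""
    if value is None or value <= 0:
        # 0 and -1 may both stand for "not estimated".
        return None
    return float(value)


def reliably_clear(products, limit):
    """Keep products whose cloud cover is known and does not exceed the limit."""
    kept = []
    for product in products:
        cover = trusted_cloud_cover(product.get("cloudCover"))
        if cover is not None and cover <= limit:
            kept.append(product)
    return kept


# Hypothetical records used only to exercise the filter.
products = [
    {"name": "A", "cloudCover": 0},
    {"name": "B", "cloudCover": -1},
    {"name": "C", "cloudCover": 4.5},
    {"name": "D", "cloudCover": 37},
]
print([p["name"] for p in reliably_clear(products, limit=10)])
```

The price of this policy is that genuinely cloudless scenes reported as 0 are discarded too; whether that is acceptable depends on the application.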

Other such volatile features are:




They are defined and supported by EO Data Finder, but the corresponding data masks are properly provided for very few products only.


Satellite features

instrument - meaningful only for satellites equipped with multiple instruments. The possible values are satellite specific.

productType - the actual types possible are specific for every satellite. For Sentinel-1 data you may select for example productType=GRD for products processed in standard mode for land observations.

sensorMode - also satellite and sensor specific. E.g. (for Sentinel-1):


orbitDirection - ascending or descending. For most heliosynchronous satellites, descending orbits mean day scenes, while ascending ones mean night scenes. For many optical satellites (e.g. Sentinel-2) only day scenes are published.


Some additional parameters are strictly satellite-specific, e.g. polarisation, which is defined only for Sentinel-1.

For every satellite (collection), its set of queryable parameters may be obtained by a query like:

The resulting XML file provides the full list of parameters for the collection, with very brief descriptions.


Alternative queries

A logical alternative (OR) cannot be formulated in a single query. Such alternatives must be implemented as a sequence of queries whose results are concatenated. For example, the following script returns the list of Sentinel-1 products from Summer 2016 from England and Ireland:

wget -O - " summer 2016" > /tmp/x

wget -O - " summer 2016" >> /tmp/x

cat /tmp/x | grep "productIdentifier" | egrep -o '\\[^"]++' | sed 's/\\//g' | sort -u

rm -f /tmp/x
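Inside a program, the same "concatenate, sort, remove duplicates" step needs no temporary file. The sketch below assumes that the product names of each separate query have already been collected into lists; the function name and the sample names are hypothetical placeholders.

```python
def union_of_results(*result_lists):
    """Merge several lists of product names, dropping duplicates and sorting."""
    return sorted(set().union(*result_lists))


# Placeholder names standing in for the results of two separate queries.
first_query = ["product-C", "product-A"]
second_query = ["product-B", "product-A"]
merged = union_of_results(first_query, second_query)
print("\n".join(merged))
```

A product whose footprint matches both queries appears in both result lists, which is why the duplicates have to be removed.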


Interactive generation of queries

If you use the interactive (browser-based) EO Data Finder

every query executed by it is displayed in a “query” field in its left panel. It may be copied from there and then used in API programming. If you are logged in to the EO Data Finder as a registered user, the history of recently executed queries is also available and may be used for API access. Not all API features can be produced interactively, but all interactively generated queries are valid for the API.

For beginners it is often a good idea to take an interactively generated query and then modify it as needed, rather than writing queries from scratch.


User’s Guide  Ver. 1.0 2018-06-01