Automatic Download (using python)?

27 May 2020 2:04
Hey Guys,
I am trying to download a lot of data that I can get using the finder-API (https://finder.creodias.eu/resto/api/collections/Sentinel1/search.json?maxRecords=2000&startDate=2018-09-01T00:00:00Z&completionDate=2019-04-01T23:59:59Z&productType=GRD&sensorMode=IW&sortParam=startDate&sortOrder=descending&status=all&geometry=POLYGON((25.471802+58.031372000000005,25.471802+58.81658600000003,26.872559000000003+58.81658600000003,26.872559000000003+58.031372000000005,25.471802+58.031372000000005))&dataset=ESA-DATASET).

I can download all the scenes manually by clicking them in the finder and saving them, but this would take a lot of time and feel dumb ;). Is there a way to download the images automatically, ideally using Python (comparable to sentinelsat for the Copernicus hub)?

Today is my first contact with CREODIAS, so I might just need a hint in the right direction, because I haven't found the right information so far.

EDIT: I managed to download single files using requests.get(), but unfortunately the token keeps changing, so I need to find a way to create one on a Windows machine.

Thanks,
Felix

RE: Automatic Download (using python)?
3 June 2020 2:28 as a reply to Felix Keßler.
Hello Felix,

We've made a Python script for you. Just update the finder_api_url, username and password variables in the script below:

import re
import sys

import requests
from tqdm import tqdm


def get_keycloak_token(username, password):
    """Request a fresh access token for the given CREODIAS account."""
    h = {
        'Content-Type': 'application/x-www-form-urlencoded'
    }
    d = {
        'client_id': 'CLOUDFERRO_PUBLIC',
        'password': password,
        'username': username,
        'grant_type': 'password'
    }
    resp = requests.post('https://auth.creodias.eu/auth/realms/dias/protocol/openid-connect/token', data=d, headers=h)
    try:
        token = resp.json()['access_token']
    except (ValueError, KeyError):
        print("Can't obtain a token (check username/password), exiting.")
        sys.exit(1)
    return token


finder_api_url = '''https://finder.creodias.eu/resto/api/collections/Sentinel1/search.json?maxRecords=5&startDate=2018-09-01T00:00:00Z&completionDate=2019-04-01T23:59:59Z&productType=GRD&sensorMode=IW&sortParam=startDate&sortOrder=descending&status=all&geometry=POLYGON((25.471802+58.031372000000005,25.471802+58.81658600000003,26.872559000000003+58.81658600000003,26.872559000000003+58.031372000000005,25.471802+58.031372000000005))&dataset=ESA-DATASET'''
username = 'creodias_username_here'
password = 'creodias_password_here'

response = requests.get(finder_api_url)
for feature in response.json()['features']:
    # The token changes over time, so request a new one for every product.
    token = get_keycloak_token(username, password)
    download_url = feature['properties']['services']['download']['url']
    download_url = download_url + '?token=' + token
    title = feature['properties']['title']
    # Fallback name, used if the server does not send one.
    filename = title + '.zip'
    r = requests.get(download_url, stream=True)
    if "Content-Disposition" in r.headers:
        filename = re.findall("filename=(.+)", r.headers["Content-Disposition"])[0].strip('"')
    # Total size in bytes, as reported by the server.
    total_size = int(r.headers.get('content-length', 0))
    if total_size <= 100:
        # Such a small response is assumed to be an error message, not a product.
        print(r.text)
        sys.exit("Please try again in few moments.")
    block_size = 1024  # 1 kibibyte
    print('downloading:', filename)
    t = tqdm(total=total_size, unit='iB', unit_scale=True)
    with open(filename, 'wb') as f:
        for data in r.iter_content(block_size):
            t.update(len(data))
            f.write(data)
    t.close()
    # Compare the number of bytes written with the announced size.
    if total_size != 0 and t.n != total_size:
        print("ERROR, something went wrong")

Keep in mind that a single query returns at most 2000 products, so that is also the most this script can download in one run. To get past this limit, add '&page=<page_number>' to the finder_api_url and increment <page_number> after each run of the script.
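The paging step could also be automated instead of editing the URL by hand. The following is only a minimal sketch and not part of the script above: page_url, iter_features, fetch_json and max_pages are hypothetical names, and it assumes that pages are numbered from 1 and that a page past the end returns an empty 'features' list. It uses only the standard library so it can be tried without extra packages.

```python
import json
from urllib.request import urlopen


def page_url(finder_api_url, page_number):
    """Append the '&page=<page_number>' parameter to a finder query URL."""
    return f"{finder_api_url}&page={page_number}"


def iter_features(finder_api_url, fetch_json=None, max_pages=50):
    """Yield features page by page until a page comes back empty.

    fetch_json is a hypothetical hook (URL -> parsed JSON dict) so the loop
    can be exercised without network access. max_pages is an arbitrary
    safety limit chosen for this sketch, not a value defined by the API.
    """
    if fetch_json is None:
        def fetch_json(url):
            # Default: plain HTTP GET, response body parsed as JSON.
            with urlopen(url, timeout=60) as resp:
                return json.load(resp)

    for page_number in range(1, max_pages + 1):
        page = fetch_json(page_url(finder_api_url, page_number))
        features = page.get('features', [])
        if not features:
            # Assumption: an empty page means there are no more results.
            break
        yield from features
```

Each yielded feature could then be handed to the download loop in place of the single-page 'features' list.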


Regards,
Marcin

RE: Automatic Download (using python)?
4 June 2020 5:18 as a reply to Marcin Szreder.
Hello Marcin,
Thanks a lot for this script. Right now it is not working for me: I am not registered at Creodias itself, but I can access the data by authenticating via Code-DE, so my username/password combination is not valid for Creodias.
Is it possible to update the code so that it works in this case as well?

Thanks a lot for your work!
Felix

RE: Automatic Download (using python)?
15 June 2020 7:15 as a reply to Felix Keßler.
Hey Marcin,
I was wondering if there is an update or a workaround you could provide for my problem?

RE: Automatic Download (using python)?
15 June 2020 8:17 as a reply to Felix Keßler.
Hello

Sorry for the late reply. Below is a modified script for use with the CODE-DE platform.


import re
import sys

import requests
from tqdm import tqdm


def get_keycloak_token(username, password):
    """Request a fresh access token for the given CODE-DE account."""
    h = {
        'Content-Type': 'application/x-www-form-urlencoded'
    }
    d = {
        'client_id': 'FINDER',
        'password': password,
        'username': username,
        'grant_type': 'password'
    }
    resp = requests.post(
        'https://auth.cloud.code-de.org/auth/realms/code-de/protocol/openid-connect/token', data=d, headers=h)
    try:
        token = resp.json()['access_token']
    except (ValueError, KeyError):
        print("Can't obtain a token (check username/password), exiting.")
        sys.exit(1)
    return token


finder_api_url = '''https://finder.code-de.org/resto/api/collections/Sentinel2/search.json?maxRecords=10&startDate=2020-05-28T00:00:00Z&completionDate=2020-05-29T23:59:59Z&location=all&sortParam=startDate&sortOrder=descending&status=all&geometry=POLYGON((12.209161052771645+51.41610114539586,12.207943043941432+51.242201496186965,12.549586691921304+51.243726461021595,12.548368729553374+51.41496171793736,12.209161052771645+51.41610114539586))&dataset=ESA-DATASET'''
username = 'code_de_username_here'
password = 'code_de_password_here'
response = requests.get(finder_api_url)
for feature in response.json()['features']:
    # The token changes over time, so request a new one for every product.
    token = get_keycloak_token(username, password)
    download_url = feature['properties']['services']['download']['url']
    download_url = download_url + '?token=' + token
    title = feature['properties']['title']
    # Fallback name, used if the server does not send one.
    filename = title + '.zip'
    r = requests.get(download_url, stream=True)
    if "Content-Disposition" in r.headers:
        filename = re.findall(
            "filename=(.+)", r.headers["Content-Disposition"])[0].strip('"')
    # Total size in bytes, as reported by the server.
    total_size = int(r.headers.get('content-length', 0))
    if total_size <= 30:
        # Such a small response is assumed to be an error message, not a product.
        print("Please try again in few moments.")
        sys.exit(1)
    block_size = 1024  # 1 kibibyte
    print('downloading:', filename)
    t = tqdm(total=total_size, unit='iB', unit_scale=True)
    with open(filename, 'wb') as f:
        for data in r.iter_content(block_size):
            t.update(len(data))
            f.write(data)
    t.close()
    # Compare the number of bytes written with the announced size.
    if total_size != 0 and t.n != total_size:
        print("ERROR, something went wrong")

RE: Automatic Download (using python)?
17 June 2020 8:45 as a reply to Marcin Szreder.
Hey Marcin,
I think I found the reason why it is not working. I think you misunderstood what I wanted to achieve (hopefully it is actually possible):
I am registered at CodeDE and can use this account to "log in" to the Creodias Finder and download data. CodeDE has some data stored, but unfortunately only for Germany. Since I want to use data outside of Germany, I wanted to download the data from Creodias.
If I go to https://finder.creodias.eu/ and search for data, I can log in using the "Federated Identity Provider" tab and selecting CodeDE there. Once I am logged in, I can download the data manually.
The code you provided automates the download when I have a CodeDE API query. But I want to download from Creodias using my CodeDE login data.

I hope this is more precise now and you get what I want. I think every time your code worked, it was because I was using your API query, which I did not realize was from CodeDE.

Thanks again!

RE: Automatic Download (using python)?
22 June 2020 11:59 as a reply to Felix Keßler.
Hey Marcin,
I wanted to know if there is an update for my problem?

Best,
Felix

RE: Automatic Download (using python)?
25 June 2020 8:39 as a reply to Felix Keßler.
Hello,
Sorry for the late reply. There's still no update. I've been encountering some technical issues on my side, but I will try to provide a solution for your problem soon.

Regards,
Marcin

RE: Automatic Download (using python)?
26 June 2020 6:21 as a reply to Marcin Szreder.
Thanks for the No-Update-Update! Still happy to know that you at least think it is possible!

RE: Automatic Download (using python)?
22 July 2020 6:16 as a reply to Felix Keßler.
Hey Marcin,
I am still waiting for an update. Did you manage to find a solution by now?

RE: Automatic Download (using python)?
13 August 2020 9:34 as a reply to Felix Keßler.
Hi,
Sorry for the late reply. Unfortunately, I couldn't find a solution for this problem.

Regards,
Marcin

RE: Automatic Download (using python)?
26 July 2021 6:18 as a reply to Marcin Szreder.
I've also taken advantage of this great sample script. Thank you so much for providing this!

RE: Automatic Download (using python)?
7 September 2022 12:27 as a reply to Marcin Szreder.
Hi,

I also need to download several hundred images from CREODIAS, and I am looking for a way to do this with the API and Python. I am trying to use the code suggested in this post. I have specified my user ID and password, replaced the finder_api_url with the one for my search, and run it. I am getting the error below. Any idea how to solve it?

ConnectionError: HTTPSConnectionPool(host='finder.creodias.eu', port=443): Max retries exceeded with url: /resto/api/collections/Sentinel2/search.json?maxRecords=2000&startDate=2022-06-01T00%3A00%3A00Z&completionDate=2022-08-31T23%3A59%3A59Z&processingLevel=LEVEL1C&geometry=POLYGON((14.573304590504197+52.58552883837547%2C14.683963858632376+52.536196192387564%2C14.647077435922986+52.42387013063967%2C14.573304590504197+52.58552883837547))&sortParam=startDate&sortOrder=descending&status=all&dataset=ESA-DATASET (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000001C30B5C0610>: Failed to establish a new connection: getaddrinfo failed'))

RE: Automatic Download (using python)?
7 September 2022 6:08 as a reply to Miguel Castro.
The error message states that you have exceeded the maximum number of entries. Currently it appears to be set to maxRecords=2000. I have previously circumvented this issue by setting it to below 1000 and shortening the date range. You can then simply re-run the query on the next set of dates after that 'batch' has been downloaded.
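Splitting a long period into shorter date ranges can be sketched like this. It is only an illustration: date_windows and its 10-day default are hypothetical choices, while the timestamp format matches the startDate and completionDate values in the query URLs in this thread.

```python
from datetime import datetime, timedelta


def date_windows(start, end, days=10):
    """Yield (startDate, completionDate) string pairs that cover start..end.

    Windows are consecutive and do not overlap; the last one is cut off at
    'end'. The 10-day default is an arbitrary value for this sketch.
    """
    fmt = '%Y-%m-%dT%H:%M:%SZ'
    one_second = timedelta(seconds=1)
    window_start = start
    while window_start <= end:
        # Each window ends one second before the next one begins.
        window_end = min(window_start + timedelta(days=days) - one_second, end)
        yield window_start.strftime(fmt), window_end.strftime(fmt)
        window_start = window_end + one_second
```

Each pair can be substituted into the startDate and completionDate parameters of the finder URL, and the query re-run once per window.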

RE: Automatic Download (using python)?
8 September 2022 7:18 as a reply to Ramon Melser.
Hi Ramon,

Thanks for your feedback. I have tried your suggestion, setting the sensing date range to 2022-06-01 to 2022-06-10 and maxRecords to 100 and even 10, but I am still facing the same issue.
Any further idea on what could be the cause? I have tried both inside and outside a corporate network, in case that could be the cause, but I face the issue in both tests.

By the way, I have tried a much simpler piece of code to test the requests module, and if I run the code below, I get the same error. I wonder if it is a network issue or an error on the API side (which I doubt). Could you confirm that the code posted runs as it is on your side?

import requests

x = requests.get('https://w3schools.com')
print(x.status_code)


ConnectionError: HTTPSConnectionPool(host='finder.creodias.eu', port=443): Max retries exceeded with url: /resto/api/collections/Sentinel2/search.json?maxRecords=10&startDate=2022-06-01T00%3A00%3A00Z&completionDate=2022-06-08T23%3A59%3A59Z&processingLevel=LEVEL1C&geometry=POLYGON((14.573304590504197+52.58552883837547%2C14.683963858632376+52.536196192387564%2C14.647077435922986+52.42387013063967%2C14.573304590504197+52.58552883837547))&sortParam=startDate&sortOrder=descending&status=all&dataset=ESA-DATASET (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000001C30B89D790>: Failed to establish a new connection: getaddrinfo failed'))

RE: Automatic Download (using python)?
5 October 2022 7:10 as a reply to Miguel Castro.
Hi!

I am not technical support, just a user.

The error "getaddrinfo failed" looks to me like something related to the hostname, in particular a problem resolving the hostname:

https://stackoverflow.com/questions/7334199/getaddrinfo-failed-what-does-that-mean

There, they also suggest that proxy settings can cause such problems (proxies are a common cause of trouble when you are in an office...).
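One way to check whether name resolution is really the problem, independent of requests, is to call the resolver directly. This is only a rough sketch: can_resolve and its resolver parameter are hypothetical helper names, socket.getaddrinfo is the standard library call named in the error message, and the proxy address in the comment is a placeholder, not a real server.

```python
import socket


def can_resolve(hostname, port=443, resolver=socket.getaddrinfo):
    """Return True if the hostname resolves to at least one address.

    resolver defaults to socket.getaddrinfo; it is a parameter only so the
    check can be tried with a stand-in function.
    """
    try:
        return len(resolver(hostname, port)) > 0
    except socket.gaierror:
        # This is the failure reported as "getaddrinfo failed".
        return False


# Possible usage (needs network access):
#   print(can_resolve('finder.creodias.eu'))
#
# If the name only resolves through a proxy, requests accepts a proxies
# mapping, for example (placeholder address):
#   requests.get(url, proxies={'https': 'http://proxy.example.com:8080'})
```

If can_resolve returns False for every hostname, the cause is more likely the local DNS or proxy configuration than the CREODIAS API.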