WHY JUPYTER NOTEBOOKS?
Jupyter Notebooks are a convenient way to explore and process data on WEkEO.
You can work with Jupyter Notebooks directly from the WEkEO portal instead of using a virtual machine (VM). The Harmonised Data Access (HDA) API can be used from Jupyter Notebooks, enabling you to query and access data for your processing needs.
Creating a Jupyter Notebook from the portal
First, log in to WEkEO and go to the top right of the screen. Click your username and open the drop-down menu to find the "JupyterHub" option. Once you are on the JupyterHub homepage, click the green "My Server" button. The server starts and you can access your Jupyter Notebooks workspace.
The Jupyter Notebooks Workspace
In JupyterHub there are some default folders and files available to all users, but you can also create new files and folders to organize your workspace.
Default files and folders
Bear in mind that the default folders and files cannot be deleted as you will need them to download datasets. Read the section "Harmonised Data Access (HDA) API in Jupyter Notebooks" to learn more about the procedure to download the datasets.
- samples: this folder contains the file "How-To Guide of the Harmonized Data Access vX.Y.Z.ipynb" which is required to query and access datasets using the Harmonised Data Access (HDA) API.
- products: all the datasets downloaded using Jupyter Notebooks will be stored in this folder.
You can create new folders to store the downloaded datasets and other Jupyter Notebook files (*.ipynb) that you might need to work online with the data. These files and folders are only visible to you; other WEkEO users will not have access to them.
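If you prefer to set up folders from a notebook cell rather than the file browser, a short snippet can do it. The folder names below are hypothetical examples for this sketch, not WEkEO defaults:

```python
from pathlib import Path

# Hypothetical project layout: one folder with subfolders for notebooks
# and for data you want to keep separate from the default "products" folder.
workspace = Path("my_analysis")
for sub in ("notebooks", "data"):
    (workspace / sub).mkdir(parents=True, exist_ok=True)

print(sorted(p.name for p in workspace.iterdir()))  # ['data', 'notebooks']
```

Because `exist_ok=True` is set, the cell can be re-run safely without errors.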
Harmonised Data Access (HDA) API in Jupyter Notebooks
This section provides insights into the "How-To Guide of the Harmonized Data Access vX.Y.Z.ipynb" notebook, which demonstrates the use of the HDA API for querying and accessing datasets.
In order to download the data, you can run each of the cells (sections) separately, or you can run all the steps of the notebook using the "Run" button in the upper menu. Some parts of the notebook need to be modified according to the data that will be downloaded. These changes are only needed in steps 1, 2 and 6 (the dataset ID, the API key and the request payload).
Steps 1 to 8 are required to query the data and list the results. Then, depending on how you want to download the data, you continue with step 9 or with steps 10 and 11.
- To download data locally, go to Step 9: Get Results Download Link.
- To download data in streaming mode, go to Step 10: Download to your Jupyter Notebooks Workspace. This means that your data will be downloaded to the JupyterHub server, not to your computer. If you want to plot these data in the same Jupyter Notebook, continue with Step 11: Access and Plot your Data.
These options are not exclusive; if you would like to do both, run steps 9, 10 and 11.
Step 1: Initialization - WEkEO Endpoints and API Key
In this section, the WEkEO HDA endpoints are set. An example dataset ID is given; change it to the dataset of interest (highlighted in blue in the screenshot).
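As a sketch, the initialization boils down to defining a base URL and a dataset ID. The URL, the dataset ID and the request paths below are placeholders for illustration, not the real WEkEO values; copy the actual ones from Step 1 of the notebook:

```python
# Placeholder configuration: neither the broker URL nor the dataset ID below
# is the real WEkEO value -- take the real ones from the notebook's Step 1.
broker_url = "https://example-wekeo-broker/databroker"
dataset_id = "EO:ESA:DAT:SENTINEL-2"  # illustrative example only

# Later steps build their request URLs from this base; the paths here
# ("/gettoken", "/datarequest") are assumptions for the sketch.
token_url = broker_url + "/gettoken"
query_url = broker_url + "/datarequest"
print(token_url)
print(query_url)
```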
Step 2: Find your API Key
To find your API key, go to the top right part of the screen. Click your username, open the drop-down menu and select Subscriptions. At the end of the page, click the button "Show hidden keys". Go to the section "User credentials" and copy your encoded key. Then paste it into the notebook field #api_key = "<API encoded key>". Remember to uncomment the line by deleting the #.
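For illustration only, an "encoded key" of this kind is typically the Base64 encoding of "username:password". That scheme is an assumption here, and the credentials below are made up; in practice, always copy the key shown on the Subscriptions page rather than building your own:

```python
import base64

def encode_key(username: str, password: str) -> str:
    # Assumed scheme: Base64 of "username:password". Use the key from the
    # Subscriptions page instead of this helper for real requests.
    return base64.b64encode(f"{username}:{password}".encode()).decode()

api_key = encode_key("jane_doe", "s3cret")  # hypothetical credentials
print(api_key)
```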
Step 3: Get Access Token
Run this cell to get the token and proceed to the next step. The access token is valid for one hour.
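A minimal sketch of what this cell does, assuming a "/gettoken" path and a Basic-style Authorization header (both inferred from the notebook's description, not verified against an API reference). The network call itself is left commented so the sketch runs offline:

```python
from urllib import request

def build_token_request(broker_url: str, api_key: str) -> request.Request:
    # Assumption: the broker exchanges the encoded key for a one-hour token
    # at "<broker_url>/gettoken", authenticated with a Basic-style header.
    return request.Request(
        broker_url + "/gettoken",
        headers={"Authorization": "Basic " + api_key},
    )

req = build_token_request("https://example-wekeo-broker/databroker",
                          "<API encoded key>")
print(req.full_url)
# Executing the request (network access required) would look roughly like:
#   import json, urllib.request
#   token = json.load(urllib.request.urlopen(req))["access_token"]  # field name assumed
```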
Step 4: Query Metadata
Once the access token is available, you can query the dataset of interest. Run this cell and proceed to step 5.
Step 5: Accept Terms and Conditions
Run this cell to accept the Copernicus terms and conditions. You can then proceed to the following step.
Step 6: Query your Dataset Products
In this section, the user has to change the data inside the blue square in the screenshot below.
These data can be obtained from the HDA option in the Dataset Navigator. There, select the parameters of interest and open the "Request Payload" tab at the bottom. Copy the payload data into the area highlighted in blue in the Jupyter Notebooks screenshot above.
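For illustration, a request payload is a JSON object along these lines. Every field name and value below is an assumption made for this sketch; always copy the exact payload shown in the Dataset Navigator rather than writing one by hand:

```python
import json

# Illustrative payload only -- field names ("datasetId", "boundingBoxValues",
# "dateRangeSelectValues") and values are assumptions for this sketch.
payload = {
    "datasetId": "EO:ESA:DAT:SENTINEL-2",
    "boundingBoxValues": [
        {"name": "bbox", "bbox": [-1.0, 50.0, 1.0, 52.0]}
    ],
    "dateRangeSelectValues": [
        {"name": "position",
         "start": "2021-01-01T00:00:00.000Z",
         "end": "2021-01-31T00:00:00.000Z"}
    ],
}

print(json.dumps(payload, indent=2))
```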
Step 7: Check job status
After running this cell, you should obtain a response confirming that the job completed successfully. Sometimes, depending on the data to be downloaded, the job takes longer to finish and some 'False' messages may appear before a 'True'. This is not a problem as long as the final message is 'True'. When the job finishes, the number of products available and the number of results that will be downloaded are shown.
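The polling behaviour described above can be sketched with a stubbed status check standing in for the real API call (in the notebook, the callable would be an HTTP request to the broker):

```python
import time

def wait_for_job(check_status, poll_seconds=0.0, max_polls=10):
    """Poll a job-status callable until it reports completion.

    check_status is any zero-argument function returning True when the job
    is done; here it is a stub, not the real HDA status endpoint.
    """
    for attempt in range(1, max_polls + 1):
        done = check_status()
        print(f"poll {attempt}: {done}")
        if done:
            return True
        time.sleep(poll_seconds)
    return False

# Stubbed status sequence standing in for the API: False, False, then True.
statuses = iter([False, False, True])
completed = wait_for_job(lambda: next(statuses))
print("job completed:", completed)  # job completed: True
```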
Step 8: Get Results List
The query results are paginated. Parameters for the page number and the number of results per page can be used to fetch only the necessary results. Pages are numbered from 0 (i.e. the first page is page 0). In the example below, each page contains 5 results (size='5') and the results of the first page (page='0') are shown.
After running this cell, a list with the results that can be downloaded is shown. As explained before, you can either go to step 9 or to step 10 to download the data.
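The paging described in this step can be captured in two small helpers. The parameter names page and size follow the text above; how they are passed to the API is otherwise an implementation detail of the notebook:

```python
def page_params(page: int, size: int = 5) -> dict:
    """Query parameters for one page of results (pages are zero-based)."""
    return {"page": page, "size": size}

def pages_needed(total_results: int, size: int = 5) -> int:
    """How many pages it takes to list all results."""
    return -(-total_results // size)  # ceiling division

print(page_params(0))    # {'page': 0, 'size': 5} -> first page, 5 per page
print(pages_needed(12))  # 12 results at 5 per page -> 3 pages
```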
Step 9: Get Results Download Link
If all the previous steps have worked correctly, you will see a list with all the files to download after running this cell.
You can click on these links in order to download the files locally, in your computer.
Step 10: Download to your Jupyter Notebooks Workspace
In this step, run the two following cells to download the data and store it in the Jupyter Notebooks workspace itself.
Once the download finishes, the data will be available in the folder named products.
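As a sketch of where a result file lands in the workspace: the URL below is hypothetical (real ones come from the results list in step 8), and the actual transfer is left commented since it needs network access:

```python
from pathlib import Path
from urllib.parse import urlparse

def target_path(download_url: str, folder: str = "products") -> Path:
    """Where a result file lands in the workspace (the 'products' folder)."""
    filename = Path(urlparse(download_url).path).name
    return Path(folder) / filename

# Hypothetical result URL for illustration.
url = "https://example-wekeo-broker/downloads/S2_example_product.zip"
print(target_path(url))  # products/S2_example_product.zip

# Inside the notebook, the transfer itself would be roughly:
#   import urllib.request
#   target_path(url).parent.mkdir(exist_ok=True)
#   with urllib.request.urlopen(url) as resp, open(target_path(url), "wb") as out:
#       out.write(resp.read())
```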
Step 11: Access and Plot your Data
Run the cells in this step to open the data downloaded to the products folder and plot it in the notebook. Bear in mind that this requires the data to have been downloaded to the workspace in step 10.
Harmonised Data Access (HDA) API in Jupyter Notebook video demo
This section provides insights into the "sample.ipynb" notebook, which demonstrates the use of the HDA API for querying and accessing datasets.