Authenticated access to the Argo reference database on password protected erddap #256

Merged
gmaze merged 7 commits into master from login-erddap
Apr 26, 2023


Member

@gmaze gmaze commented Apr 14, 2023

In this PR we implement:

  • a login/password authentication method to fetch data from a protected erddap dataset
  • a new data fetcher dedicated to the ship-based CTD database
  • unit tests for the new methods
  • a documentation page

The exact wording used to describe the content returned from the reference database is to be specified later.

@gmaze gmaze self-assigned this Apr 14, 2023
@gmaze gmaze added labels Apr 14, 2023: forQCexpert (Argo QC expertise is required), enhancement (New feature or request), argo-core (About core variables (P, T, S)), argo-deep (About deep variables (anything below 2000db))
Member Author

gmaze commented Apr 14, 2023

Here is how to access a protected dataset using a standard login/password authentication:

import os

import aiohttp
import argopy

async def my_get_client(**kwargs):
    # Create the HTTP session, then log in so that later requests
    # made with this same session are authenticated.
    session = aiohttp.ClientSession(**kwargs)
    payload = {
        "user": os.getenv("ERDDAP_USERNAME"),
        "password": os.getenv("ERDDAP_PASSWORD"),
    }
    async with session.post("https://erddap-val.ifremer.fr/erddap/login.html", data=payload) as resp:
        print(resp.status)
    return session

# Options forwarded to the underlying file system; "get_client" makes it
# use the logged-in session created above.
filesystem_kwargs = {
    "simple_links": True,
    "block_size": 0,
    "client_kwargs": {"trust_env": False},
    "get_client": my_get_client,
}

store = argopy.stores.httpstore(cache=False, **filesystem_kwargs)
data = store.open_json('https://erddap-val.ifremer.fr/erddap/info/Argo-ref-ctd/index.json')

- also new options 'user' and 'password'
- new httpstores: httpstore_erddap, httpstore_erddap_auth
- black8 erddap_data
- fix bug in erddap_data used with ds='ref'
Member Author

gmaze commented Apr 25, 2023

I implemented and integrated the above example in argopy.

The new API is the following:

import argopy
from argopy import CTDRefDataFetcher

box = [15, 30, -70, -60, 0, 5000.0]

with argopy.set_options(user="gmaze", password="***"):
    f = CTDRefDataFetcher(box=box)
    ds = f.to_xarray()

which returns a dataset like this:

<xarray.Dataset>
Dimensions:          (N_POINTS: 223233)
Coordinates:
    LATITUDE         (N_POINTS) float64 -66.0 -66.0 -66.0 ... -70.38 -70.38
    LONGITUDE        (N_POINTS) float64 0.1633 0.1633 0.1633 ... 346.5 346.5
    TIME             (N_POINTS) datetime64[ns] 1999-01-18T06:09:00 ... 1990-1...
  * N_POINTS         (N_POINTS) int64 11968 11969 11970 ... 11965 11966 11967
Data variables:
    PRES             (N_POINTS) float64 4.0 6.0 8.0 ... 2.96e+03 2.962e+03
    PSAL             (N_POINTS) float64 34.06 34.06 34.06 ... 34.66 34.66 34.66
    PTMP             (N_POINTS) float64 0.5678 0.5666 0.5623 ... -0.3491 -0.3482
    QCLEVEL          (N_POINTS) <U1 'G' 'G' 'G' 'G' 'G' ... 'G' 'G' 'G' 'G' 'G'
    EXPOCODE         (N_POINTS) object '06ANTXVI' '06ANTXVI' ... '06AQANTIX'
    TEMP             (N_POINTS) float64 0.568 0.5668 0.5626 ... -0.173 -0.172
    DIRECTION        (N_POINTS) <U1 'A' 'A' 'A' 'A' 'A' ... 'A' 'A' 'A' 'A' 'A'
    CYCLE_NUMBER     (N_POINTS) int32 0 0 0 0 0 0 0 0 0 0 ... 8 8 8 8 8 8 8 8 8
    PLATFORM_NUMBER  (N_POINTS) int32 900000 900000 900000 ... 900007 900007
Attributes: (7)

The raw data only provide the EXPOCODE, so we complement the dataset with:

  • a fake PLATFORM_NUMBER (starting from 900000), computed by iterating over the unique values of EXPOCODE. This is not a real WMO number.
  • a fake CYCLE_NUMBER, computed by iterating over the unique TIME values of the samples of each EXPOCODE. This is not the station number, but the profile number within that EXPOCODE.

The goal of complementing the dataset with these 2 new variables is to be able to manipulate it with the argo accessor, which is quite useful to convert the collection of points into a collection of profiles.
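
The numbering described above can be illustrated with a minimal pure-Python sketch. This is not the argopy implementation: the function name add_fake_identifiers and the list-of-dicts input are hypothetical, and it assumes identifiers are assigned in order of first appearance, with CYCLE_NUMBER starting at 0 as in the dataset shown above.

```python
def add_fake_identifiers(points, first_platform=900000):
    """Return a copy of `points` with fake PLATFORM_NUMBER and CYCLE_NUMBER.

    `points` is a list of dicts holding at least 'EXPOCODE' and 'TIME'
    (hypothetical layout, one dict per sample).
    """
    platform_of = {}  # EXPOCODE -> fake PLATFORM_NUMBER
    cycles_of = {}    # EXPOCODE -> {TIME -> fake CYCLE_NUMBER}
    out = []
    for point in points:
        code, time = point["EXPOCODE"], point["TIME"]
        if code not in platform_of:
            # One fake platform per unique EXPOCODE, counted from 900000
            platform_of[code] = first_platform + len(platform_of)
            cycles_of[code] = {}
        cycles = cycles_of[code]
        if time not in cycles:
            # One fake cycle per unique TIME within this EXPOCODE
            cycles[time] = len(cycles)
        out.append({
            **point,
            "PLATFORM_NUMBER": platform_of[code],
            "CYCLE_NUMBER": cycles[time],
        })
    return out


samples = [
    {"EXPOCODE": "06ANTXVI", "TIME": "1999-01-18T06:09:00"},
    {"EXPOCODE": "06ANTXVI", "TIME": "1999-01-18T06:09:00"},
    {"EXPOCODE": "06ANTXVI", "TIME": "1999-01-19T02:00:00"},
    {"EXPOCODE": "06AQANTIX", "TIME": "1990-11-20T10:00:00"},
]
for row in add_fake_identifiers(samples):
    print(row["EXPOCODE"], row["PLATFORM_NUMBER"], row["CYCLE_NUMBER"])
```

Samples that share an EXPOCODE and a TIME end up with the same (PLATFORM_NUMBER, CYCLE_NUMBER) pair, which is what allows points to be grouped into profiles.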

The new CTDRefDataFetcher behaves nearly like the Argo data facade.
There are no "mode" or "ds" options to provide; they are not needed for the Argo ship-based reference CTD database.

The new options user and password can be set temporarily in a context or permanently at the beginning of the session.
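
A minimal sketch of both ways of setting the credentials, assuming they are stored in the ERDDAP_USERNAME and ERDDAP_PASSWORD environment variables used in the earlier comment. The three helper names are hypothetical, and the permanent variant assumes argopy.set_options can be called outside a with block, as the sentence above describes.

```python
import os


def erddap_credentials(env=None):
    """Read the erddap login from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    user = env.get("ERDDAP_USERNAME")
    password = env.get("ERDDAP_PASSWORD")
    if not user or not password:
        raise RuntimeError("Set ERDDAP_USERNAME and ERDDAP_PASSWORD first")
    return {"user": user, "password": password}


def fetch_reference_ctd(box, credentials):
    """Temporary setting: credentials only apply inside the context."""
    # Imported lazily so this module loads even where argopy is not installed
    import argopy
    from argopy import CTDRefDataFetcher

    with argopy.set_options(**credentials):
        return CTDRefDataFetcher(box=box).to_xarray()


def set_session_credentials(credentials):
    """Permanent setting: credentials apply for the rest of the session."""
    import argopy

    argopy.set_options(**credentials)
```

Reading the login from the environment keeps the password out of notebooks and scripts; a call such as fetch_reference_ctd(box, erddap_credentials()) then needs a valid account and network access to the protected server.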

@gmaze gmaze merged commit 5b0cb72 into master Apr 26, 2023
@gmaze gmaze deleted the login-erddap branch April 26, 2023 10:36