Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate to ADS-beta #46

Merged
merged 3 commits into from
Sep 23, 2024
Merged

Migrate to ADS-beta #46

merged 3 commits into from
Sep 23, 2024

Conversation

aethr
Copy link
Contributor

@aethr aethr commented Sep 16, 2024

Description

Resolves #19 .

The Copernicus CDS and ADS systems are being migrated to a new architecture, and all clients must update to use the new "ADS beta" and "CDS beta" before the legacy system is decomissioned on 26/09/2024.

This change in the code is accompanied by a new account being created, and the CDSAPI_ADS_KEY Organisation secret in Github Actions being replaced with new credentials.

Checklist

Please confirm that this pull request has done the following:

  • Tests passing
  • Documentation added (where applicable)

Notes

@aethr aethr self-assigned this Sep 16, 2024
@aethr
Copy link
Contributor Author

aethr commented Sep 17, 2024

It appears the GFAS NetCDF file has changed somewhat in the new API. Where previously it appeared to be NetCDF3 generated by grib_to_netcdf-2.25.1, it is now NetCDF4 generated by GRIB to CDM+CF via cfgrib-0.9.14.0/ecCodes-2.36.0.

This includes a structural change to the time attribute, which is now valid_time.

Original gfas-download.nc (in Panoply):

netcdf ../gfas-download.legacy.nc {
  dimensions:
    longitude = 3600;
    latitude = 1800;
    time = 1;
  variables:
    float longitude(longitude=3600);
      :units = "degrees_east";
      :long_name = "longitude";

    float latitude(latitude=1800);
      :units = "degrees_north";
      :long_name = "latitude";

    int time(time=1);
      :units = "hours since 1900-01-01 00:00:00.0";
      :long_name = "time";
      :calendar = "gregorian";

    short ch4fire(time=1, latitude=1800, longitude=3600);
      :scale_factor = 5.874831290468975E-13; // double
      :add_offset = 1.924947220635064E-8; // double
      :_FillValue = -32767S; // short
      :missing_value = -32767S; // short
      :units = "kg m**-2 s**-1";
      :long_name = "Wildfire flux of Methane";

  // global attributes:
  :Conventions = "CF-1.6";
  :history = "2024-09-16 15:23:15 GMT by grib_to_netcdf-2.25.1: /opt/ecmwf/mars-client/bin/grib_to_netcdf.bin -S param -o /cache/data8/adaptor.mars_constrained.internal-1726500194.713318-15307-14-8af58a44-8084-407d-91a9-afada1eaf3d9.nc /cache/tmp/8af58a44-8084-407d-91a9-afada1eaf3d9-adaptor.mars_constrained.internal-1726500193.5876522-15307-24-tmp.grib";
}

New gfas-download.nc from ADS-beta:

netcdf ../gfas-download.beta.nc {
  dimensions:
    latitude = 1800;
    longitude = 3600;
    valid_time = 1;
  variables:
    double latitude(latitude=1800);
      :_FillValue = NaN; // double
      :units = "degrees_north";
      :standard_name = "latitude";
      :long_name = "latitude";
      :stored_direction = "decreasing";

    float ch4fire(valid_time=1, latitude=1800, longitude=3600);
      :_FillValue = NaNf; // float
      :GRIB_paramId = 210082L; // long
      :GRIB_dataType = "ga";
      :GRIB_numberOfPoints = 6480000L; // long
      :GRIB_typeOfLevel = "surface";
      :GRIB_stepUnits = 1L; // long
      :GRIB_stepType = "avg";
      :GRIB_gridType = "regular_ll";
      :GRIB_uvRelativeToGrid = 0L; // long
      :GRIB_NV = 0L; // long
      :GRIB_Nx = 3600L; // long
      :GRIB_Ny = 1800L; // long
      :GRIB_cfName = "unknown";
      :GRIB_cfVarName = "ch4fire";
      :GRIB_gridDefinitionDescription = "Latitude/Longitude Grid";
      :GRIB_iDirectionIncrementInDegrees = 0.1; // double
      :GRIB_iScansNegatively = 0L; // long
      :GRIB_jDirectionIncrementInDegrees = 0.1; // double
      :GRIB_jPointsAreConsecutive = 0L; // long
      :GRIB_jScansPositively = 0L; // long
      :GRIB_latitudeOfFirstGridPointInDegrees = 89.95; // double
      :GRIB_latitudeOfLastGridPointInDegrees = -89.95; // double
      :GRIB_longitudeOfFirstGridPointInDegrees = 0.05; // double
      :GRIB_longitudeOfLastGridPointInDegrees = 359.95; // double
      :GRIB_missingValue = 3.4028234663852886E38; // double
      :GRIB_name = "Wildfire flux of Methane";
      :GRIB_shortName = "ch4fire";
      :GRIB_totalNumber = 0L; // long
      :GRIB_units = "kg m**-2 s**-1";
      :long_name = "Wildfire flux of Methane";
      :units = "kg m**-2 s**-1";
      :standard_name = "unknown";
      :GRIB_number = 0L; // long
      :GRIB_surface = 0.0; // double
      :coordinates = "valid_time latitude longitude";
      :_ChunkSizes = 1U, 57U, 225U; // uint

    double longitude(longitude=3600);
      :_FillValue = NaN; // double
      :units = "degrees_east";
      :standard_name = "longitude";
      :long_name = "longitude";

    long valid_time(valid_time=1);
      :standard_name = "time";
      :units = "seconds since 1970-01-01";
      :calendar = "proleptic_gregorian";
      :long_name = "time";

  // global attributes:
  :GRIB_centre = "ecmf";
  :GRIB_centreDescription = "European Centre for Medium-Range Weather Forecasts";
  :GRIB_subCentre = 0L; // long
  :Conventions = "CF-1.7";
  :institution = "European Centre for Medium-Range Weather Forecasts";
  :history = "2024-09-17T00:33 GRIB to CDM+CF via cfgrib-0.9.14.0/ecCodes-2.36.0 with {\"source\": \"data.grib\", \"filter_by_keys\": {\"typeOfLevel\": \"surface\"}, \"encode_cf\": [\"parameter\", \"time\", \"geography\", \"vertical\"]}";
}

Note the valid_time variable has changed to :units = "seconds since 1970-01-01"; where the original time variable was :units = "hours since 1900-01-01 00:00:00.0";.

@aethr
Copy link
Contributor Author

aethr commented Sep 17, 2024

The change in format from time to valid_time isn't a problem itself, as we use nc.num2date(ncin.variables["time"][:], ncin.variables["time"].getncattr("units")) to convert values to a universal format. However, it appears there might be a bug in CDS itself, as the valid_time values seem to be offset from what's actually being requested.

Using the legacy API, fetching data for a period from 2023-01-01 to 2023-01-02:

$ poetry run python src/openmethane_prior/layers/omGFASEmis.py --start-date 2023-01-01 --end-date 2023-01-02

CAMS request:

c.retrieve(
        "cams-global-fire-emissions-gfas",
        {
            "date": "2023-01-01/2023-01-02",
            "format": "netcdf",
            "variable": [
                "wildfire_flux_of_methane",
            ],
        },
        file_name,
    )

Fetched date range is correct, going from the 2023-01-01 to 2023-01-02:

[cftime.DatetimeGregorian(2023, 1, 1, 0, 0, 0, 0, has_year_zero=False)
 cftime.DatetimeGregorian(2023, 1, 2, 0, 0, 0, 0, has_year_zero=False)]

Running the same command but using the new API, generates the same CDS request, but the valid_time range ends up as:

[cftime.DatetimeGregorian(2023, 1, 2, 0, 0, 0, 0, has_year_zero=False)
 cftime.DatetimeGregorian(2023, 1, 3, 0, 0, 0, 0, has_year_zero=False)]

@aethr
Copy link
Contributor Author

aethr commented Sep 17, 2024

I've raised a support request about the off-by-one bug in the ADS beta: https://jira.ecmwf.int/plugins/servlet/desk/portal/1/CUS-26007

@prayner
Copy link
Contributor

prayner commented Sep 17, 2024 via email

@aethr
Copy link
Contributor Author

aethr commented Sep 17, 2024

I've made some fixes that allowed me to re-enable the test_gfas_emis test, which is now PASSING. However, I'm not familiar enough with the test to understand if the checks it's doing are meaningful.

The fact that it passes despite the dates in the fetching GFAS data being wrong might be an issue.

The fact that the e2e om_prior tests also pass is good from a mechanical point of view, since I believe that means the changes in this PR are successful in integrating the new data source. But it may be bad from a QA point of view if the data we're fetching comes back with dates that don't match the rest of the data being processed, and this doesn't cause any alarms.

In the short term, I believe we should wait to merge this until we hear back from the ADS team about the off-by-one issue. If we must merge this to address the legacy API deprecation on Sept 26, then we may want to shift our ADS API request by one day, and then assert on the dates that we get back so that we can be alerted when/if their bug is fixed.

@aethr aethr requested review from lewisjared and prayner September 17, 2024 03:58
Comment on lines +56 to +57
downloadPath = pathlib.Path(file_name);
downloadPath.parent.mkdir(parents=True, exist_ok=True)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cdsapi client is not smart enough to create the containing path before attempting to download. This fixes an issue when running the test_gfas_emis test or running omGFASEmis.py standalone in a path with no intermediates folder.

This folder prep might be better handled in the config but I wanted to touch as little as possible.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats fine. This is a noop if the directory already exists

@@ -84,7 +89,8 @@ def processEmissions(config: PriorConfig, startDate, endDate, forceUpdate: bool
gfas_file = download_GFAS(
startDate, endDate, file_name=config.as_intermediate_file("gfas-download.nc")
)
ncin = nc.Dataset(gfas_file, "r", format="NETCDF3")
ncin = nc.Dataset(gfas_file, "r", format="NETCDF4")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new dataset format is NetCDFv4.

Comment on lines 213 to 220
dates.append(startDate + datetime.timedelta(days=i))
dates.append(gfasTimes[i])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change could be problematic. The previous version iterated based on the provided startDate which would presumably align with the startDate supplied to other layers. However this didn't appear to work even with the legacy ADS API when running in test scenarios.

This change uses the dates of the data actually returned by the cdsapi call, but there's a risk that these dates are out of alignment from the datasets used by every other layer, if there are bugs in the cdsapi (which I think there are).

@aethr
Copy link
Contributor Author

aethr commented Sep 17, 2024

Rersponse from Kevin Marsh @ ECMWF Support:


I think these variables are accumulated over 24 hours, so looking at the original (GRIB) data for your request :

% grib_ls 32e348309ddfe25309b35ad2f8327b13.grib

32e348309ddfe25309b35ad2f8327b13.grib

edition centre typeOfLevel level dataDate stepRange dataType shortName packingType gridType

1 ecmf surface 0 20230101 0-24 ga ch4fire grid_simple regular_ll

1 ecmf surface 0 20230102 0-24 ga ch4fire grid_simple regular_ll

2 of 2 messages in 32e348309ddfe25309b35ad2f8327b13.grib

So the valid time' is set to the end of the 24 hour period (i.e. 0000 on the following day)

2 of 2 total messages in 1 files

% grib_ls -p validityDate,validityTime 32e348309ddfe25309b35ad2f8327b13.grib

32e348309ddfe25309b35ad2f8327b13.grib

validityDate validityTime

20230102 0

20230103 0

2 of 2 messages in 32e348309ddfe25309b35ad2f8327b13.grib

And this is how the data are encoded in the netCDF file:

2 of 2 total messages in 1 files

% cdo info 7146ff2cddb82ae5c09a7705b64cd2cd.nc

-1 :       Date     Time   Level Gridsize    Miss :     Minimum        Mean     Maximum : Parameter ID

 1 : 2023-01-02 00:00:00       0  6480000       0 :      0.0000  2.5515e-13  3.8500e-08 : -1            

 2 : 2023-01-03 00:00:00       0  6480000       0 :      0.0000  2.8209e-13  3.8500e-08 : -1            

cdo info: Processed 12960000 values from 1 variable over 2 timesteps [0.15s 83MB].

So the time (GMT) is set to the end of the 24 hour period (rather than the start/middle)

@aethr
Copy link
Contributor Author

aethr commented Sep 17, 2024

So it appears there isn't a bug per-se, we're getting the data we've requested, it's just a change in behaviour from the previous system which encodes the end of the time period in valid_time instead of the beginning, and the unfortunate effect of using midnight instead of 23:59:59 is that the date moves into the next day.

Knowing this, do we need to time-shift the returned dates back by one so they match the rest of our data?

@prayner
Copy link
Contributor

prayner commented Sep 17, 2024 via email

@prayner
Copy link
Contributor

prayner commented Sep 18, 2024 via email

@aethr
Copy link
Contributor Author

aethr commented Sep 18, 2024

I haven't completed the change to shift the dates back by one day so they match the dates requested.

Then I think we need to action similar PRs in the other repos that use ADS/CDS data and do some checking of the return dates on those as well to verify the behaviour.

We also need to think about how we'll merge the changes. If we want to update the API keys in Github, we'll need to coordinate merges to all the repos at the same time, and then merge main back into any open PRs/branches. Or we could create new secrets for the new API credentials and eventually deprecate the old ones.

@aethr
Copy link
Contributor Author

aethr commented Sep 19, 2024

@prayner and I paired to update this branch and shift the dates from GFAS to match the requested dates. All tests are passing locally, they're only failing in CI because the secret still contains the legacy key.

Copy link
Collaborator

@lewisjared lewisjared left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I see you just reran the tests so if they pass then we should be fine

@aethr aethr merged commit 665edee into main Sep 23, 2024
8 checks passed
@aethr aethr deleted the update-ads-api branch September 23, 2024 02:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Evaluate impact of updates to CDS
3 participants