assess the WIS 1.0 catalogue for availability of NWP data #103

Open
8 tasks done
tomkralidis opened this issue May 17, 2021 · 12 comments
@tomkralidis
Collaborator

tomkralidis commented May 17, 2021

cc @efucile

In light of the forthcoming WMO data policy, assess the WIS 1.0 catalogue to determine the availability of NWP data on WIS.

  • download WIS catalogue from a given GISC via OAI-PMH or zip bundle
  • detect NWP data patterns in WCMP documents (GRIB2/FM-92 format identification, etc.)
    • look for FM-92
    • look for keywords
    • organize by originating centre
  • report/group by country, RA, etc.
  • determine workflow (OAI/SRU queries, download catalogue and run against local files?)
  • determine where to implement (as a subcommand in pywcmp, or perhaps we need a pywis package, thoughts?)
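As a starting point for the "detect NWP data patterns" task, here is a minimal sketch that scans a single WCMP (ISO 19139) record for format names and keywords that mention GRIB, FM 92 or NWP. The search pattern, the choice of elements to inspect and the function names are assumptions for illustration, not an agreed detection rule.

```python
import re
import xml.etree.ElementTree as ET

# ISO 19139 namespaces used by WCMP 1.x records
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hypothetical search terms; tune these against real records
NWP_PATTERN = re.compile(r"\b(FM[ -]?92|GRIB2?|NWP)\b", re.IGNORECASE)


def character_strings(root, path):
    """Return the non-empty gco:CharacterString values found under path."""
    values = []
    for node in root.findall(path + "/gco:CharacterString", NS):
        if node.text and node.text.strip():
            values.append(node.text.strip())
    return values


def nwp_hits(xml_bytes):
    """Return format names and keywords that look NWP-related."""
    root = ET.fromstring(xml_bytes)
    candidates = (
        character_strings(root, ".//gmd:MD_Format/gmd:name")
        + character_strings(root, ".//gmd:MD_Keywords/gmd:keyword")
    )
    return [value for value in candidates if NWP_PATTERN.search(value)]
```

A record would count as a candidate NWP record when `nwp_hits` returns a non-empty list; titles and abstracts could be added to the candidates in the same way.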
@tomkralidis tomkralidis added the question Further information is requested label May 17, 2021
@tomkralidis
Collaborator Author

Update 2021-05-21: @efucile will provide a list of centers providing NWP data.

@josusky and @tomkralidis to develop MVP.

@tomkralidis tomkralidis self-assigned this May 21, 2021
@efucile
Member

efucile commented Jun 2, 2021

@tomkralidis and @josusky, this is a list of GDPFS centres. One column indicates which of them are supposed to provide data in WIS; it can be used to check which centres in the list also provide data in the catalogue.
GDPFS Centers

@josusky
Contributor

josusky commented Jun 3, 2021

Just for the record, the whole WIS metadata catalogue can be downloaded from https://gisc.dwd.de/oaidownload/wis-catalogue.tar.gz. I guess that can be considered an authoritative source :-)
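For working against local files, a minimal sketch of fetching and unpacking that archive could look like this. It assumes the tarball contains the records as `.xml` files; the internal layout of the archive is not described in this thread, and the function names are illustrative.

```python
import shutil
import tarfile
import urllib.request
from pathlib import Path

CATALOGUE_URL = "https://gisc.dwd.de/oaidownload/wis-catalogue.tar.gz"


def download(url, target):
    """Stream the archive to disk instead of holding it in memory."""
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return Path(target)


def extract_xml(archive, outdir):
    """Extract regular .xml members only, flattening paths to avoid traversal."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    written = []
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar:
            if not (member.isfile() and member.name.endswith(".xml")):
                continue
            # Assumption: file names are unique across the archive
            dest = outdir / Path(member.name).name
            with tar.extractfile(member) as src, open(dest, "wb") as out:
                shutil.copyfileobj(src, out)
            written.append(dest)
    return written
```

Typical use would be `extract_xml(download(CATALOGUE_URL, "wis-catalogue.tar.gz"), "catalogue")`. Writing each member out by basename avoids trusting paths stored in the archive, at the cost of overwriting files if two members share a name.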

@tomkralidis
Collaborator Author

Thanks @josusky. To clarify, is this the result of DWD GISC harvest of all GISCs? How often is the .tar.gz updated? cc @jsieland

@jsieland
Contributor

jsieland commented Jun 4, 2021

> To clarify, is this the result of DWD GISC harvest of all GISCs? How often is the .tar.gz updated? cc @jsieland

Yes, it's everything. At https://gisc.dwd.de/wisportal/# (see "Metadata") you can choose between everything, only GISC Toulouse, or only GISC Offenbach. It should be updated every night.

@josusky
Contributor

josusky commented Jun 4, 2021

I am pasting here part of my discussion with Tom over Slack:
When I go to https://gisc.dwd.de/wisportal/# and search for "NWP model GRIB", I get (allegedly) 110978 results. A more machine-friendly approach is to use SRU:
https://sru.dwd.de/SRU2JDBC/sru?operation=searchRetrieve&version=1.1&startRecord=1&maximumRecords=5&query=title%20=%20model%20and%20abstract%20=%20NWP&stylesheet=xsl/dwd-sru.xsl&x-dwd-stylesheetDetailLevel=1
Surprisingly, this finds only 2985 matches (the request definitely needs tuning). Such a request could be issued from a script. The number of records returned per request is usually limited, so we would need to repeat the request with an advancing start record.
The alternative is to run a script locally on the catalogue retrieved as the *.tar.gz. That might be faster, but it requires our own implementation of a search algorithm.
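The repeated SRU requests could be generated along these lines. This is a sketch only: the page size of 100 is an assumption (the server's actual limit is not stated here), the total would come from the `numberOfRecords` element of the first response, and the stylesheet parameters from the URL above are left out so that the raw XML is returned.

```python
from urllib.parse import quote, urlencode

SRU_ENDPOINT = "https://sru.dwd.de/SRU2JDBC/sru"


def sru_page_urls(query, total, page_size=100):
    """Yield one searchRetrieve URL per page; SRU startRecord is 1-based."""
    for start in range(1, total + 1, page_size):
        params = {
            "operation": "searchRetrieve",
            "version": "1.1",
            "startRecord": start,
            "maximumRecords": page_size,
            "query": query,
        }
        # quote_via=quote encodes spaces as %20 rather than '+'
        yield SRU_ENDPOINT + "?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    for url in sru_page_urls("title = model and abstract = NWP", total=2985):
        print(url)
```

Each URL can then be fetched with `urllib.request.urlopen` and the records parsed from the response.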

@tomkralidis
Collaborator Author

Aside: I've captured the WIS catalogue location information in the wiki for future use: https://github.com/wmo-im/wcmp/wiki/WISMetadataCatalogue#wis-10-metadata-catalogue

Please feel free to update.

@tomkralidis
Collaborator Author

An initial implementation can be found in https://github.com/wmo-im/pywiscat; @josusky to review/update accordingly, at which point we'll be able to provide an analysis of NWP data.

@antje-s

antje-s commented Jun 7, 2021

The result counts differ because the search queries differ.

SRU request:
https://sru.dwd.de/SRU2JDBC/sru?operation=searchRetrieve&version=1.1&startRecord=1&maximumRecords=5&query=title%20=%20model%20and%20abstract%20=%20NWP&stylesheet=xsl/dwd-sru.xsl&x-dwd-stylesheetDetailLevel=1
--> title=model AND abstract=NWP

Portal search at https://gisc.dwd.de/ for "NWP model GRIB"
--> 111 023 matches
This searches all indexed fields for "NWP" OR "model" OR "GRIB".

Portal search for "NWP AND model AND GRIB" (still over all indexed fields)
--> 24 957 matches

[Solr-supported operators: AND (alternative symbol: &&), NOT (alternative symbol: !), OR (alternative symbol: ||; this is the default)]
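To illustrate why the default OR returns so many more matches than an explicit AND, here is a toy example over four made-up records. It only mimics the set logic with whole-word matching; it is not how Solr or the portal is implemented.

```python
# Hypothetical records, invented for this illustration
RECORDS = {
    "rec-1": "NWP model output in GRIB",
    "rec-2": "Climate model documentation",
    "rec-3": "GRIB edition 2 tables",
    "rec-4": "Surface observations in BUFR",
}


def search(terms, operator="OR"):
    """Return IDs of records whose text contains any (OR) or all (AND) terms."""
    combine = any if operator == "OR" else all
    return sorted(
        record_id
        for record_id, text in RECORDS.items()
        if combine(term.lower() in text.lower().split() for term in terms)
    )


print(search(["NWP", "model", "GRIB"]))         # ['rec-1', 'rec-2', 'rec-3']
print(search(["NWP", "model", "GRIB"], "AND"))  # ['rec-1']
```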

Our new REST API in the WIS Portal may also be of interest. A first version of the documentation is available at
https://gisc-test.dwd.de/restapi.html
If you have any questions, we will be happy to help.

@antje-s

antje-s commented Jun 7, 2021

Correction: /search/startSearch --> GET

@josusky
Contributor

josusky commented Jun 7, 2021

Hi, a first, very crude test (looking only for the word "GRIB") yielded the following result (non-ASCII characters are escaped in the JSON output):

{
    "NMC FRANCE - M\u00e9t\u00e9o-France": 1531,
    "NMC UNITED KINGDOM - Met Office": 15069,
    "ECMWF": 2755,
    "GISC Tokyo - Japan Meteorological Agency": 21372,
    "Max-Planck-Institut fuer Meteorologie": 46,
    "Deutscher Wetterdienst": 1246,
    "National Meteorological Information Center, CMA": 126,
    "WMO Lead Centre for Long-Range Forecast Multi-Model Ensemble": 120,
    "FSBE \u00abAviamettelecom of Roshydromet\u00bb": 2182,
    "Deutsches Klimarechenzentrum": 8,
    "Japan Meteorological Agency": 193,
    "OSI SAF": 28,
    "European Centre for Medium-Range Weather Forecasts": 7,
    "Forschungszentrum Karlsruhe": 3,
    "Commonwealth Scientific & Industrial Research Organisation": 4,
    "University of Hohenheim": 2,
    "Canadian Centre for Climate Modelling and Analysis": 2,
    "NOAA": 1,
    "EUMETSAT": 25,
    "H SAF": 14,
    "WMO/WIS/GISC Tokyo": 10,
    "Deutscher Wetterdienst (RD)": 2,
    "Institute for Meteorology, Freie Universit\u00e4t Berlin": 1,
    "DCPC-Adriatic Marine Meteorological Centre": 8,
    "ZAMG - Central Institute for Meteorology and Geodynamics": 2,
    "Agenzia Regionale Preventione e Ambiente dell'Emilia-Romagna": 2,
    "Deutscher Wetterdienst (ZAK)": 2,
    "CNMCA (Pratica di Mare)": 2,
    "NMC BULGARIA - National Institute of Meteorology and Hydrology": 3,
    "University of Toulouse": 2,
    "Max-Planck-Institut fuer Meteorologie (MD)": 5,
    "National Institute for Environmental Studies": 6,
    "Met Office Hadley Centre": 4,
    "Institut f\u00fcr Meteorologie der Freien Universit\u00e4t Berlin": 3,
    "Instituto Nacional de Meteorologia": 1,
    "Centre National de Recherches M\u00e9t\u00e9orologiques": 2,
    "Istituto Superiore per la Protezione e la Ricerca Ambientale (ex APAT)": 2,
    "Met Office": 2,
    "Institute of Atmospheric Sciences and Climate": 2,
    "Meteo-France": 1,
    "Geophysical Fluid Dynamics Laboratory/NOAA": 2,
    "Environment Canada": 2,
    "WMO/WIS/DCPC Tokyo (Global Producing Centre for long-range forecast)": 1,
    "Federal Office of Meteorology and Climatology MeteoSwiss": 2,
    "South East European Virtual Climate Change Center (SEEVCCC)": 3,
    "ARPA-Servizio IdroMeteorologico": 2,
    "Agenzia Regionale per la Protezione dell'Ambiente Ligure": 1,
    "Helmholtz-Zentrum Geesthacht, Zentrum f\u00fcr Material- und K\u00fcstenforschung GmbH": 1,
    "WMO": 1,
    "DCPC Rome (RTH)": 1,
    "Czech hydrometeorological institude": 1,
    "NMC KENYA - Kenya Meteorological Department": 1
}

This shows several potential issues. Several centers are listed more than once; for example, "ECMWF" is obviously the same as "European Centre for Medium-Range Weather Forecasts". Granularity is again an issue: JMA published 21372 records while DWD published "only" 1246, but that does not mean that JMA is doing its job an order of magnitude better :-)
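One way to handle the duplicate names would be a hand-maintained alias table applied to the counts before reporting. The sketch below is an assumption about how that might look; only the ECMWF pair is confirmed in this thread, and any further aliases would need review by someone who knows the centres.

```python
from collections import Counter

# Assumed alias table: maps a variant name to the preferred name
ALIASES = {
    "European Centre for Medium-Range Weather Forecasts": "ECMWF",
}


def merge_aliases(counts, aliases=ALIASES):
    """Sum the counts of all names that map to the same preferred name."""
    merged = Counter()
    for name, count in counts.items():
        merged[aliases.get(name, name)] += count
    return dict(merged)


counts = {
    "ECMWF": 2755,
    "European Centre for Medium-Range Weather Forecasts": 7,
    "Deutscher Wetterdienst": 1246,
}
print(merge_aliases(counts))  # {'ECMWF': 2762, 'Deutscher Wetterdienst': 1246}
```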

@josusky
Contributor

josusky commented Jun 25, 2021

@efucile, sorry for the delay. After our last teleconference I ran an updated version of pywiscat (which groups the output by the citation authority extracted from the file identifier URI) but forgot to publish the result. Here it is (this time UTF-8 encoded, so the non-ASCII characters are more readable):

{
   "" : {
      "ARPA-Servizio IdroMeteorologico" : 2,
      "Agenzia Regionale Preventione e Ambiente dell'Emilia-Romagna" : 2,
      "Agenzia Regionale per la Protezione dell'Ambiente Ligure" : 1,
      "CNMCA (Pratica di Mare)" : 2,
      "Canadian Centre for Climate Modelling and Analysis" : 2,
      "Centre National de Recherches Météorologiques" : 2,
      "Commonwealth Scientific & Industrial Research Organisation" : 4,
      "Deutscher Wetterdienst" : 1,
      "Deutscher Wetterdienst (RD)" : 2,
      "Deutscher Wetterdienst (ZAK)" : 2,
      "Deutsches Klimarechenzentrum" : 8,
      "Environment Canada" : 2,
      "Federal Office of Meteorology and Climatology MeteoSwiss" : 2,
      "Forschungszentrum Karlsruhe" : 3,
      "Geophysical Fluid Dynamics Laboratory/NOAA" : 2,
      "Helmholtz-Zentrum Geesthacht, Zentrum für Material- und Küstenforschung GmbH" : 1,
      "Institut für Meteorologie der Freien Universität Berlin" : 3,
      "Institute for Meteorology, Freie Universität Berlin" : 1,
      "Institute of Atmospheric Sciences and Climate" : 2,
      "Instituto Nacional de Meteorologia" : 1,
      "Istituto Superiore per la Protezione e la Ricerca Ambientale (ex APAT)" : 2,
      "Max-Planck-Institut fuer Meteorologie" : 46,
      "Max-Planck-Institut fuer Meteorologie (MD)" : 5,
      "Met Office Hadley Centre" : 4,
      "National Institute for Environmental Studies" : 6,
      "University of Hohenheim" : 2,
      "University of Toulouse" : 2,
      "ZAMG - Central Institute for Meteorology and Geodynamics" : 2
   },
   "cn.cma.wmc-bj" : {
      "National Meteorological Information Center, CMA" : 126
   },
   "cz.chmi.dcpc" : {
      "Czech hydrometeorological institude" : 1
   },
   "de.dwd.gpc" : {
      "Deutscher Wetterdienst" : 16
   },
   "fr.meteo" : {
      "NMC FRANCE - Météo-France" : 369
   },
   "fr.meteo.dcpc-copernicus" : {
      "ECMWF" : 905
   },
   "fr.meteo.dcpc-eer" : {
      "NMC FRANCE - Météo-France" : 7
   },
   "fr.meteo.dcpc-lrf" : {
      "NMC FRANCE - Météo-France" : 2
   },
   "fr.meteo.dcpc-nwp" : {
      "NMC FRANCE - Météo-France" : 64
   },
   "hr.ammc.dcpc" : {
      "DCPC-Adriatic Marine Meteorological Centre" : 8
   },
   "int.ecmwf" : {
      "ECMWF" : 116,
      "European Centre for Medium-Range Weather Forecasts" : 7
   },
   "int.eumetsat" : {
      "ECMWF" : 2,
      "EUMETSAT" : 25,
      "H SAF" : 14,
      "Met Office" : 2,
      "Meteo-France" : 1,
      "NOAA" : 1,
      "OSI SAF" : 28,
      "WMO" : 1
   },
   "int.wmo.wis" : {
      "Deutscher Wetterdienst" : 1229,
      "ECMWF" : 1732,
      "FSBE «Aviamettelecom of Roshydromet»" : 2182,
      "GISC Tokyo - Japan Meteorological Agency" : 21356,
      "Japan Meteorological Agency" : 193,
      "NMC BULGARIA - National Institute of Meteorology and Hydrology" : 3,
      "NMC FRANCE - Météo-France" : 1089,
      "NMC KENYA - Kenya Meteorological Department" : 1,
      "NMC UNITED KINGDOM - Met Office" : 15069
   },
   "it.meteoam.dcpc" : {
      "DCPC Rome (RTH)" : 1
   },
   "jp.go.jma.wis.dcpc-geogr" : {
      "GISC Tokyo - Japan Meteorological Agency" : 16
   },
   "jp.go.jma.wis.dcpc-gpc" : {
      "WMO/WIS/DCPC Tokyo (Global Producing Centre for long-range forecast)" : 1,
      "WMO/WIS/GISC Tokyo" : 5
   },
   "jp.go.jma.wis.dcpc-sat" : {
      "WMO/WIS/GISC Tokyo" : 1
   },
   "jp.go.jma.wis.dcpc-tcc" : {
      "WMO/WIS/GISC Tokyo" : 4
   },
   "org.wmolc" : {
      "WMO Lead Centre for Long-Range Forecast Multi-Model Ensemble" : 120
   },
   "rs.gov.hidmet" : {
      "South East European Virtual Climate Change Center (SEEVCCC)" : 3
   }
}
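For readers wondering how such a grouping can be produced, here is a minimal sketch. It assumes file identifiers of the form `urn:x-wmo:md:<citation authority>::<unique part>` and puts every identifier that does not match under the empty key, as in the output above. The function names and the sample identifiers are illustrative, and the actual pywiscat code may differ.

```python
import re
from collections import defaultdict

# Assumed identifier layout: urn:x-wmo:md:<citation authority>::<unique part>
WMO_URN = re.compile(r"^urn:x-wmo:md:(?P<authority>[^:]+)::", re.IGNORECASE)


def citation_authority(file_identifier):
    """Return the citation authority, or '' when the identifier is not a WMO URN."""
    match = WMO_URN.match(file_identifier.strip())
    return match.group("authority").lower() if match else ""


def group_counts(records):
    """records: iterable of (file_identifier, organisation_name) pairs."""
    grouped = defaultdict(lambda: defaultdict(int))
    for file_identifier, organisation in records:
        grouped[citation_authority(file_identifier)][organisation] += 1
    return {authority: dict(orgs) for authority, orgs in grouped.items()}


# Hypothetical identifiers, for illustration only
sample = [
    ("urn:x-wmo:md:int.wmo.wis::EXAMPLE01", "Deutscher Wetterdienst"),
    ("urn:x-wmo:md:int.wmo.wis::EXAMPLE02", "Deutscher Wetterdienst"),
    ("not-a-wmo-urn", "University of Hohenheim"),
]
print(group_counts(sample))
```

The leading label of the authority (for example `fr`, `jp` or `int`) could serve as a first approximation for grouping by country, although that is also an assumption.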

@amilan17 amilan17 added this to the noTargetMilestone milestone Apr 6, 2022