MARISCO

The IAEA Marine Radioactivity Information System (MARIS) provides open access to radioactivity measurements in marine environments. Developed by the IAEA Marine Environmental Laboratories in Monaco, MARIS offers data on seawater, biota, sediment, and suspended matter.

This Python package includes command-line tools to convert MARIS datasets into NetCDF or .csv formats, enhancing compatibility with various scientific and data analysis software.

Core Concept: Handlers

marisco is built around the concept of handlers - specialized modules designed to convert MARIS datasets into NetCDF format. Each handler is tailored to a specific data provider and implemented as a dedicated Jupyter notebook.

Literate Programming Approach

We’ve adopted a Literate Programming approach, which means:

  1. Documentation: Each handler serves as comprehensive documentation.
  2. Code Reference: The notebooks contain the actual implementation code.
  3. Communication Tool: They facilitate discussions with data providers about discrepancies or inconsistencies.

Powered by nbdev

To achieve this, we leverage nbdev, a powerful tool that allows us to:

  1. Write code within Jupyter notebooks
  2. Automatically export relevant parts as dedicated Python modules

This approach bridges the gap between documentation and implementation, ensuring they remain in sync.
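
As a rough illustration of that workflow, the sketch below shows what two notebook cells might look like. The `#| default_exp` and `#| export` comment directives are nbdev's; the module name `handlers.example` and the function `normalize_columns` are hypothetical and are not part of marisco.

```python
# --- Notebook cell 1 ---
#| default_exp handlers.example
# The directive above names the module that receives exported cells
# (hypothetical module name, for illustration only).

# --- Notebook cell 2 ---
#| export
def normalize_columns(names: list[str]) -> list[str]:
    """Lower-case column names and replace spaces with underscores.

    Hypothetical helper: it only illustrates a cell marked for export.
    """
    return [name.strip().lower().replace(" ", "_") for name in names]
```

Cells without an `#| export` directive stay in the notebook as documentation or exploratory code and are not written to the module.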

See It in Action

For a concrete example of this approach, check out our HELCOM dataset handler implementation.

Please note that this project is still under development.

We have implemented the MARIS Legacy handler to convert all existing datasets from the MARIS master database into NetCDF format. For datasets that are frequently updated, such as HELCOM, OSPAR, and TEPCO/Fukushima-related datasets, individual handlers are currently being developed and will be available soon.

Install

To install marisco, run:

pip install marisco

Once successfully installed, run the following command:

maris_init

This command:

  1. creates a .marisco/ directory in your home directory to hold the configuration files listed below;
  2. creates a configs.toml file containing default but configurable settings (default paths, …);
  3. downloads several MARIS DB nomenclature/lookup tables into the .marisco/lut/ directory;
  4. downloads maris-template.nc, the MARIS NetCDF4 template.
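
To confirm that initialization worked, a small check along these lines may help. It is a sketch under one assumption: that configs.toml and maris-template.nc sit directly under .marisco/ (the text above only states the location of the lut/ directory). The function name `check_marisco_setup` is hypothetical.

```python
from pathlib import Path

# Assumed layout: configs.toml and maris-template.nc directly under .marisco/,
# lookup tables under .marisco/lut/. Adjust if your installation differs.
EXPECTED = ("configs.toml", "lut", "maris-template.nc")


def check_marisco_setup(base: Path) -> dict[str, bool]:
    """Report which expected entries exist under the given .marisco directory."""
    return {name: (base / name).exists() for name in EXPECTED}


if __name__ == "__main__":
    status = check_marisco_setup(Path.home() / ".marisco")
    for name, found in status.items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If any entry is reported missing, re-running maris_init is the first thing to try.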

Zotero API key

During conversion, marisco automatically retrieves the bibliographic metadata of each MARIS dataset from Zotero. For this to work, set the environment variable ZOTERO_API_KEY to the MARIS Zotero API key. Please contact the MARIS team to get your API key.
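
Before starting a long conversion, it can be useful to verify that the variable is actually set. The sketch below does only that; `get_zotero_api_key` is a hypothetical helper, not a marisco function, and it does not contact Zotero or validate the key.

```python
import os


def get_zotero_api_key(env=None) -> str:
    """Return ZOTERO_API_KEY, or raise a clear error if it is unset or empty."""
    # Accept an explicit mapping so the check can be exercised without
    # touching the real environment; default to os.environ.
    env = os.environ if env is None else env
    key = env.get("ZOTERO_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ZOTERO_API_KEY is not set; contact the MARIS team to get a key."
        )
    return key
```

In a POSIX shell, the variable is typically set with `export ZOTERO_API_KEY=<your key>` before running the conversion.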

Getting started

Command line utilities

Every command accepts a -h argument that prints its documentation.

maris_init

Download the configuration file, the MARIS NetCDF template, and the required lookup tables (nomenclatures).

maris_netcdfy

Encode a MARIS dataset as NetCDF.

Positional arguments:

  • handler_name: Handler’s name (e.g., helcom, …)
  • src: Path to dataset to encode
  • dest: Path to converted NetCDF4

Example:

maris_netcdfy helcom _data/accdb/mors/csv _data/output/helcom.nc
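
The same call can be scripted from Python, for instance to run a conversion as part of a larger pipeline. This is a minimal sketch that shells out to the maris_netcdfy command shown above; the helper names are hypothetical, and creating the output directory beforehand is a precaution, not something the command is known to require.

```python
import subprocess
from pathlib import Path


def build_netcdfy_command(handler_name: str, src: str, dest: str) -> list[str]:
    """Assemble the three positional arguments in the documented order."""
    return ["maris_netcdfy", handler_name, src, dest]


def run_netcdfy(handler_name: str, src: str, dest: str) -> None:
    """Run the conversion and raise CalledProcessError if it fails."""
    # Precaution (assumption): make sure the destination directory exists.
    Path(dest).parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_netcdfy_command(handler_name, src, dest), check=True)


if __name__ == "__main__":
    run_netcdfy("helcom", "_data/accdb/mors/csv", "_data/output/helcom.nc")
```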

Development

The MARIS NetCDF template is generated from the nbs/files/cdl/maris.cdl Common Data Language (CDL) file, a format defined by Unidata. To generate the MARIS NetCDF template nbs/files/nc/maris-template.nc, first install the NetCDF-C utilities. Then, from the marisco home directory, run:

ncgen -4 -o nc/maris-template.nc cdl/maris.cdl
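
When regenerating the template from a script, it may be worth checking that ncgen is installed before calling it. The sketch below wraps the exact command above; the function names are hypothetical, and the default paths are simply the ones used in that command, so they are relative to the directory the script runs from.

```python
import shutil
import subprocess


def ncgen_command(cdl_path: str, nc_path: str) -> list[str]:
    """Mirror `ncgen -4 -o <nc_path> <cdl_path>` as an argument list."""
    return ["ncgen", "-4", "-o", nc_path, cdl_path]


def generate_template(
    cdl_path: str = "cdl/maris.cdl",
    nc_path: str = "nc/maris-template.nc",
) -> None:
    """Generate the NetCDF template, failing early if ncgen is unavailable."""
    if shutil.which("ncgen") is None:
        raise FileNotFoundError(
            "ncgen not found on PATH; install the NetCDF-C utilities first."
        )
    subprocess.run(ncgen_command(cdl_path, nc_path), check=True)
```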