[Feature Request] HDF5 / NetCDF support #7947

Open
CaptainSifff opened this issue Jun 16, 2021 · 7 comments

Labels
  • Feature: File Upload & Handling
  • pm.netcdf-hdf5.d (All 3 aims are currently under this deliverable)
  • Type: Suggestion (an idea)
  • User Role: Depositor (Creates datasets, uploads data, etc.)

Comments

@CaptainSifff

CaptainSifff commented Jun 16, 2021

Since Dataverse already supports the domain-specific format *.fits, would it be possible to add support for HDF5 (https://www.hdfgroup.org/solutions/hdf5/)?
The file format is very flexible, so a first step would be to support just the tabular-like usage, as is done in NetCDF4 (https://en.wikipedia.org/wiki/NetCDF).

Thinking a bit forward, I would like to be able to browse the content of the container like I can browse the files that make up a dataset.

@CaptainSifff
Author

As a side note, there are also other standardized formats built on top of HDF5 that include metadata:
https://www.nexusformat.org/

@pdurbin
Member

pdurbin commented Oct 3, 2022

@CaptainSifff thanks for creating this issue.

> a first step would be to just support the tabular-like usage as it is done in NetCDF4

@atrisovic just pointed me toward an example of a .nc4 file at https://github.com/energy-policy-institute-uchicago/xarray-notebooks/blob/master/xarray-basics.ipynb

It has a nice diagram of the file format:

[Image: diagram of the NetCDF file format]

Related:

@pdurbin
Member

pdurbin commented Nov 28, 2022

@CaptainSifff I was just talking to @atrisovic about your ideas and we'd like to interview you! 😄

When you have a minute can you please pop in https://chat.dataverse.org so we can schedule a time? Thanks!

@pdurbin
Member

pdurbin commented Feb 14, 2023

@CaptainSifff thanks for meeting with me and @atrisovic a while back! We recently published a NetCDF/HDF5 design doc and we'd love your feedback!

https://docs.google.com/document/d/1Ax_sMdgx5ROkIBA7-IC4_hySvgXkk6O8qTZLIvWWnqE/edit?usp=sharing

We'd also love feedback from others reading this. Thanks!

@mreekie added the pm.netcdf-hdf5.d label (All 3 aims are currently under this deliverable) on Mar 30, 2023
@mreekie

mreekie commented Mar 30, 2023

grooming

  • Looking at the NetCDF deliverables.
  • This looks like an issue related to active work, but it may not be directly actionable.
  • Not going to add it to the queue, but tagging it.

@CaptainSifff
Author

@pdurbin I looked over the Google Doc and you are making great progress there; I think the geosciences will appreciate this effort. I currently think it's a bit NetCDF-centric, but that's OK for a start. The next stop would likely be something like the Nexus format mentioned above, which utilizes HDF5.

@pdurbin
Member

pdurbin commented May 16, 2023

@CaptainSifff thanks for the feedback.

With regard to HDF5, I'm not sure how closely you've been following @JR-1991's amazing work on incorporating H5Web as an external tool with Dataverse:

I just tried it with a random Nexus file I found at https://github.com/nexusformat/exampledata/blob/eae516807ef7e27d1c45aab3af3a64a679154677/IPNS/LRMECS/hdf5/lrcs3701.nx5

Here's how it looks:

[Screenshot: the Nexus file opened in H5Web]

At the moment anyway, you can play around with it here: https://dev1.dataverse.org/file.xhtml?fileId=1069&version=1.0

We've been talking about H5Web here: https://dataverse.zulipchat.com/#narrow/stream/376593-geospatial/topic/plot.20arrays.20from.20HDF5

I hope H5Web helps a bit with what you were saying at the start: "I would like to be able to browse the content of the container like I can browse the files that make up a dataset."

Dataverse treats that Nexus file as an HDF5 file, which means the NcML preview is shown as well:

[Screenshot: NcML preview of the Nexus file]
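
A Nexus file like the one above is stored in an HDF5 container, which is why generic HDF5 handling can apply to it. The following is a minimal sketch, not Dataverse's actual detection code, of how a container format might be guessed from a file's leading bytes. The function name `sniff_container_format` and the returned labels are hypothetical; the byte patterns are the published HDF5 signature and the NetCDF classic magic numbers.

```python
import os

# Published 8-byte HDF5 signature that starts the superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"

# NetCDF classic files start with b"CDF" followed by a version byte.
NETCDF_CLASSIC_VERSIONS = {
    1: "NetCDF classic",
    2: "NetCDF 64-bit offset",
    5: "NetCDF 64-bit data (CDF-5)",
}


def sniff_container_format(path):
    """Guess the container format of a file from its leading bytes.

    Hypothetical helper for illustration only. Returns one of the
    NetCDF classic labels, "HDF5", or "unknown".
    """
    with open(path, "rb") as f:
        head = f.read(8)
        if (
            head[:3] == b"CDF"
            and len(head) >= 4
            and head[3] in NETCDF_CLASSIC_VERSIONS
        ):
            return NETCDF_CLASSIC_VERSIONS[head[3]]

        # The HDF5 superblock may sit at offset 0, 512, 1024, 2048, ...
        size = os.fstat(f.fileno()).st_size
        offset = 0
        while offset + len(HDF5_SIGNATURE) <= size:
            f.seek(offset)
            if f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return "HDF5"
            offset = 512 if offset == 0 else offset * 2
    return "unknown"
```

Under this sketch, a NetCDF-4 file and a Nexus `.nx5` file would both be reported as "HDF5", because both use HDF5 as their storage layer. Telling them apart would require opening the container and inspecting its groups and attributes.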

While I'm writing, @CaptainSifff how do you feel about closing this issue now that there is at least a little HDF5 and NetCDF support in Dataverse? You can preview our docs for our upcoming 5.14 release at https://preview.guides.gdcc.io/en/develop/user/dataset-management.html#netcdf-and-hdf5 and here's a screenshot:

[Screenshot: NetCDF and HDF5 section of the 5.14 User Guide preview]

My thinking is that you (and others) can create smaller issues about adding this or that feature (maybe Nexus support). This works best for us because we can better estimate small issues to be included in a sprint.
