include discussion relating to arXiv:1411.0507 -- Is HDF5 a good format to replace UVFITS? #1

Open
astronomeralex opened this issue Nov 4, 2014 · 5 comments

Comments

@astronomeralex

I saw this paper on the arXiv and thought it should be covered in the discussion in our paper.

http://arxiv.org/abs/1411.0507

@timj
Contributor

timj commented Nov 5, 2014

That's by @telegraphic

See also http://arxiv.org/abs/1411.0996 which summarises the ADASS BoF.

@telegraphic

Hi @astronomeralex, hi @timj

The fits2hdf work we've done so far has mainly been proof-of-concept. Our hope is that having a concrete example to analyze and critique will be beneficial for the longer-term community discussion (for which it looks like this A&C paper will be an important cornerstone!). It would be great to get some feedback on the implementation if anyone feels like playing around with it.

Our motivation so far has been our own selfish short-term data storage needs, but we are hoping it proves more widely useful. We are planning to run some performance benchmarks and to look into compression (FITS Rice vs. HDF5 options) in more detail, which should be interesting; it could turn out that there are some nasty bottlenecks or limits that we don't yet realize.

I guess in terms of the BoF paper, we are most interested in using HDF5 as a recording format. As the BoF noted, HDF5 isn't going to be a good archive format unless there's a more rigid internal data model that the community agrees upon. The HDFITS model we've implemented in fits2hdf is limited, but its likeness to FITS might ease the transition to a more beautiful data model. HDF5 isn't really an in-memory format, so it probably isn't the best choice for a pipeline/processing format by itself.
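
To make the "likeness to FITS" idea concrete, here is a minimal sketch of one possible FITS-to-hierarchy mapping. It is not the actual HDFITS layout from fits2hdf: the group names (`HDU0`, ...), the `attrs`/`comments`/`DATA` keys, and the helper names `parse_card`, `hdu_to_group` and `fits_to_tree` are all assumptions made for illustration. Nested dicts stand in for HDF5 groups, attributes and datasets so the example runs with the standard library only. The card parser handles only simple `KEYWORD = value / comment` cards and ignores details such as embedded quotes, `D` exponents and complex values.

```python
"""Hypothetical sketch: mapping FITS-style HDUs onto a hierarchy.

Nested dicts stand in for HDF5 groups, attributes and datasets. The layout
used here is an assumption for illustration, not the real HDFITS layout.
"""


def parse_card(card):
    """Split one 80-character header card into (keyword, value, comment).

    Returns None for cards without a value indicator in columns 9-10
    (for example COMMENT, HISTORY, END and blank cards).
    """
    keyword = card[:8].strip()
    if card[8:10] != "= " or not keyword:
        return None
    body = card[10:]
    if body.lstrip().startswith("'"):
        # Quoted string: the value runs up to the next single quote.
        start = body.index("'") + 1
        end = body.index("'", start)
        value = body[start:end].rstrip()
        comment = body[end + 1:].partition("/")[2].strip()
    else:
        raw, _, comment = body.partition("/")
        raw, comment = raw.strip(), comment.strip()
        if raw == "":
            value = None              # keyword present but no value given
        elif raw in ("T", "F"):
            value = raw == "T"        # FITS logical
        else:
            try:
                value = int(raw)
            except ValueError:
                value = float(raw)
    return keyword, value, comment


def hdu_to_group(cards, data):
    """Turn one HDU (header cards plus data) into a group-like dict."""
    attrs, comments = {}, {}
    for card in cards:
        parsed = parse_card(card.ljust(80))
        if parsed is None:
            continue
        keyword, value, comment = parsed
        attrs[keyword] = value        # header keyword -> group attribute
        if comment:
            comments[keyword] = comment
    return {"attrs": attrs, "comments": comments, "DATA": data}


def fits_to_tree(hdus):
    """Build {'HDU0': group, 'HDU1': group, ...} from (cards, data) pairs."""
    return {
        f"HDU{i}": hdu_to_group(cards, data)
        for i, (cards, data) in enumerate(hdus)
    }


# Made-up header cards and pixel values, purely for demonstration.
cards = [
    "SIMPLE  =                    T / conforms to FITS standard",
    "BITPIX  =                   16 / bits per data value",
    "NAXIS   =                    2 / number of axes",
    "TELESCOP= 'EXAMPLE '           / hypothetical telescope name",
    "END",
]
tree = fits_to_tree([(cards, [[1, 2], [3, 4]])])
print(tree["HDU0"]["attrs"])
```

With a real HDF5 library, each dict under `HDU<n>` would presumably become a group, `attrs` its attributes and `DATA` a dataset; where to keep the header comments is exactly the kind of data model decision that would need community agreement.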

Indeed, as noted in the BoF, we should all be careful of making a distinction between a file format and a data model -- if done carefully there's no need to be wedded specifically to HDF5, or even the concept of a 'file' in general. Having said that, HDF5 exists now and is immediately practical, hence the work we've been doing toward using it.
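
As a rough illustration of the file format vs. data model distinction, the sketch below defines a small data model once and stores it through two interchangeable backends, one of which involves no file at all. Everything here is hypothetical: the `Observation` class, its fields, and the two backend classes are invented for the example, and JSON is used only because it ships with Python. An HDF5 backend would simply be a third class with the same `save`/`load` methods.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass, field


@dataclass
class Observation:
    """Hypothetical data model: what the data is, independent of storage."""
    target: str
    frequencies_mhz: list
    visibilities: list
    metadata: dict = field(default_factory=dict)


class JsonFileBackend:
    """Serialises the model to a JSON file on disk."""

    def __init__(self, path):
        self.path = path

    def save(self, obs):
        with open(self.path, "w") as fh:
            json.dump(asdict(obs), fh)

    def load(self):
        with open(self.path) as fh:
            return Observation(**json.load(fh))


class InMemoryBackend:
    """Keeps the model in a plain dict; there is no file involved."""

    def __init__(self):
        self._store = {}

    def save(self, obs):
        self._store = asdict(obs)     # asdict copies nested lists and dicts

    def load(self):
        return Observation(**self._store)


# Made-up values, purely for demonstration.
obs = Observation(
    target="EXAMPLE-SOURCE",
    frequencies_mhz=[150.0, 151.0],
    visibilities=[[1.0, 0.5], [0.9, 0.4]],
    metadata={"observer": "hypothetical"},
)

results = []
with tempfile.TemporaryDirectory() as tmp:
    backends = [JsonFileBackend(os.path.join(tmp, "obs.json")), InMemoryBackend()]
    for backend in backends:
        backend.save(obs)
        # The same model comes back regardless of how it was stored.
        results.append(backend.load() == obs)

print(all(results))
```

The code that uses `Observation` never needs to know which backend is in play, which is the sense in which a carefully defined data model need not be wedded to HDF5 or to files.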

Looking forward to seeing this progress, and would be happy to contribute in any way I can, from the large-N radio telescope instrumentation perspective.

Regards
Danny

@timj
Contributor

timj commented Nov 6, 2014

@telegraphic I've added you to the astrodataformat organization.

@embray

embray commented Nov 6, 2014

I heard about this project from some of my friends who were at ADASS and were impressed by it. I feel like it would also be great if some of this work could be fed back into Astropy, which also has HDF5 <=> FITS conversion capabilities, at least for tables (and for arrays, eventually).

@timj
Contributor

timj commented Nov 6, 2014

I sent @telegraphic an HDF5 file that was generated by the Starlink CONVERT package fits2ndf command (now that I can make the Starlink software use HDF5 behind the scenes). Obviously there are many ways to map FITS to a hierarchical format. It's not obvious that we can converge: the NDF data model that CONVERT uses for the output HDF5 file carries a lot of data model structure, which may be a step too far for a different conversion tool with a different audience. That said, if the tool does match the NDF data model, then all the Starlink software would work with it without any changes.


4 participants