Very Low Memory Mapmaking Support #684

Open
tskisner opened this issue Aug 2, 2023 · 3 comments

@tskisner
Member

tskisner commented Aug 2, 2023

Currently, the MapMaker operator assumes that the entire detector timestream data volume is in memory. With a small change, it could instead make maps while loading one observation at a time and making several passes through the data. The proposed design is:

  • The starting Data object only needs to have the telescope pointing information and other auxiliary information for all observations.
  • Add optional loader and preprocess traits to the MapMaker class. The loader operator should populate detector data in observations one at a time when its exec() method is called, and should also provide "rewind()" and "purge()" (or similar) methods.
  • For testing, a simple loader operator can be created that just loops over an existing Data object.
  • The loader and preprocess traits are passed to the lower-level operators.
  • The right-hand side (RHS) of the template solver can be accumulated with 2 passes through the data.
  • When solving for the template amplitudes, one detector at a time is processed as usual.
  • When making the final binned map, one additional pass is made through the data.
  • In both the RHS construction and the final binning, the original preprocess operator is passed to the existing preprocess trait of the binning operator.
  • When running a filter and bin workflow, a Pipeline containing the filtering operators can be passed as the preprocess trait.
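
The loader protocol and the multi-pass accumulation outlined above can be illustrated with a small toy sketch. Nothing here is TOAST code: `ListLoader`, `one_pass`, `bin_map`, and `offset_rhs` are hypothetical names, observations are plain dicts, the "template" is assumed to be a single offset per observation, and the noise weighting is assumed uniform. It only shows how `exec()`, `rewind()`, and `purge()` could keep one observation in memory while the RHS is built in two passes.

```python
# Toy sketch only: none of these names are TOAST APIs.

class ListLoader:
    """Hypothetical test loader: serves signal from an existing in-memory
    store, one observation at a time."""

    def __init__(self, store):
        self._store = store        # dict: observation name -> list of samples
        self._names = list(store)
        self._next = 0
        self.loaded = None         # observation that currently holds signal

    def rewind(self):
        """Restart iteration at the first observation."""
        self._next = 0

    def purge(self, data):
        """Drop detector data from the observation that currently holds it."""
        if self.loaded is not None:
            data[self.loaded].pop("signal", None)
            self.loaded = None

    def exec(self, data):
        """Populate signal for the next observation and return its name,
        or return None once every observation has been visited."""
        self.purge(data)
        if self._next >= len(self._names):
            return None
        name = self._names[self._next]
        data[name]["signal"] = list(self._store[name])  # fresh copy each pass
        self.loaded = name
        self._next += 1
        return name


def one_pass(data, loader, visit, preprocess=None):
    """Stream once through all observations, holding one in memory."""
    loader.rewind()
    while True:
        name = loader.exec(data)
        if name is None:
            break
        obs = data[name]
        if preprocess is not None:
            preprocess(obs)  # e.g. a filtering step, re-applied on every pass
        visit(name, obs)


def bin_map(data, loader, npix, preprocess=None):
    """One pass: accumulate a hit-averaged binned map."""
    total = [0.0] * npix
    hits = [0] * npix

    def visit(name, obs):
        for pix, val in zip(obs["pixels"], obs["signal"]):
            total[pix] += val
            hits[pix] += 1

    one_pass(data, loader, visit, preprocess)
    return [t / h if h else 0.0 for t, h in zip(total, hits)]


def offset_rhs(data, loader, npix, preprocess=None):
    """Two passes: bin the data, then project the residual (signal minus
    binned map) onto one offset template per observation."""
    binned = bin_map(data, loader, npix, preprocess)      # pass 1
    rhs = {}

    def visit(name, obs):
        rhs[name] = sum(
            val - binned[pix] for pix, val in zip(obs["pixels"], obs["signal"])
        )

    one_pass(data, loader, visit, preprocess)             # pass 2
    return rhs


# The starting data holds only pointing; signal lives behind the loader.
data = {"obs0": {"pixels": [0, 1, 1]}, "obs1": {"pixels": [0, 1, 2]}}
store = {"obs0": [1.0, 2.0, 2.0], "obs1": [3.0, 4.0, 5.0]}
loader = ListLoader(store)
rhs = offset_rhs(data, loader, npix=3)
final_map = bin_map(data, loader, npix=3)                 # one additional pass
```

In this sketch the preprocess callable runs on a freshly loaded copy during each pass, which mirrors the idea of handing a filtering pipeline to the preprocess trait. The amplitude solve between the RHS construction and the final binning is omitted.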
@tskisner tskisner self-assigned this Aug 2, 2023
@Ankurdev-astro

This would be an excellent feature. Is there also an existing or planned feature to write the detector timestream data to disk instead of holding it all in memory, so that the user can later make maps from the data on disk one observation at a time?

@tskisner
Member Author

For simulated data, one can already use the SaveHDF operator to write per-observation HDF5 files with the contents from memory. For real data, there can sometimes be overhead loading data from disk due to file formats (for example, unpacking frame-based data or re-ordering data that is stored as detector values for each sample). In that case it might also be desirable to load raw data and write to HDF5 if the intention is to repeatedly read data from disk.

@Ankurdev-astro

That's great, thanks!
