
Fields dump load #1674

Closed · wants to merge 6 commits

Conversation

kkg4theweb (Contributor)

First cut of a PR to add support for dumping and (re)loading the 'fields' state.

Two review comments on src/fields_dump.cpp (outdated, resolved).

@codecov-commenter

Codecov Report

Merging #1674 (80ebafb) into master (89c9349) will decrease coverage by 0.02%.
The diff coverage is 50.00%.

@@            Coverage Diff             @@
##           master    #1674      +/-   ##
==========================================
- Coverage   73.20%   73.17%   -0.03%     
==========================================
  Files          13       13              
  Lines        4515     4519       +4     
==========================================
+ Hits         3305     3307       +2     
- Misses       1210     1212       +2     
Impacted Files          Coverage Δ
python/simulation.py    75.96% <50.00%> (-0.05%) ⬇️

@stevengj (Collaborator) commented Jul 15, 2021

It's fine to write everything to separate files (one per process, say). You'll want to do a couple of things:

  • Modify save_dft_hdf5 to take an additional flag single_parallel_file. If that's false, simply skip the parallel reductions in dft_chunks_Ntotal.
  • In your own code, follow the same two-pass structure: first compute the total size of the data on your process, then create the dataset, then write the chunks into that dataset. Again, if single_parallel_file==true we can do an optional parallel reduction after the first pass; otherwise the code is the same.

Basically, to use HDF5 parallel I/O with a single file you have to create the file and create the dataset collectively, and then each process can call write_chunk as many times as it wants to write its own data into the file (as long as it writes into non-overlapping portions of the file). That's why the save_dft_hdf5 and structure::dump functions are organized the way they are.
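
For illustration, here is a minimal sketch of that two-pass pattern, written with h5py and mpi4py rather than Meep's C++ h5file class. The file names, dataset name, and toy per-process chunks are placeholders (not code from this PR), and it assumes an MPI-enabled HDF5/h5py build.

```python
# Minimal sketch of the two-pass parallel-write pattern (illustrative only).
# Assumes an MPI-enabled h5py/HDF5 build; names and chunk sizes are placeholders.
from mpi4py import MPI
import h5py
import numpy as np

comm = MPI.COMM_WORLD
single_parallel_file = True

# Each process owns a few chunks of data (standing in for its fields chunks).
my_chunks = [np.full(3 + comm.rank, float(comm.rank)),
             np.full(5, float(comm.rank))]

# Pass 1: compute the size of the data on this process; for a single parallel
# file, also reduce to get the global size and this process's starting offset.
my_size = sum(c.size for c in my_chunks)
if single_parallel_file:
    my_start = comm.scan(my_size) - my_size        # exclusive prefix sum
    total_size = comm.allreduce(my_size)           # global dataset size
    f = h5py.File("fields.h5", "w", driver="mpio", comm=comm)
else:
    my_start, total_size = 0, my_size              # one file per process
    f = h5py.File("fields_%d.h5" % comm.rank, "w")

# File and dataset creation are collective in the single-parallel-file case:
# every process passes the same global shape.
dset = f.create_dataset("fields_data", (total_size,), dtype="f8")

# Pass 2: each process writes its own chunks into non-overlapping slices of
# the dataset, analogous to repeated write_chunk calls.
offset = my_start
for c in my_chunks:
    dset[offset:offset + c.size] = c
    offset += c.size

f.close()
```

The essential constraint is the same one behind save_dft_hdf5 and structure::dump: the global dataset shape must be known (hence the reduction after the first pass) before any process starts writing its chunks.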

@stevengj (Collaborator) commented Jul 15, 2021

Possible Python API: to restart a simulation, run the same simulation script, creating the Simulation object as usual with all of the geometry and sources etcetera, but before calling sim.run, first call a sim.load(config) function (that loads the fields, DFT, etcetera).

config would be some kind of argument that says what to load. Maybe a JSON file, or just a dict, whose fields give the various filenames? Alternatively, it could be the name "foo.h5" of a single HDF5 file that contains everything (which maybe in your case is transformed to some other name, e.g. foo__N.h5 for each process N).
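
To make the suggestion concrete, here is a hypothetical usage sketch. sim.load and the config dict keys are the API proposed in this comment, not an existing Meep function, and the geometry/source setup is just the standard tutorial waveguide as a placeholder.

```python
# Hypothetical restart script for the proposed sim.load(config) API.
# sim.load() and the config keys below do not exist in Meep as-is; the geometry,
# sources, and file names are placeholders.
import meep as mp

sim = mp.Simulation(
    cell_size=mp.Vector3(16, 8),
    resolution=10,
    geometry=[mp.Block(mp.Vector3(mp.inf, 1, mp.inf),
                       center=mp.Vector3(),
                       material=mp.Medium(epsilon=12))],
    sources=[mp.Source(mp.ContinuousSource(frequency=0.15),
                       component=mp.Ez,
                       center=mp.Vector3(-7, 0))],
)

# Proposed step: restore previously dumped state before running.  config could
# be a dict of filenames (as here), a JSON file, or a single "foo.h5".
config = {
    "fields": "checkpoint_fields.h5",
    "dft": "checkpoint_dft.h5",
}
sim.load(config)    # hypothetical API suggested in this comment

sim.run(until=200)  # continue the run from the restored fields/DFT state
```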

@stevengj (Collaborator)

Closed by #1738?

@stevengj closed this on Sep 24, 2021