Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Use AxisArray for abundances.grid #54

Closed
wants to merge 8 commits into from
Closed

WIP: Use AxisArray for abundances.grid #54

wants to merge 8 commits into from

Conversation

sdl1
Copy link
Member

@sdl1 sdl1 commented Jun 16, 2020

Possible solution to ScottishCovidResponse/SCRCIssueTracking#281

This makes EpiLandscape.grid a multi-D AxisArray view onto EpiLandscape.matrix.

Example usage is in Scotland_run.jl. It sets up an initial_population which is an AxisArray of the appropriate dimension and axes. All the initial setting of infected individuals etc is done on that array, which is then used to populate the initial matrix / grid.

Advantages:

  • Directly use the AxisArray data coming from parse_hdf5 (e.g. don't have to re-write all the age categories)
  • Greatly simplifies interaction with the abundances matrix, makes it more extensible and less error prone. E.g. setting the initial exposed individuals is changed from
age_and_cells = Iterators.product(1:age_categories,
                                  1:size(epi.abundances.matrix, 2))
# Take all susceptibles of each age per cell
pop_weights = epi.abundances.matrix[vcat(cat_idx[:, 1]...), :]
# It would be nice if it wasn't necessary to call collect here
N_cells = size(epi.abundances.matrix, 2)
samp = sample(collect(age_and_cells), weights(1.0 .* vec(pop_weights)), 100)
age_ids = getfield.(samp, 1)
cell_ids = getfield.(samp, 2)

for i in eachindex(age_ids)
    if (epi.abundances.matrix[cat_idx[age_ids[i], 1], cell_ids[i]] > 0)
        # Add to exposed
        epi.abundances.matrix[cat_idx[age_ids[i], 2], cell_ids[i]] += 1
        # Remove from susceptible
        epi.abundances.matrix[cat_idx[age_ids[i], 1], cell_ids[i]] -= 1
    end
end

to the following (could be even shorter if various AxisArray issues were fixed)

susc = @view initial_pop[class=:Susceptible]
exposed = @view initial_pop[class=:Exposed]
# Weight samples by number of susceptibles
samples = sample(CartesianIndices(susc), weights(1.0 .* vec(susc)), 100)
for samp in samples
    susc[samp] > 0 || continue
    # Add to exposed
    exposed[samp] += 1
    # Remove from susceptible
    susc[samp] -= 1
end
  • Makes plotting nicer, since we can "know" about the dimensions to sum over

Disadvantages:

  • AxisArrays is a bit flaky, it has a bunch of issues. So far I didn't find anything insurmountable...
  • Seems to allocate more, I'm not sure why (might be able to improve this). On the Scotland example, @btime simulate_record!:
dev:
  5.525 s (3881312 allocations: 205.22 MiB)

this PR:
  5.946 s (7612784 allocations: 428.20 MiB)
  • Only addresses abundances, doesn't address other things like the transition matrix, etc

@sdl1
Copy link
Member Author

sdl1 commented Jun 19, 2020

Update, I figured out the allocation issue, it was due to a small bug I introduced in the output code. There is now negligible difference from dev

dev:
5.451 s (3881312 allocations: 205.22 MiB)
this PR:
5.465 s (3885216 allocations: 207.39 MiB)

@richardreeve
Copy link
Member

@sdl1 Do you want to resolve the conflicts in this, and then hopefully we can get it merged?

@sdl1
Copy link
Member Author

sdl1 commented Jun 23, 2020

Yep currently going through those. I also need to fix the tests and the other example scripts

@sdl1
Copy link
Member Author

sdl1 commented Jun 23, 2020

As part of this I'm considering making it so that the initial_population is specified as an argument to the EpiSystem constructor, rather than the EpiEnv constructor. I think this is conceptually more correct, and will also make it a lot easier to keep the ability to not specify an initial population explicitly (which is turning out to be a bit tricky in this PR). Does this sound ok @claireh93 ?

@claireh93
Copy link

@sdl1 that sounds good! I think there might be a few things that the EpiEnv relies upon getting from initial_population though, like active grid squares and shrink to fit etc, but I guess those things can happen in the EpiSystem constructor instead?

@sdl1
Copy link
Member Author

sdl1 commented Jun 23, 2020

Yes that should be possible either there or directly on the initial_population array at the beginning.

@jwscook
Copy link

jwscook commented Jul 23, 2020

@sdl1 Can you provide a brain dump of things that need to be done both on the Simulation.jl end and the AxisArrays.jl to get this working even if they are blocking problems on either side? Cheers!

@sdl1
Copy link
Member Author

sdl1 commented Jul 27, 2020

Closing this since it is out of date and requires more work than I originally realised. Below is a general list of things which ideally need to be done to make this work nicely:

  • Update this PR to work with current dev
  • Fix Dot-assignment fails with MethodError: no method matching dotview JuliaArrays/AxisArrays.jl#185
    • Would allow things like initial_pop[class=:Susceptible][samp] .= 0.0 rather than having to create a view first
  • Fix Reductions fail with Unitful axes JuliaArrays/AxisArrays.jl#113
    • Would allow doing reductions directly on the AxisArray when we have Unitful axes (which is almost always)
  • Should probably fix Select by axis values doesn't preserve the order JuliaArrays/AxisArrays.jl#182 because I can imagine it potentially leading to bugs
  • To reshape correctly between N-D grid and 2D matrix, we need to ensure that the dimensions are ordered correctly. Need to figure out how to guarantee this. This is important because otherwise things may run ok but produce nonsense results.
  • Need to figure out how to automatically shrink the grid when it's specified as an N-D array. Or (I think I prefer this) remove the automatic shrinking code, and just make the input HDF5 file cover the right extent.
  • Need to figure out if we want to continue to support older constructors
  • Should factor out the construction of initial_pop into a re-usable function
  • Consider outputting the N-D grid rather than the 2D matrix. Since it is more interpretable and easier to reconstruct.

@sdl1 sdl1 closed this Jul 27, 2020
richardreeve added a commit that referenced this pull request Apr 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants