Avoid excess allocations in buildUpdateMap #757

Merged
merged 1 commit into ORNL:master on Mar 20, 2024
Conversation

mattrjackson
Contributor

First of all, thanks for an excellent library. I use TASMANIAN on almost a daily basis for a combination of uncertainty quantification and surrogate dataset generation.

When building larger surrogate datasets, I noticed that runtimes grew rapidly relative to the number of points added during surplus refinement. For some of my faster-running models, TASMANIAN was accounting for 99%+ of the runtime, and most of that was spent in buildUpdateMap.

I profiled the code and found that the vast majority of the time was spent allocating std::vector<int>, which I tracked down to the allocation of global_to_pnts inside the parallel loop. This PR significantly reduces the number of allocations by moving global_to_pnts outside of the parallel loop and giving each thread its own copy via a firstprivate clause. On my laptop, this has improved performance for grids with millions of points by at least an order of magnitude.
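For readers unfamiliar with the OpenMP pattern, here is a minimal sketch of the idea (a generic illustration, not the actual buildUpdateMap code; the function name and loop body are hypothetical): the scratch vector is constructed once before the parallel region, and the firstprivate clause gives each thread its own copy that is reused across iterations instead of being reallocated on every iteration.

```cpp
#include <vector>

// Hypothetical stand-in for a loop that needs per-point scratch space.
void process_points(int num_points, int num_dimensions) {
    // Before: declaring std::vector<int> global_to_pnts(num_dimensions) inside
    // the loop body forced one heap allocation (and deallocation) per iteration.

    // After: construct the scratch vector once, then let OpenMP copy it into
    // each thread via firstprivate, so every thread reuses its own buffer.
    std::vector<int> global_to_pnts(num_dimensions);

    #pragma omp parallel for firstprivate(global_to_pnts)
    for (int i = 0; i < num_points; i++) {
        for (int d = 0; d < num_dimensions; d++)
            global_to_pnts[d] = i + d; // placeholder work standing in for the real mapping
        // ... use global_to_pnts for point i ...
    }
}
```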

@mkstoyanov
Collaborator

Hey @mattrjackson

It's always great to hear from actual users, especially ones that have deployed Tasmanian in production.

I know Tasmanian has a not-insignificant community, but only a very small group of people interact with me on a regular basis. You're the first person I know of who is using the direction-selective refinement since I developed the method a while back, so the update builder hasn't received the proper attention.

Thanks for the PR; I think I see a few other opportunities for improvement. Roughly speaking, what are the dimension and the number of points for the expensive problem you're solving? I want to run a few benchmarks myself to make sure I actually introduce improvements and not regressions.

mkstoyanov merged commit ef62603 into ORNL:master on Mar 20, 2024
8 checks passed
@mattrjackson
Contributor Author

For the surrogate datasets, I've tried up to six dimensions and about 10M points so far. That limit was initially due to the model we were calling (which is developed by a third party and, unfortunately, is not thread-safe). After making a large number of optimizations to that model, I can now afford to run tens of millions of calculations if need be, but how large the grids grow from here is a bit of a question mark.

As always, fewer points would be better, but the data from the model has sharp discontinuities in some spots, so preserving the features that need to be there without blowing up the grid size has proven challenging.

@mkstoyanov
Collaborator

There are fundamental mathematical challenges when dealing with a discontinuous response. Basically, you cannot have convergence in the "inf" norm between the model and the interpolant, which means that the surpluses do not decay to zero, which in turn blows up the size of the grid.
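As a rough illustration (assuming a local linear hierarchical basis): if the model jumps by $J$ across the discontinuity, then on every level there is a cell that still straddles the jump, and the surplus at the new node in that cell is the mismatch between the model and the coarser interpolant, which stays on the order of the jump rather than decaying,

$$s_{\text{new}} \;=\; f(x_{\text{new}}) - I_{\ell-1}[f](x_{\text{new}}) \;\approx\; \pm\tfrac{J}{2} \quad \text{for every level } \ell,$$

so a surplus-based refinement criterion keeps adding points around the jump no matter how deep the grid goes.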

The first question you need to ask is what it means to approximate the discontinuous model, with the understanding that there will be an area near the discontinuity where the difference remains significant.

The second thing you can try is to force convergence (and reduce the blowup in the number of points) with an added scale_correction. The correction is just a multiplier applied to the surpluses: if you correct by zero, then no refinement will happen near that particular point; if you set it to a small number, then refinement will happen only if the surplus is significantly larger than the tolerance; and a correction with a large number will force refinement even if the surplus is small.

One good choice of correction is the area (support) of the hierarchical basis functions, which you can get from getHierarchicalSupport().
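As a rough sketch of how that could look (assuming the setSurplusRefinement overload that takes a level-limits vector and a scale_correction vector with one entry per point for the selected output, and that getHierarchicalSupport() returns one support value per point and dimension; check the Tasmanian documentation for the exact shapes and enum names):

```cpp
#include "TasmanianSparseGrid.hpp"
#include <vector>

// Sketch: use the volume of each basis function's support as the scale
// correction, so points with tiny support (deep refinement near the
// discontinuity) are down-weighted and stop forcing further refinement.
void refine_with_support_correction(TasGrid::TasmanianSparseGrid &grid,
                                    double tolerance, int output) {
    int num_points = grid.getNumPoints();
    int num_dims   = grid.getNumDimensions();

    // Assumed layout: one support value per point and dimension.
    std::vector<double> support = grid.getHierarchicalSupport();

    // Collapse the per-dimension support into a single per-point volume.
    std::vector<double> correction(num_points, 1.0);
    for (int i = 0; i < num_points; i++)
        for (int d = 0; d < num_dims; d++)
            correction[i] *= support[i * num_dims + d];

    // Assumed signature: tolerance, refinement type, output, level limits,
    // scale correction.
    grid.setSurplusRefinement(tolerance, TasGrid::refine_direction_selective,
                              output, std::vector<int>(), correction);
}
```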

We can set up a Zoom/Teams call next week to discuss some of the details.
