Heap profiling for memory bottlenecks discovery #113

Closed
CPerezz opened this issue Dec 23, 2022 · 5 comments

@CPerezz (Member) commented Dec 23, 2022

Use ad-hoc heap allocation profiling to see how memory is being consumed and how we can optimize it.

See the dhat docs (https://docs.rs/dhat/latest/dhat/) for more info.

CPerezz added the enhancement (New feature or request) label on Dec 23, 2022
@CPerezz (Member, Author) commented Feb 2, 2024

I think @ed255 actually did this and saw the place where memory spikes the most (when we hold both the Evaluation and Coefficient forms of all our polynomials).

And IIRC, I saw that we can't really avoid that (although we can revisit ofc).

@ed255 Can you maybe post a summary here so that we can trigger a discussion on how/whether it is worth solving this?

@ed255 (Member) commented Feb 5, 2024

I used dhat to profile memory usage and make sure that the frontend-backend split didn't increase the memory consumption compared to the legacy halo2. My impressions of the profiler were:

  • Positive: It's very easy to use: you just import it, create the profiler variable, and that's it (a minimal setup sketch follows this list).
  • Negative: Sometimes the report doesn't have the full stack trace of an allocation.
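
For reference, the basic setup is roughly what the dhat docs show (a minimal sketch; the `dhat-heap` feature gate is just the convention suggested there, not something this repo necessarily defines):

```rust
// Minimal dhat heap-profiling setup, following the dhat crate docs.
// Gating it behind a cargo feature keeps normal builds unaffected.
#[cfg(feature = "dhat-heap")]
#[global_allocator]
static ALLOC: dhat::Alloc = dhat::Alloc;

fn main() {
    // Allocations are recorded while `_profiler` is alive; when it is dropped
    // at the end of main, dhat writes `dhat-heap.json`, which can be inspected
    // with the DHAT viewer.
    #[cfg(feature = "dhat-heap")]
    let _profiler = dhat::Profiler::new_heap();

    // ... run keygen / proof creation here ...
}
```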

I didn't analyze the profiling results to see which part of halo2 was using the most memory; I was only paying attention to the comparison between the frontend-backend split and legacy halo2. Nevertheless, on a different occasion I tried to reason about the biggest sources of memory usage in halo2, and here are my conclusions.

Constants:

  • M: number of advice columns
    • NOTE: This number will be larger because of the implicit polynomials required for copy constraints and lookups, so it's more accurate to say M is the number of committed polynomials.
  • n: number of rows (2^k)

Once the circuit is big enough, this is the biggest source of memory usage:

  1. In proof creation we hold the witness matrix in memory. This is M * n * sizeof(Assigned<F>) (see the rough numbers in the sketch after this list).
  2. We convert this to M * n * sizeof(F) in the batch_invert_assigned step
    • This reduces the size, because Assigned<F> is ~ 2 * sizeof(F)
  3. Then these vectors are committed; to do so, they are extended 2^j times (where j depends on the max expression degree of the circuit). I think the extended form is not kept in memory after commitment, but I'm not sure.
  4. After all commitments are done, we perform an evaluation, which needs all the data from point (2).
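
To put rough numbers on points (1) and (2), here is a back-of-the-envelope calculation (the concrete values k = 20, M = 100, and a 32-byte field element are made up for illustration; only the formulas come from the points above):

```rust
// Rough witness-memory estimate for the list above (illustrative numbers only).
fn main() {
    let k = 20u32;          // hypothetical log2 of the number of rows
    let n = 1u64 << k;      // n = 2^k rows
    let m = 100u64;         // hypothetical number of committed polynomials (M)
    let f = 32u64;          // sizeof(F) in bytes for a 256-bit field
    let assigned_f = 2 * f; // Assigned<F> is roughly two field elements

    let step1 = m * n * assigned_f; // witness held as Assigned<F>
    let step2 = m * n * f;          // after batch_invert_assigned
    let gib = |bytes: u64| bytes as f64 / (1u64 << 30) as f64;
    println!("step 1: {:.2} GiB", gib(step1)); // ~6.25 GiB with these numbers
    println!("step 2: {:.2} GiB", gib(step2)); // ~3.13 GiB with these numbers
    // On top of this, step 3 temporarily needs the 2^j-times-larger extended form.
}
```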

In a discussion, @han0110 mentioned that Scroll did a trick to cap the max memory usage by "computing the evaluations on extended domain and quotient chunk by chunk", but I don't fully understand this idea 😅

@jonathanpwang commented

Nice write up!

I believe the optimization Han mentioned is scroll-tech#28 (comment)
We have also observed it greatly reduces memory usage.
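
Roughly, the idea (a toy sketch only, not the code in that PR; every name, number, and the combining expression below is invented) is to never materialize all polynomials on the full extended domain at once, but to evaluate and combine them one chunk of extended-domain points at a time:

```rust
// Toy illustration of "compute the quotient on the extended domain chunk by chunk".
// All names and numbers are made up; the real optimization lives in the linked PR.

const CHUNK: usize = 1 << 14; // hypothetical number of extended-domain points per chunk

/// Stand-in for "evaluate `poly` (coefficient form) at extended-domain points
/// [start, start + len)"; a real prover would use a coset FFT restricted to the chunk.
fn eval_on_extended_chunk(poly: &[u64], start: usize, len: usize) -> Vec<u64> {
    (start..start + len)
        .map(|i| poly[i % poly.len()].wrapping_mul(i as u64 + 1))
        .collect()
}

/// Builds the quotient evaluations chunk by chunk, so at any moment only about
/// `polys.len() * CHUNK` extended-domain values are live instead of
/// `polys.len() * extended_n`.
fn quotient_chunk_by_chunk(polys: &[Vec<u64>], extended_n: usize) -> Vec<u64> {
    let mut quotient = Vec::with_capacity(extended_n);
    let mut start = 0;
    while start < extended_n {
        let len = CHUNK.min(extended_n - start);
        // Evaluate every committed polynomial only on this chunk.
        let chunk_evals: Vec<Vec<u64>> = polys
            .iter()
            .map(|p| eval_on_extended_chunk(p, start, len))
            .collect();
        // Combine the evaluations point by point (a toy sum stands in for the
        // real gate/lookup/permutation expression).
        for i in 0..len {
            quotient.push(chunk_evals.iter().map(|e| e[i]).sum());
        }
        // `chunk_evals` is dropped here, capping the peak memory per chunk.
        start += len;
    }
    quotient
}

fn main() {
    let polys = vec![vec![1u64, 2, 3, 4]; 3];
    let q = quotient_chunk_by_chunk(&polys, 1 << 16);
    println!("computed {} quotient evaluations", q.len());
}
```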

@CPerezz (Member, Author) commented Feb 8, 2024

> Nice write up!
>
> I believe the optimization Han mentioned is scroll-tech#28 (comment) We have also observed it greatly reduces memory usage.

Working on porting this here! Trying to get good benchmarks to make available with the PR.

CPerezz self-assigned this on Feb 8, 2024
iquerejeta pushed a commit to input-output-hk/halo2 that referenced this issue on May 8, 2024: Resolve "Add documentation for Double-and-Add"
@davidnevadoc commented

Superseded by #291
