
chore: Remove ValueIds from SSA #6850

Closed
wants to merge 38 commits into from

Conversation

jfecher
Contributor

@jfecher jfecher commented Dec 17, 2024

Description

Problem*

Summary*

Removes ValueIds from SSA to reduce memory usage and hopefully improve compiler performance as well.

  • ValueIds are replaced by the Value enum itself, which is now thinner and implements Copy.
  • Several functions for iterating over values (block_parameters, instruction_results) no longer require references to the DFG, so several passes need less cloning.
  • Relatedly, values can now often be created without referencing the DFG at all. For example, Value::constant needs only a field and a type, since we no longer need a separate map to keep each ValueId unique to each constant value. Similarly, Value::Instruction needs only an instruction id and a position, so instruction result values no longer need to be stored in an external map in the DFG. This lets us remove quite a few fields from the DFG to save memory.
  • DFG::set_value_from_id can no longer mutate the value a ValueId points to, and instead relies solely on the internal map between values.
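A minimal Rust sketch of the shape described above (all names and field widths here are illustrative assumptions, not the actual noir types):

```rust
use std::mem::size_of;

// Hypothetical index types; the real compiler's ids may differ.
#[derive(Copy, Clone, PartialEq, Eq, Hash, Debug)]
struct InstructionId(u32);

#[derive(Copy, Clone, PartialEq, Eq, Hash, Debug)]
struct BlockId(u32);

// The value itself is a small `Copy` enum: constructing one needs no
// access to the DFG, and no external map keeps ids unique per constant.
#[derive(Copy, Clone, PartialEq, Eq, Hash, Debug)]
enum Value {
    // A constant is just its (interned) field plus a type tag.
    NumericConstant { constant: u32, typ: u32 },
    // An instruction result is the instruction plus the result position,
    // replacing an external results map in the DFG.
    Instruction { instruction: InstructionId, position: u32 },
    // A block parameter is its block plus the parameter index.
    Param { block: BlockId, position: u32 },
}

fn main() {
    let a = Value::Instruction { instruction: InstructionId(0), position: 0 };
    let b = a; // `Copy`: no clone needed, no DFG borrow held
    assert_eq!(a, b);
    println!("size_of::<Value>() = {} bytes", size_of::<Value>());
}
```

Because every variant carries its full identity inline, iterating results or parameters can hand out values directly instead of ids that must be resolved through the DFG.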

Additional Context

The integration tests are currently all passing, but there is a stack overflow somewhere within the test runner when running the unit tests, in between running the actual tests.

I have not yet migrated the SSA parser tests in the evaluator; these are currently expected to fail since the display of ValueIds has changed.

Documentation*

Check one:

  • No documentation needed.
  • Documentation included in this PR.
  • [For Experimental Features] Documentation to be submitted in a separate PR.

PR Checklist*

  • I have tested the changes locally.
  • I have formatted the changes with Prettier and/or cargo fmt on default settings.

Contributor

github-actions bot commented Dec 18, 2024

Changes to Brillig bytecode sizes

Generated at commit: aedd79c6ac9e858c6678599c1f3a4de7457c7039, compared to commit: ee0754b1c6b36961c180901db59dd593c183de77

🧾 Summary (10% most significant diffs)

| Program | Brillig opcodes (+/-) | % |
|---|---|---|
| to_le_bytes | +10 ❌ | +8.20% |
| brillig_nested_arrays | -17 ✅ | -9.34% |

Full diff report 👇
| Program | Brillig opcodes (+/-) | % |
|---|---|---|
| to_le_bytes | 132 (+10) | +8.20% |
| hashmap | 21,810 (-3) | -0.01% |
| uhashmap | 14,016 (-3) | -0.02% |
| sha256_regression | 6,917 (-3) | -0.04% |
| array_dynamic_nested_blackbox_input | 894 (-7) | -0.78% |
| array_sort | 293 (-3) | -1.01% |
| nested_array_in_slice | 1,201 (-31) | -2.52% |
| nested_array_dynamic | 2,349 (-108) | -4.40% |
| brillig_nested_arrays | 165 (-17) | -9.34% |

Contributor

github-actions bot commented Dec 18, 2024

Changes to number of Brillig opcodes executed

Generated at commit: aedd79c6ac9e858c6678599c1f3a4de7457c7039, compared to commit: ee0754b1c6b36961c180901db59dd593c183de77

🧾 Summary (10% most significant diffs)

| Program | Brillig opcodes (+/-) | % |
|---|---|---|
| nested_array_dynamic | -72 ✅ | -2.08% |
| brillig_nested_arrays | -12 ✅ | -7.74% |

Full diff report 👇
| Program | Brillig opcodes (+/-) | % |
|---|---|---|
| uhashmap | 145,608 (-3) | -0.00% |
| sha256_regression | 128,310 (-3) | -0.00% |
| hashmap | 54,202 (-3) | -0.01% |
| array_dynamic_nested_blackbox_input | 4,724 (-6) | -0.13% |
| array_sort | 523 (-3) | -0.57% |
| nested_array_in_slice | 1,540 (-24) | -1.53% |
| nested_array_dynamic | 3,392 (-72) | -2.08% |
| brillig_nested_arrays | 143 (-12) | -7.74% |

Contributor

github-actions bot commented Dec 18, 2024

Changes to circuit sizes

Generated at commit: aedd79c6ac9e858c6678599c1f3a4de7457c7039, compared to commit: ee0754b1c6b36961c180901db59dd593c183de77

🧾 Summary (10% most significant diffs)

| Program | ACIR opcodes (+/-) | % | Circuit size (+/-) | % |
|---|---|---|---|---|
| conditional_1 | +7,941 ❌ | +88.68% | +11,484 ❌ | +62.87% |
| regression | +174 ❌ | +54.55% | +234 ❌ | +6.07% |

Full diff report 👇
| Program | ACIR opcodes (+/-) | % | Circuit size (+/-) | % |
|---|---|---|---|---|
| conditional_1 | 16,896 (+7,941) | +88.68% | 29,749 (+11,484) | +62.87% |
| regression | 493 (+174) | +54.55% | 4,088 (+234) | +6.07% |
| sha256_var_size_regression | 22,481 (+512) | +2.33% | 80,526 (+656) | +0.82% |
| hashmap | 53,898 (+80) | +0.15% | 119,241 (+160) | +0.13% |
| regression_5252 | 32,919 (+3) | +0.01% | 44,528 (+6) | +0.01% |
| sha2_byte | 15,938 (-2) | -0.01% | 82,672 (-3) | -0.00% |
| sha256 | 2,118 (-2) | -0.09% | 25,415 (-3) | -0.01% |
| array_dynamic_nested_blackbox_input | 251 (-2) | -0.79% | 7,324 (-2) | -0.03% |
| nested_array_in_slice | 1,026 (-12) | -1.16% | 5,571 (-18) | -0.32% |
| nested_array_dynamic | 3,484 (-30) | -0.85% | 12,990 (-43) | -0.33% |

Contributor

github-actions bot commented Dec 18, 2024

Peak Memory Sample

| Program | Peak Memory | % |
|---|---|---|
| keccak256 | 75.84M | -4% |
| workspace | 123.46M | 0% |
| regression_4709 | 393.70M | -8% |
| ram_blowup_regression | 2.34G | 100% |
| private-kernel-tail | 191.47M | -8% |
| private-kernel-reset | 613.88M | -15% |
| private-kernel-inner | 265.05M | -10% |
| parity-root | 162.11M | -6% |

Contributor

github-actions bot commented Dec 18, 2024

Compilation Sample

| Program | Compilation Time | % |
|---|---|---|
| sha256_regression | 0m1.356s | -7% |
| regression_4709 | 0m0.793s | -3% |
| ram_blowup_regression | 0m18.656s | 23% |
| rollup-base-public | 3m5.800s | -12% |
| rollup-base-private | 2m56.996s | -6% |
| private-kernel-tail | 0m1.139s | -3% |
| private-kernel-reset | 0m8.381s | 13% |
| private-kernel-inner | 0m2.034s | -27% |
| parity-root | 0m0.926s | -7% |
| noir-contracts | 2m52.240s | 4% |

@jfecher
Contributor Author

jfecher commented Dec 18, 2024

Initial results are quite poor. I think this is because FieldElement is 256 bits, which makes Value 320 bits.
I'll re-add constant ids, which should bring Value down to 96 bits. If we could somehow embed the enum tag in the value, we could get it down to 64.
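The size arithmetic above can be reproduced with a small sketch. The types below are stand-ins, and `repr(Rust)` layout is not guaranteed, so the exact padded sizes are an assumption about typical layout:

```rust
use std::mem::size_of;

// Stand-in for a 256-bit field element stored inline in the enum.
#[derive(Copy, Clone)]
struct FieldElement([u64; 4]);

// Inline constants: the largest variant is 256 bits of field element
// plus a type tag plus the discriminant, padded to 8-byte alignment,
// which typically lands at 320 bits.
#[derive(Copy, Clone)]
enum FatValue {
    Constant { constant: FieldElement, typ: u32 },
    Instruction { instruction: u32, position: u32 },
}

// Re-adding interned constant ids: every variant is two u32s plus the
// discriminant, typically 96 bits. Packing the tag into spare bits of
// one field is what would be needed to reach 64.
#[derive(Copy, Clone)]
struct ConstantId(u32);

#[derive(Copy, Clone)]
enum ThinValue {
    Constant { constant: ConstantId, typ: u32 },
    Instruction { instruction: u32, position: u32 },
}

fn main() {
    println!("FatValue:  {} bits", size_of::<FatValue>() * 8);
    println!("ThinValue: {} bits", size_of::<ThinValue>() * 8);
}
```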

@TomAFrench TomAFrench linked an issue Dec 18, 2024 that may be closed by this pull request
Contributor

github-actions bot commented Dec 18, 2024

Execution Sample

| Program | Execution Time | % |
|---|---|---|
| sha256_regression | 0m0.640s | 1% |
| regression_4709 | 0m0.400s | 1% |
| ram_blowup_regression | 0m4.479s | 2% |
| rollup-base-public | 0m31.034s | 10% |
| rollup-base-private | 0m28.929s | 10% |
| private-kernel-tail | 0m0.755s | 0% |
| private-kernel-reset | 0m1.574s | 7% |
| private-kernel-inner | 0m0.968s | -3% |
| parity-root | 0m0.537s | -3% |

@jfecher
Contributor Author

jfecher commented Dec 18, 2024

Value's size has been reduced, although I'm unsure why the changes aren't more dramatic. I'd have thought removing the instruction_results map alone would be a decent improvement, and we're also cloning Vecs less throughout. I'll have to profile this more before fixing the SSA parser, but as-is this may not be worth merging given the regression in ram_blowup_regression.

@TomAFrench
Member

Potentially an increased size of the replaced values map due to holding the actual values now?

@jfecher
Contributor Author

jfecher commented Dec 19, 2024

Yes, but even then the increase is only from 32 to 64 bits which I'd expect to be offset by the other removals in this PR.
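The exchange above can be made concrete with a hedged sketch (the types are illustrative, not the actual compiler's): the replaced-values map previously stored a small 32-bit id on each side, and now stores the value enum itself.

```rust
use std::mem::size_of;

// Old scheme: the replacement map keyed 32-bit ids to 32-bit ids.
#[derive(Copy, Clone, PartialEq, Eq, Hash)]
struct ValueId(u32);

// New scheme: the map holds actual values, which are larger.
#[derive(Copy, Clone, PartialEq, Eq, Hash)]
enum Value {
    Instruction { instruction: u32, position: u32 },
    Param { block: u32, position: u32 },
}

fn main() {
    let old_entry = size_of::<(ValueId, ValueId)>();
    let new_entry = size_of::<(Value, Value)>();
    // Each map entry grows, but by a bounded amount that the other
    // removals in the PR were expected to offset.
    println!("old entry: {old_entry} bytes, new entry: {new_entry} bytes");
    assert!(new_entry > old_entry);
}
```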

@jfecher
Contributor Author

jfecher commented Dec 19, 2024

From manually timing programs, this PR compiles them roughly 5-8% faster than master.

@jfecher
Contributor Author

jfecher commented Dec 19, 2024

After some more perf testing, I'm going to close this PR. Testing against master on rollup-base-private, this PR compiles in ~90s compared to master's ~60s. I'm unsure whether this stems from the same regression as conditional_1, but this PR has already been enough of a rabbit hole that I'd rather close it.

@jfecher jfecher closed this Dec 19, 2024
Development

Successfully merging this pull request may close these issues.

Avoid Vec allocations for single-result Instructions