Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: clearer Synthetic PoRep separation #1720

Merged
merged 4 commits into from
Dec 1, 2023
Merged

Conversation

vmx
Copy link
Contributor

@vmx vmx commented Sep 22, 2023

The Synthetic PoRep code is interleaved with the original PoRep code. This commit refactors the code, so that it's clearer what the differences between Synthetic and Non-Synthetic PoRep is, without the need to read through large parts of the code, looking for if-statements.

It also becomes more apparent in the code, that there are two "modes" of the prove_layers function in case of the Synthetic PoRep. One generation step in case there's no seed and one extracting step when there's a seed.

There no longer is a fallback in case from Synthetic PoRep to Non-Synthetic PoRep in case the Synthetic PoRep Proofs file cannot be read, but when the labels are set. I consider that a feature as if this case happens, your system likelt has some problems, so it's better to error early.

@vmx
Copy link
Contributor Author

vmx commented Oct 2, 2023

As this PR wasn't reviewed yet, I did dare to rebase it. I also pushed a new commit. It's easiest reviewed if you look at those two commits separately. The second one is again easiest with whitespace only changes hidden: 4f46216?diff=split&w=1

t_aux.synth_proofs_path(),
partition_count,
)
.map_err(|error| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case where Synth-PoRep is enabled and the challenge seed is set, but synthetic challenge proofs do not exist on disk, I think the fallback behavior should be to run normal PoRep (rather than return an error). In the event where, while waiting for the challenge seed on chain, a user's synthetic challenge proofs were moved/removed from their expected location, returning an error here would run in an endless loop (assuming the user were to rerun prove_layers following the read synth-proofs error).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a user runs a Synthetic PoRep, you would do that in order to free space. So you would remove the tree files/layers once the proofs are generated. If you move/remove those proofs and fallback to non-Synthetic PoRep, it would likely fail, because the original files are not there anymore. I would find that behaviour more confusing/hard to follow that failing early on the Synthetic PoRep code path.

This is kind of a discussion, whether we want "fail early" behaviour, which I prefer as I find it easier to debug/finding root cases if it happend. Vs. "best effort", which might then make things fail later on in unexpected ways.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think that it's totally debatable, and that the error makes sense too. Do you you have an opinion about falling back to standard PoRep only when the SDR layers weren't deleted? Or do you think an error should be returned in that case too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you you have an opinion about falling back to standard PoRep only when the SDR layers weren't deleted? Or do you think an error should be returned in that case too?

That might be a good middle ground. Though if it would be me, I think would still return early as I really see it as some misconfiguration/user error which under normal operations should never happen. And If someone did something wrong, they probably want to find out about it early. I'm also thinking about potential bug reports, which might be easier to tackle with early failure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with erroring as it would force a user to immediately diagnose and fix the issue. My thinking for why standard PoRep should be run if the labels are still present on disk (with a logged warning saying "synth-porep was not run") is mainly because I think that re-sealing/labeling should be avoided when possible. However, if that fallback behavior is repeatedly taken, a user may be mislead into thinking that they are using synth-porep when in reality they aren't.

Maybe @cryptonemo has an opinion, but if not, I'm happy to approve.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought on it is that we should fallback to PoRep if it's possible and warn/log very loudly that something is misconfigured and SynthPoRep cannot be used. I agree that will complicate potential code/testing paths, but shouldn't complicate much past having the user reading the logs and clearly seeing there's a problem.

Maybe the more ideal answer from a user perspective is to check if the synth_proofs data exists in the clear data call (when enabled of course) and log loudly there that they cannot be cleared because synth_proofs data doesn't exist -- and then we have the logic to fall back to PoRep because something is wildly misconfigured ???

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that will complicate potential code/testing paths, but shouldn't complicate much past having the user reading the logs and clearly seeing there's a problem.

I disagree here. Especially with Ni-PoRep there's yet another code path. I find the current code pretty clean, you can clearly see the different PoRep cases and reason about what happens when. With certain fallback it likely be a lot of additional back and forth, which I think is just not worth it.

From a users perspective I see it as: if i've misconfigured wildly I want to immediately know it, rather then having a system that tries random things just in hope it might recover somehow, but might still end up in a broken state. An example for today: SupraSeal's PC2 does not support Synthetic PoRep. When run with Lotus and SyntheticPoRep enabled, it runs into a seemingly endless loop, but then does something (the wrong thing). Wouldn't it be much better if it would just say: "error, you try to run Synthetic PoRep although it's not supported"?

So if there's any chance you could live with the current solution I'd prefer keeping it the current way.

If you can't live with it, should we then also start thinking about how it will look like if we also add Ni-PoRep. Should there also be fallbacks?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that's fair. Thanks!

@DrPeterVanNostrand
Copy link
Contributor

This refactor looks good to me. I just have the one question regarding fallback behavior when synth proofs do not exist on disk.

@vmx vmx force-pushed the synthetic-refactor branch 2 times, most recently from 041b0ae to 2779a36 Compare October 9, 2023 10:18
The Synthetic PoRep code is interleaved with the original PoRep code. This commit
refactors the code, so that it's clearer what the differences between Synthetic
and Non-Synthetic PoRep is, without the need to read through large parts of the
code, looking for if-statements.

It also becomes more apparent in the code, that there are two "modes" of the
`prove_layers` function in case of the Synthetic PoRep. One generation step in
case there's no seed and one extracting step when there's a seed.

There no longer is a fallback in case from Synthetic PoRep to Non-Synthetic PoRep
in case the Synthetic PoRep Proofs file cannot be read, but when the labels are
set. I consider that a feature as if this case happens, your system likelt has
some problems, so it's better to error early.
`prove_layers_generate()` now operates on a single partition. This makes it
clearer that Synthetic PoReps always operate on a single partition only.

With the challenge generation moved outside of this function, it's also a
clearer separation of concerns, as it now operates directly on the challenges
given, without know anything about how they were generated. This also useful
for the work of having small, special case binaries for certain operations.
@vmx
Copy link
Contributor Author

vmx commented Oct 31, 2023

The force push has just been a rebase, without any other code changes.

t_aux.synth_proofs_path(),
partition_count,
)
.map_err(|error| {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that's fair. Thanks!

@vmx vmx merged commit 4ffa4a8 into master Dec 1, 2023
32 checks passed
@vmx vmx deleted the synthetic-refactor branch December 1, 2023 13:57
vmx added a commit that referenced this pull request Apr 11, 2024
Due to me understanding the code wrongly during the refactoring at
#1720, there
was a wrong assert inserted into the code. With removing this assert
the lifecycle 32GiB test as well as the benchy synthetic proofs
generation works as expected again.

This commit also updates the README with the correct command to run
benchy for synthetic proofs generation.
vmx added a commit that referenced this pull request Apr 11, 2024
Due to me understanding the code wrongly during the refactoring at
#1720, there
was a wrong assert inserted into the code. With removing this assert
the lifecycle 32GiB test as well as the benchy synthetic proofs
generation works as expected again.

This commit also updates the README with the correct command to run
benchy for synthetic proofs generation.

Fixes #1749.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants