Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixups/statefilerecovery #1358

Merged

Conversation

wjuan-mob
Copy link
Contributor

@wjuan-mob wjuan-mob commented Jan 27, 2022

Soundtrack of this PR: link to song that really fits the mood of this PR

Motivation

Better state file management to avoid losing a relevant key. Based on this PR by @garbageslam: #1085

In this PR

  1. Add functionality to delete the statefile (except instead of deleting, we rename it to .bak, in case it will help debug something.)
  2. Delete the statefile if initializing the ingest enclave fails. This is generally going to indicate that we couldn't unseal the private key. (Enclave creation failing is a separate issue, and that could be caused by an SGX daemon not being woken up yet)
  3. If we did successfully load a key, don't panic even if restoring from state file failed.
  4. Besides this, we generally improved error handling around the creation of the IngestSgxEnclave object
  5. Adding a initialization status to state to indicate when partially initialized for future error handling.

This resolve issue #1082

Future Work

Copy link
Contributor

@jcape jcape left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, some stylistic issues.

fog/ingest/enclave/src/lib.rs Outdated Show resolved Hide resolved
fog/ingest/enclave/src/lib.rs Outdated Show resolved Hide resolved
fog/ingest/server/src/controller.rs Outdated Show resolved Hide resolved
fog/ingest/server/src/controller_state.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@cbeck88 cbeck88 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't add the partially initialized state stuff

  • It seems like this state should not be leaving the ingest node and be communciated to overseer or anyone else
  • If more work needs to happen to get it into a healthy state, then it should do that work itself and then become fully initialized. If no one else needs to know about it then don't put it in controller state.
  • Adding more and more states to the system increases its complexity, this should not be done casually and especially not when the additional state is dead code. There need to be wide-ranging discussions including all fog eng's and ops if ingest is going to have an increased number of states and behaviors. So adding this now may be premature

@wjuan-mob
Copy link
Contributor Author

Don't add the partially initialized state stuff

  • It seems like this state should not be leaving the ingest node and be communciated to overseer or anyone else
  • If more work needs to happen to get it into a healthy state, then it should do that work itself and then become fully initialized. If no one else needs to know about it then don't put it in controller state.
  • Adding more and more states to the system increases its complexity, this should not be done casually and especially not when the additional state is dead code. There need to be wide-ranging discussions including all fog eng's and ops if ingest is going to have an increased number of states and behaviors. So adding this now may be premature

I agree with this after consideration. I had been initially concerned about the possibility of being in a partially initialized state. But I feel like the code has sufficient safety around the scenario where it fails to set_peers or otherwise that it is not necessary to add this code, which I agree is currently dead. It can be added in the future as needed with Fog Overseer. I think it is safe to be just set to idle with no other flags.

Copy link
Contributor

@cbeck88 cbeck88 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, thank you!

@wjuan-mob wjuan-mob merged commit c36b0a8 into mobilecoinfoundation:master Jan 28, 2022
@wjuan-mob wjuan-mob linked an issue Jan 28, 2022 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix up fog ingest handling of corrupted state file
3 participants