Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Even more remote heartbeat improvements #3594

Merged
merged 13 commits into from
Jun 10, 2024
Merged

Conversation

billyb2
Copy link
Member

@billyb2 billyb2 commented Jun 4, 2024

Change Summary

What and Why:
Having the logic for creating a remote builder be in web doesn't make sense. The majority of the work can be done in flyctl, and as such, remove another point of failure in a different component

How:
Move all of the logic from EnsureMachineRemoteBuilder from web to flyctl

Related to:


Documentation

  • Fresh Produce
  • In superfly/docs, or asked for help from docs team
  • n/a

@billyb2 billyb2 force-pushed the billy/set_remote_builder branch 3 times, most recently from 942c367 to 18dd107 Compare June 6, 2024 17:36
@billyb2 billyb2 marked this pull request as ready for review June 7, 2024 20:34
Having the logic for creating a remote builder be in `web` doesn't make
sense. The majority of the work can be done in `flyctl`, and as such,
remove another point of failure in a different component
Since these operations are just queries (and don't mutate state), we can
feel free to retry them as many times as needed
If we get a non 500 error, then there's no point in having multiple
seconds waiting, we should just fail.
For whatever reason, if we pass app.Name to destroy the app when we
encounter an issue creating the builder, we pass nil (in unit tests).
Not sure why, but all good

Secondly, we should sleep between API requests
return nil, nil, err
}

if validateBuilderErr != NoBuilderApp {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this block won't occur because NoBuilderApp is an instance of ValidateBuilderError which is checked with the errors.As() above, correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't believe so. errors.As just checks if it's a ValidateBuilderError or not, and we only return before this check if it isn't a validation error.

https://go.dev/play/p/I5_iTNzBwiQ

flapsClient := flapsutil.ClientFromContext(ctx)

var volumes []fly.Volume
numRetries := 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use a "backoff" library elsewhere for retries. Maybe that's better to use than rolling your own?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only issue with the backoff library is that i need to exit early if we don't get a server error, instead of retrying multiple times (for example, if we get a 404, it's likely a user error)

internal/build/imgsrc/ensure_builder.go Show resolved Hide resolved
internal/build/imgsrc/ensure_builder.go Show resolved Hide resolved
@billyb2 billyb2 requested a review from benbjohnson June 10, 2024 15:45
@billyb2 billyb2 merged commit a5aebf9 into master Jun 10, 2024
39 checks passed
@billyb2 billyb2 deleted the billy/set_remote_builder branch June 10, 2024 16:33
billyb2 added a commit that referenced this pull request Jun 11, 2024
billyb2 added a commit that referenced this pull request Jun 11, 2024
billyb2 added a commit that referenced this pull request Jun 11, 2024
This reverts commit 4fb6bd4. Ben T made
some changes that should let this work again. We'll need to add a
preflight test to ensure that this doesn't break in the future, however.
billyb2 added a commit that referenced this pull request Jun 12, 2024
* Restart remote builder machine if it's stopped

* Fix tests

* Reapply "Even more remote heartbeat improvements (#3594)"

This reverts commit 4fb6bd4. Ben T made
some changes that should let this work again. We'll need to add a
preflight test to ensure that this doesn't break in the future, however.

* Add the recreate builder flag

This allows users forcing recreating the remote builder, even if it's valid

* Add a test for deploying and recreating the remote builder

* Catch and trace errors
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants