-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bugfix] Membership service enable retry db connection #37
Conversation
753a269
to
40d4f25
Compare
If you have the time would you update your commits to use the propper tag? in this case it would be 🐛, we internally use Gitmoji sorry that this is not documented, still early days of unguard opensource 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proxy service Dockerfile changes should not be part of this PR.
connectString, | ||
new MySqlServerVersion(new Version(10, 5, 8)), | ||
options => options.EnableRetryOnFailure( | ||
maxRetryCount: 10, | ||
maxRetryDelay: System.TimeSpan.FromSeconds(60), | ||
errorNumbersToAdd: null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[QUESTION] Don't fully understand, we don't execute this code, do we? or does this throws only after 10 minutes (after retrying once every minute 10 times) and only THEN we get into the catch block and restart the pod by exiting the process?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We try to connect 10 times, with increasing delay each time. (e.g. first 10ms, second 150ms... up to 60s). If the connection cant be established at the 10th try, we get to the try-catch. Then we stop the program, and the pod restarts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Locally with ArgoCD the connection was established at around the 5th try. I am using such a high retryCount and retryDelay to prevent restarting the pod to frequently.
@@ -49,6 +49,8 @@ private static void CreateDbIfNotExists(IHost host) | |||
{ | |||
var logger = services.GetRequiredService<ILogger<Program>>(); | |||
logger.LogError(ex, "An error occurred creating the DB."); | |||
// Restart the pod until success | |||
System.Environment.Exit(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[QUESTION] does this executes after trying to connect 10 times (10 minutes) (see comment above)? or on each failure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It executes after trying 10m times. But not after 10 minutes, as the delay increases each retry a little until it reaches maxRetryDelay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the comment about the pod being restarted until success, we shouldn't refer to how the app is being run here.
1b3afe5
to
4419fd2
Compare
When using ArgoCD for deployment, "wait:true" to wait for the database in the skaffold.yaml, is not applied, instead all services are created in parallel. The membership-service tries to connect once to the not yet ready db, and then waits, doing nothing. To mitigate this the option EnableRetryOnFailure was added. Now the service tries in a timespan of 60 seconds to connect ten times, and if it still fails exists the program, causing a recreation of the pod.
0d33363
to
08922f3
Compare
When using ArgoCD for deployment, "wait:true" to wait for the database
in the skaffold.yaml, is not applied, instead all services are created
in parallel. The membership-service tries to connect once to the not yet ready db,
and then waits, doing nothing. To resolve this the option EnableRetryOnFailure
was added. We try to connect 10 times, with increasing delay each time. (e.g. first 10ms, second 150ms... up to 60s). If the connection cant be established at the 10th try, we get to the try-catch. Then we stop the program, and the pod restarts.