Instance: Fix containers not always starting up after host reboot #13700
…p after host reboot
Ignore liblxc.ErrNotRunning error when stopping. If the container refuses to stop, check whether the error is ErrNotRunning and, if so, ignore it. If an earlier shutdown request was sent but timed out, the guest shutdown can still be proceeding, so the container may have reached a stopped state by now and be running the onStop hook to clean up host-side devices. Returning ErrNotRunning here would be incorrect: the onStop hook could still be running and cleanup would not be complete, which can cause issues with state reporting after Stop has returned.
Signed-off-by: Thomas Parrott <thomas.parrott@canonical.com>
Cluster tests are failing
Thanks @MusicDin, looks like an issue when the cluster DB is offline and we try to shut down the last LXD member.
LoadFromBackup was originally intended to work without a DB, but over time it has been accidentally modified to depend on the DB again. Resolving the onStop hook issue has exposed this dependency. Signed-off-by: Thomas Parrott <thomas.parrott@canonical.com>
And improved error messages. Signed-off-by: Thomas Parrott <thomas.parrott@canonical.com>
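To illustrate what "working without a DB" means for a loader like LoadFromBackup, here is a minimal sketch under stated assumptions. The `backupConfig` and `instance` types and the `loadFromBackup` function are hypothetical simplifications, not LXD's actual types or signature; the key property is that the loader takes no database handle and uses the expanded config stored in the backup data as-is.

```go
package main

import (
	"errors"
	"fmt"
)

// backupConfig is a hypothetical, simplified view of what a backup file holds.
type backupConfig struct {
	Name           string
	ExpandedConfig map[string]string
}

// instance is a hypothetical in-memory instance representation.
type instance struct {
	Name           string
	ExpandedConfig map[string]string
}

// loadFromBackup builds an instance purely from the backup data. It accepts
// no database handle, so it keeps working while the cluster DB is offline.
func loadFromBackup(b *backupConfig) (*instance, error) {
	if b == nil {
		return nil, errors.New("failed loading instance: no backup config provided")
	}

	if b.Name == "" {
		return nil, errors.New("failed loading instance: backup config has no instance name")
	}

	// Copy the expanded config so later changes to the backup struct do not
	// leak into the loaded instance.
	cfg := make(map[string]string, len(b.ExpandedConfig))
	for k, v := range b.ExpandedConfig {
		cfg[k] = v
	}

	return &instance{Name: b.Name, ExpandedConfig: cfg}, nil
}

func main() {
	inst, err := loadFromBackup(&backupConfig{
		Name:           "c1",
		ExpandedConfig: map[string]string{"limits.cpu": "2"},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	fmt.Println(inst.Name, inst.ExpandedConfig["limits.cpu"])
}
```

Keeping the database out of the function signature makes an accidental DB dependency harder to reintroduce, since any such call would need a handle the function does not have.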
That fixed it!
Thanks! This looks good, especially having fewer DB queries.
Just curious: could the database ever contain a different expanded config from what is in the backup file? I cannot think of any scenario, though.
Nice, thank you!
It's possible, but unlikely, as the backup file is written from the DB at each startup or config change.