[drbd] 9.2.2-v1.4.1 causes immediate crash/restart on replicated volumes #155
Comments
I'm experiencing the same issue. I downgraded my cluster back to 1.3.7 and it seems to be working again.
The dmesg logs don't show anything useful; it's probably the kernel module. Version 9.2.3 is available now, and it's worth checking the changelog (it seems to have a lot of fixes). Would someone be able to manually build this and test it?
I attempted to compile an extension with 9.2.3, but I wasn't able to get it to work because I'm not familiar with the Talos compilation process. I believe my drbd kernel module was rejected because the key didn't match. If someone could build one, I'd love to test it and report back.
Not that I expected anything different, since I didn't see a drbd version bump, but I just wanted to mention that I'm still experiencing this issue with Talos 1.4.2 (and the drbd 9.2.2-v1.4.2 extension).
@frezbo could you please outline the process for compiling an extension for DRBD 9.2.3 so that I can test? Thank you!
@themicknugget could you try with this installer and extensions?
@frezbo Thank you so much for building that for me! I tried it, and the installer was fetchable, but I got this for the extension: failed to resolve reference "ghcr.io/frezbo/drbd:9.2.3-v1.4.0-alpha.4-2-g0855dd7-dirty": failed to authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized. It seems it's still marked private?
Should be fixed now.
Thank you! Unfortunately, after testing, it appears to crash at the same point as 9.2.2. Did you enable any additional kernel debugging that I can collect for you?
This is just the standard build. I guess it's better to create an issue with drbd, since it seems to have issues with Linux 6.1.
@frezbo FYI an issue was created and a fixed release of drbd is coming: LINBIT/drbd#57 (comment)
DRBD 9.2.4 was released yesterday and contains the fix for this issue. If we could get the extension updated to 9.2.4-v1.4.5, that would be fantastic :)
Awesome, will get it updated for the 1.5 release as part of our normal deps update.
Bump drbd to a non-broken version.
Fixes: siderolabs/extensions#155
Signed-off-by: Noel Georgi <git@frezbo.dev>
(cherry picked from commit f7cd916)
The next release of Talos (both 1.5 and 1.4) should include the fixed drbd version, 9.2.4.
Thanks a lot :)
I am testing out piraeus-datastore on Talos, and I was able to get it working wonderfully on v1.3.7. After upgrading to v1.4.1 (and using drbd extension 9.2.2-v1.4.1), provisioning a PVC with more than one replica (which requires communication between nodes) causes the "primary" node for the provisioned volume to restart without any kernel panic or console message.
I have attached a "talosctl dmesg" log in hopes that it will help troubleshoot.
Thank you!
elite6.log