roachtest: roachprod put cockroach fails with Connection closed by ... #37113
Comments
This run had #37001, which I think we were hoping would fix this, but it didn't. cc @nvanbenschoten
cc @bdarnell too. These ssh connections getting closed like this seems to be a very common failure. We really need to do something here - but what? Does anybody know what logging can be added to ssh and/or sshd?
This single '-v' flag enables debug1 level logging in ssh in hopes of helping root cause issues like cockroachdb#37113.
Added a PR to at least add verbosity to the It'd be nice to know what the server is logging when this happens. Maybe we should add some logic to capture journalctl logs when tests fail to set up a cluster. How do we classify this as a cluster creation failure rather than a test failure? |
We could swallow the output if the command doesn't fail; nobody cares what scp says if it reports success (different for ssh, which sees the same problems).
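A minimal sketch of that "swallow on success" idea, assuming a plain `os/exec` wrapper rather than roachprod's actual command-running code. The helper name `runQuiet` and the `sh -c` invocation are hypothetical and only stand in for a verbose scp/ssh call:

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"os/exec"
)

// runQuiet (hypothetical helper) runs the command and buffers stdout and
// stderr together. On success the buffered output is dropped; on failure
// it is returned so the caller can log it next to the error.
func runQuiet(name string, args ...string) (string, error) {
	var buf bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stdout = &buf
	cmd.Stderr = &buf
	if err := cmd.Run(); err != nil {
		return buf.String(), fmt.Errorf("%s failed: %w", name, err)
	}
	return "", nil
}

func main() {
	// Stand-in for a verbose command that fails: the debug output is
	// only surfaced because the exit status is non-zero.
	out, err := runQuiet("sh", "-c", "echo 'debug1: connecting' >&2; exit 1")
	if err != nil {
		fmt.Fprintf(os.Stderr, "%v\n%s", err, out)
	}
}
```

With a wrapper along these lines, the extra `-v` output would only show up in logs for commands that actually failed.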
That's the thing I really hope can fix this issue - can't we just set up our clusters with verbose sshd logging in the first place, and capture the logs on failed tests? I'd like to avoid tracking these flakes as cluster creation failures because that adds a lot of mess to roachtest and it may even distract from fixing the root cause.
We already do. Let's see where we are when we have some output from #37125, if it's not enough we can change the setting to
The |
The dead node detection usually seems to get in just fine, so I think our chances are good that this will just work (🤞)
We have marked this issue as stale because it has been inactive for |
SHA: https://github.com/cockroachdb/cockroach/commits/99306ec3e9fcbba01c05431cbf496e8b5b8954b4
Parameters:
To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1260033&tab=buildLog