Spurious test failures #19120
Comments
cc @jakub- since you've been doing rollups lately.
Added
cc @nodakai, looks like the new
@alexcrichton I'll change the logic so that the test examines only the direct children of the test executable. I should have filtered processes by Parent PID rather than Session ID.
By the way, one reason for "stdtest deadlocks" seems to be socket I ran
(Those "defunct" should have been gone with my patch.) Among the last 10 lines of the above
only the first two had been output before I hit In fact
Here we see Since all tests in I think the combination of I hope you see how using self-descriptive constants in unit tests helps debugging with
The other two are harder ones:
These As for
and it was blocking indefinitely again at I suspect there might be a bug or a race around
By the way, I'm against calling them "spurious" failures. We can't guarantee they are not, well, "genuine" failures.
Reported as a part of rust-lang#19120. The logic of rust-lang/rust@74fb798 was flawed because when a CI tool runs the test in parallel with other tasks, they all belong to a single session family and, depending on timing, the test may pick up irrelevant zombie processes before they are reaped by the CI tool. Also, panic! inside a loop over all children makes the logic simpler. By not destructing the return values of Command::spawn() until find_zombies() finishes, I believe we can conduct a slightly stricter test. Signed-off-by: NODA, Kai <nodakai@gmail.com>
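To illustrate the idea of filtering by parent PID rather than by session, here is a minimal sketch. It is not the code from the commit: it assumes Linux-style `/proc/<pid>/stat` lines as input (`pid (comm) state ppid ...`), and the names `ProcStat`, `parse_stat`, and `zombie_children` are hypothetical.

```rust
/// The few fields of a Linux `/proc/<pid>/stat` line that this sketch needs.
/// (Hypothetical type, not part of the original test.)
#[derive(Debug, PartialEq)]
struct ProcStat {
    pid: u32,
    state: char,
    ppid: u32,
}

/// Parses the leading fields of a stat line: `pid (comm) state ppid ...`.
/// The command name may contain spaces or parentheses, so split at the last `)`.
fn parse_stat(line: &str) -> Option<ProcStat> {
    let open = line.find('(')?;
    let close = line.rfind(')')?;
    let pid = line[..open].trim().parse().ok()?;
    let mut rest = line[close + 1..].split_whitespace();
    let state = rest.next()?.chars().next()?;
    let ppid = rest.next()?.parse().ok()?;
    Some(ProcStat { pid, state, ppid })
}

/// Returns the PIDs of zombie processes (state 'Z') whose parent is `parent`.
/// Zombies that merely share a session with `parent` are ignored.
fn zombie_children(stat_lines: &[&str], parent: u32) -> Vec<u32> {
    stat_lines
        .iter()
        .filter_map(|l| parse_stat(l))
        .filter(|p| p.state == 'Z' && p.ppid == parent)
        .map(|p| p.pid)
        .collect()
}

fn main() {
    // Made-up sample data: three processes in the same session (100).
    let lines = [
        "101 (sleep) Z 100 100 100 0 -1",     // zombie child of 100
        "102 (other job) Z 7 100 100 0 -1",   // zombie, but someone else's child
        "103 (cat) S 100 100 100 0 -1",       // child of 100, still alive
    ];
    println!("{:?}", zombie_children(&lines, 100));
}
```

With a session-based filter, process 102 in the sample data would also be reported, which is the kind of unrelated zombie a parallel CI job can leave behind.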
@nodakai thank you for quite the thorough investigation! I'll try to hone in on these failures soon, but feel free to beat me to it!
This test would read with a timeout and then send a UDP message, expecting the message to be received. The receiving port, however, was bound in the child thread, so the timeout and send could happen before the child thread ran. To remedy this, we bind the port before the child thread runs and move it into the child later on. cc rust-lang#19120
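The bind-before-spawn fix described above can be sketched as follows. This is not the original test: it uses today's `std::net::UdpSocket` rather than the pre-1.0 API the test was written against, and the function name `recv_after_early_bind` is hypothetical.

```rust
use std::net::UdpSocket;
use std::thread;
use std::time::Duration;

/// Binds the receiving socket on the calling thread, then moves it into a
/// child thread that does the actual receive. (Hypothetical helper.)
fn recv_after_early_bind() -> std::io::Result<Vec<u8>> {
    // Bind first, so the port exists before anything is sent to it.
    let receiver = UdpSocket::bind("127.0.0.1:0")?;
    let addr = receiver.local_addr()?;
    // Guard against hanging forever if the datagram never arrives.
    receiver.set_read_timeout(Some(Duration::from_secs(5)))?;

    // Move the already-bound socket into the child thread.
    let child = thread::spawn(move || -> std::io::Result<Vec<u8>> {
        let mut buf = [0u8; 16];
        let (n, _from) = receiver.recv_from(&mut buf)?;
        Ok(buf[..n].to_vec())
    });

    // Even if this send runs before the child is scheduled, the datagram is
    // queued on the bound socket instead of being sent to an unbound port.
    let sender = UdpSocket::bind("127.0.0.1:0")?;
    sender.send_to(b"ping", addr)?;

    child.join().expect("receiver thread panicked")
}

fn main() -> std::io::Result<()> {
    let msg = recv_after_early_bind()?;
    println!("received {} bytes", msg.len());
    Ok(())
}
```

If the `bind` call were inside the spawned closure instead, the send could race with it, which is the failure mode the commit message describes.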
#20677 auto-win-32-nopt-t test failed. The only difference between the output list and current Collected some more
All of the hung tests use
@klutzy yes I've reached the same conclusion as well, and that set of tests all look quite familiar! When I removed I'll switch those tests over to
These tests have all been failing spuriously on Windows from time to time, and one suspicion is that the child thread outliving the main thread somehow causes the problem. Switch all the tests over to using Thread::scoped instead of Thread::spawn to see if it helps the issue. cc rust-lang#19120
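As a rough illustration of why a scoped thread cannot outlive its parent: `Thread::scoped` no longer exists in the standard library, so this sketch uses `std::thread::scope` (stable since Rust 1.63) as the closest current equivalent. The helper name `sum_in_child` is hypothetical and unrelated to the tests that were changed.

```rust
use std::sync::mpsc;
use std::thread;

/// Sums `values` on a child thread that is guaranteed to have finished
/// by the time this function returns. (Hypothetical helper.)
fn sum_in_child(values: &[u32]) -> u32 {
    let (tx, rx) = mpsc::channel();

    thread::scope(|s| {
        // The child may borrow `values` because the scope joins it before returning.
        s.spawn(move || {
            let total: u32 = values.iter().sum();
            tx.send(total).expect("receiver dropped");
        });
        // Every thread spawned on `s` is joined automatically at the end of this closure.
    });

    // The child has already exited here, so the value is waiting in the channel.
    rx.recv().expect("child exited without sending")
}

fn main() {
    println!("{}", sum_in_child(&[1, 2, 3]));
}
```

With a plain `thread::spawn` and no explicit `join`, the test function could return while the child is still running, which is the situation the commit suspects of causing the Windows failures.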
Filed a separate bug for the rather common windows failure
From #24053
http://buildbot.rust-lang.org/builders/auto-mac-64-nopt-t/builds/4378
I think that this bug has somewhat outlived its lifespan; any new spurious failures can have follow-up bugs.
Our builders are susceptible to a lot of spurious test failures, and I've been keeping some of this knowledge in my head for far too long, so I'd like to write it down. I'm writing this to be a metabug, but I expect other bugs to be spawned off from it as well. Sometimes it's just a lot easier to add a bullet than a whole new bug!
If you add to this, please don't link directly to buildbot logs as they disappear over time. Instead please link to a gist with the relevant information pasted in. I also typically peruse http://buildbot.rust-lang.org/grid?branch=auto&width=10 for build failures.
Spurious test failures
- thread_local::dtors_in_dtors_in_dtors deadlocks on OSX (logs)
- io::process::tests::test_kill spuriously fails (logs)
Spurious failure behavior
- make clean randomly fails due to some arbitrary object file not being able to be overwritten. (logs)