Flaky tests on Linux-aarch64 build #368
Dang, I guess the CI was old, but I figured that nothing else had really changed on rviz... I'll revert it and look into the issue.
Reverted in #369
@wjwwood Is there anything else to be done here, or did #369 solve the issue?
I was leaving this open until the revert could be undone, but I guess we could close this if desired, since it only really applies to fixing the CI, not also to fixing the originating change and getting it re-merged.
The revert will definitely not have changed anything; the commit only introduced changes in … I recall that we had the same test failures in the nightlies when display tests were enabled on Linux systems back in July, and I don't remember them ever being resolved. I tried to debug the issue once but couldn't reproduce it; it seemed that Ogre was not coming up successfully.
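If Ogre is failing because the CI hosts are headless (no X display), one way to surface that clearly instead of crashing deep inside rendering code is to check for a display up front. A minimal sketch in Python; the helper name `have_display` is hypothetical, not part of rviz or the CI tooling:

```python
import os

def have_display() -> bool:
    """Return True if an X11 or Wayland display looks available.

    On headless CI hosts neither environment variable is usually set,
    which is a common reason for Ogre failing to create a render window.
    """
    return bool(os.environ.get("DISPLAY") or os.environ.get("WAYLAND_DISPLAY"))

# Example: fail fast with a readable message instead of a later segfault.
if not have_display():
    print("no display available; skipping rendering tests")
```

On CI, a virtual framebuffer (e.g. running the suite under `xvfb-run`) is the usual way to make such a check pass on headless machines.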
There has been a slew of test failures in the nightlies over the past few days for …
I was working under an assumption that @jacobperron saw these as new flaky tests, indicating this was true:
If the flakiness was preexisting, then we could undo the revert and address the flaky tests separately. I didn't investigate deeply; I was just trying to keep CI clean leading up to the release.
I am sure that the flaky aarch64 tests are not related to #297, as we have also seen them in the past. We have no direct access to aarch64 machines and thus could not debug; these problems do not show on amd64.
@andreasholzner If we can get you access to an aarch64 machine (probably in AWS), would you or @Martin-Idel have time to look into this? |
So it seems like the changes in #297 were not the cause of this? Do they just make it more flaky? I'm about to do the release for Crystal patch 1 and need to resolve this...
It does seem that they are not the cause. I wasn't aware of the previous failures when I originally reported this. I can't say whether it makes them more flaky; I think I opened this issue after seeing the failures for two or three days. I'd say it would be okay to add the change back.
Here's the (original?) issue I missed: ros2/build_farmer#144 |
I could (almost) reproduce the test results; however, the console output is a little different. Output seen while trying to reproduce: …
The line starting with … I could trace the segfault to the call to …
The call stack is not helpful.
I am out of ideas. Maybe an Ogre upgrade could help, but that would require some work. During a quick attempt to use Ogre 1.11.5, the …
Is there any chance that these test failures will be addressed in the near future? If not, I would propose excluding the affected packages (…) from the nightly jobs.
I don't see how, so I am in favor of excluding the packages from the nightly job.
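Excluding packages from the nightly job usually comes down to passing a skip list to the test invocation. A sketch of assembling that list, assuming a colcon-style `--packages-skip` flag; the package names below are placeholders, since the actual affected packages were elided above:

```python
# Placeholder names; substitute the actual rviz-related packages.
FLAKY_DISPLAY_PACKAGES = [
    "rviz_rendering_tests",
    "rviz_default_plugins",
]

def skip_args(packages):
    """Build the extra arguments for a 'colcon test'-style invocation."""
    if not packages:
        return []
    return ["--packages-skip", *packages]

# Example: print the full command the nightly job would run.
print(" ".join(["colcon", "test", *skip_args(FLAKY_DISPLAY_PACKAGES)]))
```

Keeping the list in one place makes it easy to remove entries once the underlying flakiness is fixed.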
This should have been solved by #394 (at least, that's what my tests there showed). Have the tests been reenabled? Could somebody try again?
I just ran a repeated job with this stuff reenabled: https://ci.ros2.org/job/ci_linux-aarch64/9301/. While there are test failures, none of them are in rviz-related components, so this is probably fixed. I'm going to close this out and open a PR to reenable these tests on the nightlies.
E.g. https://ci.ros2.org/job/ci_linux-aarch64/2467/
It seems that repeating tests causes something to crash without reporting test results.
Looks like this issue was introduced by #297.
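The repeated CI jobs essentially rerun the suite until a crash shows up; when the process dies mid-run, no result file is written, which is why the failures appear as missing results rather than assertion failures. A rough stand-in for that loop, assuming the suite can be driven as a single subprocess command (the function name is hypothetical; colcon itself offers a `--retest-until-fail` option for a similar purpose):

```python
import subprocess

def repeat_until_fail(cmd, max_runs):
    """Run cmd up to max_runs times, stopping at the first failure.

    Returns (runs_completed, failed). A negative return code from
    subprocess means the process died from a signal (e.g. -11 for
    SIGSEGV), matching the 'crash without test results' symptom.
    """
    for run in range(1, max_runs + 1):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return run, True
    return max_runs, False
```

Logging the return code on failure distinguishes an ordinary test failure (positive exit status) from a crash (signal, negative status).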