Tweak OnDemand test #3190

Merged
merged 2 commits into from
May 2, 2022

eisenhauer
Member

No description provided.

@JasonRuonanWang
Member

I am getting these errors from the sanitizers too, which I don't know how to re-run. So every time it fails I have to push something to trigger a rerun...

541/643 MemCheck: #541: Staging.OnDemandDistribution.1x1x3.CommMin.BP5.SST ...................................................... Defects: 6
554/643 MemCheck: #554: Staging.OnDemandDistribution.1x1x3.CommMin.BP.SST ....................................................... Defects: 3
MemCheck log files can be found here: (<#> corresponds to test number)
/root/project/fedora-tsan/Testing/Temporary/MemoryChecker.<#>.log
Memory checking results:
data race - 9

@JasonRuonanWang
Member

I had a lot of unstable tests before, and I thought all of them were caused by the unstable CI environment. In the end I found that most of them were my own bugs. After spending some time fixing them, I now pretty much trust the CI virtual machines. If a test is unstable, then 99% of the time it's really the test itself that is unstable, rather than anything else.

The only open issue I have is MPMD on Windows, which is not surprising because MPI on Windows is always quite different from a typical MPI implementation. Other than that, I haven't really seen a single case where a stable feature together with a stable test frequently fails on any of the Linux CI machines.

@JasonRuonanWang
Member

#3193 failed 8 times on the OnDemand test and 1 time on the SST threads test.

@eisenhauer
Member Author

I tend to agree that this will probably require a different approach. I was hoping to keep the multiple "independent process" test readers, to test different code paths than would be hit with multiple reader streams in a single process. But it may be that just using clock-based delays to get them to make requests in some order has too high a failure rate in CI because of process scheduling variations. Let's disable for now.
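
To illustrate the scheduling concern, here is a minimal sketch of ordering two independent processes with an explicit handshake instead of a clock-based delay. It is not the mechanism used by the OnDemand test; the "reader A" and "reader B" roles, the FIFO name, and the log file are all hypothetical, and each reader's request is simulated by a line appended to a log.

```shell
#!/bin/sh
# Sketch only: two stand-in "readers" are ordered by a FIFO handshake
# rather than by sleep-based timing. All names here are hypothetical.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

mkfifo "$workdir/turn"
log="$workdir/order.log"

# Reader B blocks on the FIFO until reader A signals that it has
# made its request, so B can never get ahead of A.
( read -r _ < "$workdir/turn"; echo "B requests" >> "$log" ) &

# Reader A makes its (simulated) request first, then hands the turn to B.
( echo "A requests" >> "$log"; echo go > "$workdir/turn" ) &

wait
order=$(cat "$log")
echo "$order"
```

Because B cannot proceed until A has written to the FIFO, the order does not depend on how the CI machine schedules the two processes, which is the property a sleep-based delay cannot guarantee.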

@eisenhauer eisenhauer merged commit 49c13c0 into ornladios:master May 2, 2022
@eisenhauer eisenhauer deleted the OnDemandTest branch May 2, 2022 13:02
@eisenhauer
Member Author

BTW @JasonRuonanWang , the easiest way to re-trigger tests is to do: git commit --amend --no-edit ; git push -f

@JasonRuonanWang
Member

> BTW @JasonRuonanWang , the easiest way to re-trigger tests is to do: git commit --amend --no-edit ; git push -f

Thanks. I knew this command, but it wouldn't leave any record of how many times I reran the tests, which I needed to convince you. I said multiple times that this test fails almost every time, but apparently you didn't believe me. So I had to keep some evidence by actually pushing something into the PR; only that way could you see when and how it actually failed.
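
For re-runs that do leave a trace, one possible approach is an empty commit per attempt, since `git commit --allow-empty` adds a new commit without changing any files. The sketch below is an assumption about how that could look, not what was done in this PR: it uses a throwaway local bare repository in place of the real remote, and the commit messages and identity are made up for illustration.

```shell
#!/bin/sh
# Sketch only: each CI re-trigger is recorded as an empty commit.
# The sandbox repositories, identity, and messages are hypothetical.
set -eu

export GIT_AUTHOR_NAME="CI Demo" GIT_AUTHOR_EMAIL="demo@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

sandbox=$(mktemp -d)
trap 'rm -rf "$sandbox"' EXIT

# A local bare repository stands in for the PR's remote.
git init -q --bare "$sandbox/remote.git"
git init -q "$sandbox/work"
cd "$sandbox/work"
git remote add origin "$sandbox/remote.git"

# Stand-in for the real change under test.
git commit -q --allow-empty -m "Tweak OnDemand test"
git push -q origin HEAD

# Each re-trigger is a plain push of a new empty commit, so no force
# push is needed and every attempt stays visible in the history.
for attempt in 1 2 3; do
    git commit -q --allow-empty -m "Re-trigger CI (attempt $attempt)"
    git push -q origin HEAD
done

# Count the recorded re-runs from the commit messages.
reruns=$(git log --oneline --grep='^Re-trigger CI' | wc -l | tr -d ' ')
echo "recorded re-runs: $reruns"
```

The trade-off against `git commit --amend --no-edit ; git push -f` is a noisier history, which can be squashed before merging.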

@eisenhauer
Member Author

I believed you! What I didn't have was time to work on it...
