ASSERT type_is_instr in multiple drcachesim online tests due to weird pipe ordering issue #3320
Comments
Happened again:
This same assert just happened in
warmup-zeros failed again with that assert: https://ci.appveyor.com/project/DynamoRIO/dynamorio/builds/23571201
Happened on the missfile test:
Happened on reuse_distance: https://ci.appveyor.com/project/DynamoRIO/dynamorio/builds/25011472
On missfile again: https://ci.appveyor.com/project/DynamoRIO/dynamorio/builds/25102162
On TLB-threads again: https://ci.appveyor.com/project/DynamoRIO/dynamorio/builds/25727905
Happened in tool.drcachesim.coherence with a timeout being reported: #3803.
Happened on riscv reuse-distance:
Data point: Another failure, in master:
https://github.com/DynamoRIO/dynamorio/actions/runs/11185116691/job/31097308404
I don't suppose anyone has observed this phenomenon in the absence of DynamoRIO? I wrote a 60-line program that reads or writes blocks of data down a pipe and checks for anything arriving out of order. I used 500-byte blocks and had 8 writers and one reader, and I ran DynamoRIO's tests with
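The commenter's program is not included in the thread. The following is only a minimal sketch of that kind of checker, assuming threads rather than processes as the writers and an anonymous pipe rather than the named pipe the tests use. The block layout (a writer id and sequence number followed by filler) and all function names are hypothetical. It relies on POSIX guaranteeing that a write of at most `PIPE_BUF` bytes (at least 512) to a pipe is atomic, so a 500-byte block should never be interleaved with another writer's data.

```python
import os
import struct
import threading

BLOCK_SIZE = 500   # Below POSIX's minimum PIPE_BUF of 512, so each write is atomic.
NUM_WRITERS = 8
HEADER = struct.Struct("<II")  # (writer id, per-writer sequence number)


def filler_for(writer_id):
    """Per-writer filler bytes, so a torn or interleaved block is detectable."""
    return bytes([0x41 + writer_id]) * (BLOCK_SIZE - HEADER.size)


def writer(fd, writer_id, num_blocks):
    """Write num_blocks tagged blocks to the pipe, one os.write() call each."""
    filler = filler_for(writer_id)
    for seq in range(num_blocks):
        written = os.write(fd, HEADER.pack(writer_id, seq) + filler)
        assert written == BLOCK_SIZE


def read_block(fd):
    """Read exactly one block, looping in case os.read() returns short."""
    data = b""
    while len(data) < BLOCK_SIZE:
        chunk = os.read(fd, BLOCK_SIZE - len(data))
        if not chunk:
            raise EOFError("pipe closed mid-block")
        data += chunk
    return data


def check_pipe_ordering(num_blocks=1000):
    """Return a list of ordering violations seen by the reader (empty if none)."""
    read_fd, write_fd = os.pipe()
    threads = [
        threading.Thread(target=writer, args=(write_fd, i, num_blocks))
        for i in range(NUM_WRITERS)
    ]
    for t in threads:
        t.start()

    expected = [0] * NUM_WRITERS  # Next sequence number expected from each writer.
    errors = []
    # Keep reading even after an error so that the writers never block forever.
    for index in range(NUM_WRITERS * num_blocks):
        block = read_block(read_fd)
        writer_id, seq = HEADER.unpack_from(block)
        if writer_id >= NUM_WRITERS or block[HEADER.size:] != filler_for(writer_id):
            errors.append(f"block {index}: torn or corrupt block")
            continue
        if seq != expected[writer_id]:
            errors.append(
                f"block {index}: writer {writer_id} sent seq {seq}, "
                f"expected {expected[writer_id]}"
            )
        expected[writer_id] = seq + 1

    for t in threads:
        t.join()
    os.close(write_fd)
    os.close(read_fd)
    return errors


if __name__ == "__main__":
    problems = check_pipe_ordering()
    print("\n".join(problems) if problems else "no ordering violations observed")
```

A clean run here says nothing about the named-pipe setup inside the test suite; it only shows what "checks for anything arriving out of order" could mean in practice.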
Oddly enough, the problem only appears when the tests are run as part of the test suite. I ran them hundreds of times by invoking the tests directly and never hit the problem.
It frequently fails in the same way as other drcachesim tests on SVE hardware.

Issue: #3320
Change-Id: I2383af3ca8af584f769ebd8e68fc9a0a82928ed1
This failure has made the aarch64-* suites red nearly half the time for many months now, yet we don't think the failure itself is specific to aarch64 or a particular kernel version. It turns out the redness is due to the #2204 feature not being enabled on the aarch64-* suites. That feature retries a failing test and only marks the suite red if the test fails 3x in a row, but it requires CMake 3.17+, which those suites do not have. See #7222 (comment).
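The retry-until-pass policy described above can be illustrated with a short sketch. This is not how the suite implements it (the thread only says the feature depends on CMake 3.17+); the function name and the default of three attempts are assumptions made for illustration.

```python
import subprocess


def run_until_pass(cmd, max_attempts=3):
    """Run cmd up to max_attempts times; report failure only if every attempt fails.

    Returns (passed, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        if subprocess.run(cmd).returncode == 0:
            return True, attempt
    return False, max_attempts
```

Under such a policy a sporadic failure only turns a suite red when the retries fail as well, which is why anything that makes the retries fail deterministically defeats it.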
With 6549e88 I ran
I disabled the So disabling the |
You can see that #4167, which is the other visible message |
If we can find a way to clean up the stale pipe, the retry-3x feature of #2204 should help avoid these sporadic failures turning suites red. As noted at #2204 (comment), typically we have one failure followed by the 2x retries timing out due to the stale pipe file.
Adds removal of the drcachesim pipe before and after running each test, to avoid a stale pipe file causing a retry-on-failure to time out.

Tested by adding "-unsafe_crash_process" to the options and then:
```
$ ctest -V -R coherence
<...>
325: <Application
325: /usr/local/google/home/bruening/dr/git/build_x64_dbg_tests/suite/tests/bin/client.annotation-concurrency
325: (465432). DynamoRIO Cache Simulator Tracer internal crash at PC
325: 0x00007f46cc0614b4. Please report this at http://dynamorio.org/issues.
325: Program aborted.
<...>
$ ls suite/tests/drtestpipe*
ls: cannot access 'suite/tests/drtestpipe*': No such file or directory
```

Issue: #3320, #2204
Adds removal of the drcachesim pipe before and after running each test, to avoid a stale pipe file causing a retry-on-failure to time out. This should help prevent #3320 from turning the suite red from what should be a single failure passing on a retry.

Modifies the asm tests to print to stderr instead of stdout for proper ordering within runcmp.cmake (as well as to match the conventions of all other tests).

Tested by adding "-unsafe_crash_process" to the options and then:
```
$ ctest -V -R coherence
<...>
325: <Application
325: /usr/local/google/home/bruening/dr/git/build_x64_dbg_tests/suite/tests/bin/client.annotation-concurrency
325: (465432). DynamoRIO Cache Simulator Tracer internal crash at PC
325: 0x00007f46cc0614b4. Please report this at http://dynamorio.org/issues.
325: Program aborted.
<...>
$ ls suite/tests/drtestpipe*
ls: cannot access 'suite/tests/drtestpipe*': No such file or directory
```

Issue: #3320, #2204
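The actual change lives in the suite's CMake scripts and is not shown in this thread. As a rough illustration of the idea only, the sketch below removes any leftover pipe files before and after running a test command, so that a crashed run cannot leave a stale pipe behind for the retry. The helper names are hypothetical; the `drtestpipe*` pattern is taken from the `ls` output above.

```python
import glob
import os
import subprocess


def remove_stale_pipes(pattern):
    """Delete every file matching pattern; return the paths that were removed."""
    removed = []
    for path in glob.glob(pattern):
        try:
            os.remove(path)
            removed.append(path)
        except FileNotFoundError:
            pass  # Something else already cleaned it up.
    return removed


def run_with_pipe_cleanup(cmd, pipe_pattern, timeout=None):
    """Run cmd, removing matching pipe files both before and after the run."""
    remove_stale_pipes(pipe_pattern)
    try:
        return subprocess.run(cmd, timeout=timeout).returncode
    finally:
        # Runs even if the test crashes or times out, so a retry starts clean.
        remove_stale_pipes(pipe_pattern)
```

A call might look like `run_with_pipe_cleanup(test_cmd, "suite/tests/drtestpipe*")`, where `test_cmd` stands in for whatever command launches the test.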
The 32-bit version failed once:
https://ci.appveyor.com/project/DynamoRIO/dynamorio/builds/21162519