OnDemand race condition. #4371

Open
eisenhauer opened this issue Oct 17, 2024 · 0 comments

I suspect there is a race condition associated with the OnDemand request queue. My guess is that it only happens when, on the last step, the metadata going to the reader crosses paths with the request coming from the reader to the writer. If there were another step on the writer, we'd notice in EndStep that we had this request and that it was no longer useful; but since it's the last step, we fall through into Close() and don't properly deallocate the request. This is quite rare, so probably not a high priority, but it does seem to happen, as per the sanitizer output below.

Script directory is /root/project/fedora-asan/bin
current working directory is /root/project/fedora-asan/testing/adios2/engine/staging-common
TestDriver: Writer command line : /root/project/fedora-asan/bin/TestOnDemandWrite SST Staging.OnDemandSingle.1x1.CommMin.BP5.SST CPCommPattern=Min,MarshalMethod=BP5
TestDriver: Reader command line : /root/project/fedora-asan/bin/TestOnDemandRead SST Staging.OnDemandSingle.1x1.CommMin.BP5.SST
TestDriver:MPMD value is False

TestDriver: Doing simple with file_test = False
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from SstOnDemandWriteTest
[ RUN ] SstOnDemandWriteTest.ADIOS2SstOnDemandWrite
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from SstOnDemandReadTest
[ RUN ] SstOnDemandReadTest.ADIOS2SstOnDemandRead
SST,Write,1,10,100,10,20317,74537.9
SST,Read,1,10,100,10,11875,975867
[ OK ] SstOnDemandWriteTest.ADIOS2SstOnDemandWrite (1327 ms)
[----------] 1 test from SstOnDemandWriteTest (1327 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (1327 ms total)
[ PASSED ] 1 test.
[ OK ] SstOnDemandReadTest.ADIOS2SstOnDemandRead (1363 ms)
[----------] 1 test from SstOnDemandReadTest (1363 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (1363 ms total)
[ PASSED ] 1 test.
TestDriver: Reader exit status was 0
TestDriver: Writer exit status was 1
TestDriver: Writer failed, causing test failure
TestDriver: Exiting with overall failure code

Suppressions used:
count bytes template
2 301 ps_make_timer_name_
1 8 ibv_get_device_list

=================================================================
==55461==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x530707 in calloc (/root/project/fedora-asan/bin/TestOnDemandWrite+0x530707)
#1 0x7f6a71303ae8 in CP_ReaderRequestStepHandler /root/project/source/source/adios2/toolkit/sst/cp/cp_writer.c:2578:27
#2 0x7f6a6ee3c9b1 in CMact_on_data /root/project/source/thirdparty/EVPath/EVPath/cm.c:2701:3
#3 0x7f6a6ee354e1 in CMDataAvailable /root/project/source/thirdparty/EVPath/EVPath/cm.c:2196:15
#4 0x7f6a6b129324 in socket_select /root/project/source/thirdparty/EVPath/EVPath/cmselect.c:466:7
#5 0x7f6a6b126db7 in libcmselect_LTX_blocking_function /root/project/source/thirdparty/EVPath/EVPath/cmselect.c:1030:5
#6 0x7f6a6ee1f6e6 in CMcontrol_list_wait /root/project/source/thirdparty/EVPath/EVPath/cm.c:712:2
#7 0x7f6a6ee18452 in CMpoll_forever /root/project/source/thirdparty/EVPath/EVPath/cm.c:184:6
#8 0x7f6a6ee1a5d4 in server_thread_func /root/project/source/thirdparty/EVPath/EVPath/cm.c:207:5
#9 0x7f6a6f2ed431 in start_thread (/lib64/libpthread.so.0+0x9431)


Suppressions used:
count bytes template
5 751 ps_make_timer_name_
1 8 ibv_get_device_list

SUMMARY: AddressSanitizer: 16 byte(s) leaked in 1 allocation(s).
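
To illustrate the suspected failure mode, here is a minimal sketch of a request queue that is drained at close time. It is not the SST code: `StepRequest`, `WriterStream`, `QueueRequest`, `DrainRequests`, and `CloseStream` are all hypothetical names made up for this example, and locking is omitted even though, per the stack trace above, the real handler runs on the CM network thread. The idea is only that a request arriving after the final step has nothing left to consume it, so the close path would have to free whatever is still queued.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queued reader request; not the SST struct. */
typedef struct StepRequest {
    int RequestingReader;
    struct StepRequest *Next;
} StepRequest;

/* Hypothetical writer-side state holding the pending request queue. */
typedef struct {
    StepRequest *Head;  /* pending OnDemand requests, oldest first */
    int LiveRequests;   /* allocations not yet freed (for illustration) */
} WriterStream;

/* Models the handler that allocates a request when a reader asks for a step. */
void QueueRequest(WriterStream *s, int reader)
{
    StepRequest *r = calloc(1, sizeof(*r));
    assert(r != NULL);
    r->RequestingReader = reader;

    /* Append at the tail so requests are served in arrival order. */
    StepRequest **tail = &s->Head;
    while (*tail != NULL)
        tail = &(*tail)->Next;
    *tail = r;
    s->LiveRequests++;
}

/* Frees every request still queued and returns how many were released. */
int DrainRequests(WriterStream *s)
{
    int released = 0;
    while (s->Head != NULL) {
        StepRequest *next = s->Head->Next;
        free(s->Head);
        s->Head = next;
        s->LiveRequests--;
        released++;
    }
    return released;
}

/* Assumed fix: no further step will consume queued requests, so release them here. */
void CloseStream(WriterStream *s)
{
    DrainRequests(s);
}
```

In this sketch, a request queued after the last step and then followed by `CloseStream` leaves `Head` as `NULL` and `LiveRequests` at 0; without the drain in the close path, that one allocation would be reported as a direct leak, much like the 16-byte leak above.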
