UCT/IB/RDMACM: Disable rdmacm support when verbs is disabled #10460

Merged
merged 1 commit into from
Feb 3, 2025

tvegas1
Contributor

@tvegas1 tvegas1 commented Jan 29, 2025

What?

Do not advertise rdmacm in the build_module list when --without-verbs is specified.

Why?

gtest crashes because enum_md_resources() calls dlopen(rdmacm), which ends up loading the unrelated system-wide rdmacm library, which in turn loads the system-wide uct ib.

==== backtrace (tid:1050034) ====
 0 0x0000000000012ce0 __funlockfile()  :0
 1 0x000000000005b319 ucs_config_parser_parse_field()  src/ucs/config/parser.c:1333
 2 0x000000000005b537 ucs_config_parser_set_default_values()  src/ucs/config/parser.c:1384
 3 0x000000000005b50d ucs_config_parser_set_default_values()  src/ucs/config/parser.c:1378
 4 0x000000000005c560 ucs_config_parser_fill_opts()  src/ucs/config/parser.c:1831
 5 0x0000000000017c43 uct_config_read()  src/uct/base/uct_component.c:151
 6 0x0000000000014559 uct_md_config_read()  src/uct/base/uct_md.c:269
 7 0x0000000000756dbf uct_test::enum_resources()  test/gtest/uct/uct_test.cc:417
 8 0x00000000005d0fbe gtest_rc_verbsuct_atomic_key_reg_rdma_mem_type_EvalGenerator_()  test/gtest/uct/test_atomic_key_reg_rdma_mem_type.cc:43

@yosefe
Contributor

yosefe commented Jan 29, 2025

  1. what is the "unrelated system-wide library" in the PR description?
  2. can we add a CI test to check it?

@tvegas1
Contributor Author

tvegas1 commented Jan 29, 2025

  1. we pick up /usr/lib64/ucx/libuct_rdmacm.so.0, which depends on /usr/lib64/ucx/libuct_ib.so.0
  2. I doubt it would be easy to add, apart from making sure that ./configure-devel --without-verbs | grep module.*rdmacm remains empty?
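
One way to confirm which UCT modules the dynamic loader actually pulls in is to trace a run with glibc's LD_DEBUG=libs and filter for the system-wide paths named in this comment. This is a minimal sketch under assumptions: the helper name system_uct_modules and the gtest binary path in the usage comment are hypothetical and not part of this PR.

```shell
# Hypothetical helper: read a dynamic-loader trace on stdin and print the
# distinct UCT module paths that live under the system-wide /usr/lib64/ucx.
system_uct_modules() {
    # "|| :" keeps the pipeline successful when nothing matches.
    { grep -o '/usr/lib64/ucx/libuct_[A-Za-z0-9_]*\.so[.0-9]*' || :; } | sort -u
}

# Usage (assumed binary path; LD_DEBUG writes its trace to stderr):
#   LD_DEBUG=libs ./test/gtest/gtest 2>&1 | system_uct_modules
```

Any path printed by the helper would indicate that a module was resolved from the system installation rather than from the local build tree.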

@yosefe
Contributor

yosefe commented Jan 29, 2025

  1. thx for the clarification
  2. maybe we can check in one of the files produced by ./configure
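
The check discussed here could be sketched as a small shell function that scans a file produced by the configure run for the module.*rdmacm pattern quoted earlier. The function name assert_no_rdmacm_module and the file name configure.out are hypothetical, and whether a given configure output file lists modules in a form this pattern matches is an assumption; the test actually added in this PR may look different.

```shell
# Hypothetical check: succeed only if FILE exists and contains no line
# matching "module.*rdmacm".
assert_no_rdmacm_module() {
    file=$1
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi
    if grep -q 'module.*rdmacm' "$file"; then
        echo "rdmacm is still listed as a module in $file" >&2
        return 1
    fi
    return 0
}

# Usage (configure.out is an assumed file name):
#   ./configure-devel --without-verbs > configure.out 2>&1
#   assert_no_rdmacm_module configure.out
```

Treating an unreadable file as a failure avoids a false pass when configure did not run at all.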

@tvegas1 tvegas1 force-pushed the rdmacm_without_verbs branch from deffdc4 to b934e4f Compare January 29, 2025 18:04
@tvegas1
Contributor Author

tvegas1 commented Jan 29, 2025

added it

run_configure_tests() {
echo "==== Run configure tests ===="

rm -f config.log || :
Contributor

Why do we need to remove config.log? configure-release would rewrite it anyway.

Contributor Author

fixed

@tvegas1 tvegas1 force-pushed the rdmacm_without_verbs branch from b934e4f to e83d826 Compare January 30, 2025 07:46
@tvegas1 tvegas1 force-pushed the rdmacm_without_verbs branch from e83d826 to 4b4f388 Compare January 30, 2025 07:48
@yosefe yosefe enabled auto-merge January 31, 2025 07:47
@tvegas1
Contributor Author

tvegas1 commented Feb 3, 2025

/azp run UCX PR

Azure Pipelines successfully started running 1 pipeline(s).

@yosefe yosefe merged commit f37683e into openucx:master Feb 3, 2025
147 checks passed