
[Python] About Thread Pool #187

Closed
luojw-dwr opened this issue Sep 27, 2024 · 9 comments

Comments

@luojw-dwr

luojw-dwr commented Sep 27, 2024

Using the python interface, how can I release the threads allocated by mtkahypar.initializeThreadPool(...)?

@kittobi1992
Member

Hi @luojw-dwr,

This is a good question. We currently do not have any interface function to destruct the internal thread pool of TBB. The initializeThreadPool function constructs a tbb::global_control object that defines the maximum available parallelism. However, I don't think it is responsible for creating the threads; that is handled internally by the TBB task scheduler. The threads currently live as long as the Python module is loaded. However, idle threads do not consume any CPU time, and their memory footprint should also be fairly small. A modern OS should be able to handle a large number of inactive threads.
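
For illustration, a minimal C++ sketch (not the actual initializeThreadPool implementation) of how a tbb::global_control object caps the available parallelism while the worker threads themselves are created lazily by the scheduler:

#include <oneapi/tbb/global_control.h>
#include <oneapi/tbb/parallel_for.h>

int main() {
  // While this object is alive, TBB will not use more than 4 threads.
  // The global_control itself does not spawn any threads; workers are
  // created lazily by the task scheduler on first parallel use.
  oneapi::tbb::global_control gc(
      oneapi::tbb::global_control::max_allowed_parallelism, 4);

  oneapi::tbb::parallel_for(0, 1000, [](int) { /* parallel work */ });
  return 0;
}  // gc is destroyed here, but the worker threads keep living in the scheduler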

Is there any concern that you have if these threads are not released in your application?

Best,
Tobias

@luojw-dwr
Author

luojw-dwr commented Sep 27, 2024

Hi @kittobi1992

Thanks for your time and reply. I am using mtkahypar in a shared environment which sets a hard limit on the number of user threads. The problem arises when the Python program is combined with other libraries (such as python-mip) that maintain their own thread pools, pushing the overall thread count beyond the hard limit.

As a workaround, I may move the partitioning part of my program into a standalone process. But it would be great if the thread pool could be temporarily destroyed and later re-created with initializeThreadPool if needed.

Thank you again for your warm reply.

@kittobi1992
Member

Hi @luojw-dwr,

Just found out that it is possible to terminate the TBB worker threads:
https://oneapi-spec.uxlfoundation.org/specifications/oneapi/latest/elements/onetbb/source/task_scheduler/scheduling_controls/task_scheduler_handle_cls
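
For reference, a minimal sketch of that API (assuming oneTBB >= 2021.6; this is not the Mt-KaHyPar integration):

#include <new>           // std::nothrow
#include <oneapi/tbb.h>  // task_scheduler_handle, finalize, parallel_for

int main() {
  // Attach a handle to the task scheduler so its lifetime can be controlled.
  oneapi::tbb::task_scheduler_handle handle{oneapi::tbb::attach{}};

  oneapi::tbb::parallel_for(0, 1000, [](int) { /* parallel work */ });

  // Blocking finalization: waits until all worker threads have terminated.
  // The std::nothrow overload returns false instead of throwing if the
  // scheduler cannot be finalized (e.g. it is still in use elsewhere).
  bool ok = oneapi::tbb::finalize(handle, std::nothrow);
  return ok ? 0 : 1;
}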

I will try to integrate it by the end of the week.

Best,
Tobias

@kittobi1992
Member

kittobi1992 commented Oct 2, 2024

I opened a PR that allows terminating the TBB thread pool via a library function. However, our CI currently fails with this change, since it requires oneTBB 2021.6 while our Ubuntu machines are running 2021.5. I will keep you updated, but you can already use the branch for testing.

@luojw-dwr
Author

luojw-dwr commented Oct 5, 2024

edit: resolved.

Hi @kittobi1992
I tried this version with a debug build; import mtkahypar causes a segmentation fault immediately, with no debug info printed.

edit: This behavior is observed with the latest TBB. Trying the downloaded TBB.

edit: With both TBB versions, xxx/mt-kahypar/build/tests/interface/interface_test: symbol lookup error: xxx/build/tests/interface/interface_test: undefined symbol: mt_kahypar_terminate_thread_pool is reported during make.
edit: The deleted edit was caused by LIBRARY_PATH containing previous builds. However, with the interface_test problem resolved, the following problems still apply. Besides, in interface_test.cc, it seems that APartitioner::SetUp and APartitioner::TearDown are not called during the tests.

With the latest TBB (2021.13)

With the latest TBB (2021.13), import mtkahypar leads to a segmentation fault.

With the downloaded TBB (2021.7)

With the downloaded TBB (2021.7), the following are two minimal examples:

import mtkahypar
mtkahypar.initializeThreadPool(4)
mtkahypar.terminateThreadPool() # <- segmentation fault immediately

import mtkahypar
mtkahypar.initializeThreadPool(1)
mtkahypar.terminateThreadPool() # <- segmentation fault immediately

and one longer example:

import mtkahypar as mtkhp
mtkhp.initializeThreadPool(4) # any thread count
ctx = mtkhp.Context()
ctx.loadPreset(mtkhp.PresetType.DEFAULT)
ctx.setPartitioningParameters(2, 0.03, mtkhp.Objective.KM1)
HG = mtkhp.Hypergraph("a.hmetis", mtkhp.FileFormat.HMETIS) # any legal input file
mtkhp.terminateThreadPool() # <- segmentation fault immediately

with a.hmetis as:

1 2
1 2

Environment:

  1. OS: Ubuntu 24.04, x86_64
  2. mt-kahypar: git hash 39e06c7, with the PR thread_pool_termination merged locally
  3. python: 3.8.20
  4. compiler: clang++-18
  5. cmake: 3.28.3
  6. boost: 1.86
  7. TBB: 2021.7, via -DKAHYPAR_DOWNLOAD_TBB=On
  8. hwloc: 2.11.1

@luojw-dwr
Author

Hi @kittobi1992

Would you please share your compiler and dependency versions, especially the compiler and TBB? Thank you.

@luojw-dwr
Author

luojw-dwr commented Oct 7, 2024

Hi @kittobi1992
When using the Python interface, as long as the mtkahypar module is loaded, an instance of tbb::global_control stays alive, which outlives the call to tbb::finalize. Please check whether you can reproduce the segmentation fault mentioned above.
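
To make the suspected lifetime pattern concrete, a minimal sketch (the names are illustrative and do not come from the Mt-KaHyPar sources):

#include <memory>
#include <new>
#include <oneapi/tbb.h>

// Held by the loaded Python module; destroyed only when the module is unloaded.
static std::unique_ptr<oneapi::tbb::global_control> module_global_control;

void initialize_thread_pool(int num_threads) {
  module_global_control = std::make_unique<oneapi::tbb::global_control>(
      oneapi::tbb::global_control::max_allowed_parallelism, num_threads);
}

bool terminate_thread_pool() {
  // finalize() runs here while module_global_control is still alive, i.e.
  // the global_control outlives the finalization of the scheduler.
  oneapi::tbb::task_scheduler_handle handle{oneapi::tbb::attach{}};
  return oneapi::tbb::finalize(handle, std::nothrow);
}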

I am trying other approaches and hope to have good news to report.

edit: refer to the following comment.

@luojw-dwr
Author

luojw-dwr commented Oct 7, 2024

Hi @kittobi1992
I found the reason for the segmentation fault. In python/module.cpp, it should be:

bool terminate_thread_pool() {
  return mt_kahypar::TBBInitializer::instance().terminate();
}

rather than:

bool terminate_thread_pool() {
  mt_kahypar::TBBInitializer::instance().terminate();
}

(notice the missing return). Surprisingly, this still compiles: falling off the end of a non-void function is only a warning (-Wreturn-type) rather than an error by default, although the resulting behavior at runtime is undefined.

Good news: After the fix,

  1. Threads terminate correctly after mtkahypar.terminateThreadPool()
  2. Threads are automatically spawned again when a new mtkhp.Hypergraph is created.

Thank you for your time, effort and patience.

@kittobi1992
Member

Hi @luojw-dwr,
Sorry for my late response, and thanks for investigating the issue on your own. I currently do not have the Python build set up on my Mac and relied on the CI, which does not pass at the moment (see comments in the PR). I will try to make it work tomorrow and get the change merged.

Also good to hear that it is working for you.

Best,
Tobias
