Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Thread-pool will add threads until system resources are exhausted #26

Open
mdbergmann opened this issue Oct 7, 2024 · 9 comments
Assignees
Labels
enhance Clicking "enhance" should not violate the laws of physics.

Comments

@mdbergmann
Copy link

Seems like the thread-pool implementation is not capping the number of maximum threads.
So submitting work in a loop, large loop, to be processed concurrently will basically kill the application because no more threads can be created.

@adlai
Copy link
Collaborator

adlai commented Oct 7, 2024

Correct, that behavior was intended when ChanL was originally written; the logic was that the same will happen if you allocate memory without bounds, and exhaust the heap.

The only limit in the current thread pool is a "soft limit", and that sets the number of worker threads kept alive waiting for new tasks after existing ones have finished.

How do you think this should be addressed? and I'm curious to hear whether you encountered this in any production system, or simply by playing around...

@mdbergmann
Copy link
Author

mdbergmann commented Oct 7, 2024

Well, I'm the author of Sento and was looking at thread-pool implementations that can replace the 'made-up' dispatchers/thread-pools in Sento that are basically just actors each operating a separate thread, which effectively in Sento is used as something like a thread-pool.
The performance tests I usually run do something in a loop of ~1M repeats.
Insofar it's just playing around. However I work with production actor based systems (in other runtimes) that are massively asynchronous. Hundreds of thousands of concurrent IO inputs/outputs is nothing unusual.

It can simply be addressed by capping the maximum number of threads.
The question is what should happen when the thread-pool has no available threads and the queue is also full. The easiest strategy then is that the caller runs the operation, that kind of is a natural back-pressure mechanism. But there are other strategies possible.

@mdbergmann
Copy link
Author

One could also use something like: https://github.com/Frechmatz/cl-threadpool

@adlai
Copy link
Collaborator

adlai commented Oct 7, 2024

Thank you for the links and ideas.

You probably noticed that ChanL is mostly constructed so different thread pool implementations can be used; one good improvement could be adding to the pcall function (and through it, to the pexec macro) an ability to swap out the thread pool; meanwhile, there would be some CLOS API of how thread pools operate, along with a few default ones, and users with special constraints or extreme demands could implement custom pools.

The current CLOS footprint is two classes thread-pool and task, and one generic, assign-task. I'm pondering whether any change to this is needed for implementing the vision of the previous paragraph, and open to suggestions before I write any code.

By the way, the use case that I've been wanting to support is as follows: I run several liquidity providers, each with about a dozen tasks; I need the ability to nuke and restart individual liquidity providers, so currently I get this by separating them to different lisp images. If each had its own thread pool, I could run all within one lisp image and conserve resources. From this perspective, it is fortunate that I'm small-time, and only run a few of these, although I'd like to keep a low overhead for scaling so I can take on additional clients (most people who want to profit from liquidity providing don't have or want the technical ability to deal with some tangled old lisp code).

@adlai adlai added the enhance Clicking "enhance" should not violate the laws of physics. label Oct 7, 2024
@adlai adlai self-assigned this Oct 7, 2024
@adlai
Copy link
Collaborator

adlai commented Oct 9, 2024

I'm considering whether the test suite should deliberately fail on lighter systems, by triggering this; and if so, how should the human be warned before the expected crash, in case someone who does not read documentation simply fires up slime, quicklisp, and ,test-system.

@adlai
Copy link
Collaborator

adlai commented Oct 19, 2024

Enhancement begun!

Commit 180c290 prefixes soft- to chanl:thread-pool; eventually, the default thread pool will use configurable hard limits, and the soft pool will only be retained in case someone needs it for flexibility or compatibility.

@adlai adlai linked a pull request Nov 20, 2024 that will close this issue
Merged
@adlai
Copy link
Collaborator

adlai commented Nov 22, 2024

I've added a survivable forkbomb to the test suite; values above seven will make dual-core machines uncomfortable for several minutes, unless you have more patience than the typical netizen. This will enable testing the failure mode of a hard-limited thread pool exceeding the upper limit, before the operating system becomes unresponsive.

@adlai
Copy link
Collaborator

adlai commented Nov 22, 2024

@mdbergmann suggested:

One could also use something like: https://github.com/Frechmatz/cl-threadpool

Definitely not; one of the hard requirements of ChanL was that the only allowed dependencies are portability layers like Bordeaux-Threads or Trivial-Garbage. The original thread pool code was taken from Eager-Future, and the complexity of that library is already out of scope for dependencies.

I'm not against the inclusion of up to one utility library, however I am aware that consensus will never be achieved and it is not worth arguing about anyway.

@adlai adlai pinned this issue Nov 28, 2024
@adlai adlai closed this as completed in #33 Dec 18, 2024
@adlai
Copy link
Collaborator

adlai commented Dec 18, 2024

Incorrectly closed by excessive GitHub automation.

@adlai adlai reopened this Dec 18, 2024
@adlai adlai removed a link to a pull request Dec 18, 2024
Merged
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhance Clicking "enhance" should not violate the laws of physics.
Projects
None yet
Development

No branches or pull requests

2 participants