Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always use Configuration#Concurrency, not std::thread::hardware_concurrency() #9643

Merged
merged 6 commits into from
May 23, 2023

Conversation

Al2Klimov
Copy link
Member

@Al2Klimov Al2Klimov commented Jan 27, 2023

refs #7842
fixes #9147

@cla-bot cla-bot bot added the cla/signed label Jan 27, 2023
@Al2Klimov Al2Klimov requested a review from julianbrost January 27, 2023 12:19
@Al2Klimov
Copy link
Member Author

On its own from #7845. You said this is an improvement by itself.

@Al2Klimov Al2Klimov added the core/quality Improve code, libraries, algorithms, inline docs label Jan 27, 2023
lib/base/threadpool.hpp Outdated Show resolved Hide resolved
@Al2Klimov Al2Klimov self-assigned this Jan 27, 2023
@Al2Klimov Al2Klimov force-pushed the hardware_concurrency branch from f2754a4 to 0e415b0 Compare January 27, 2023 16:16
@Al2Klimov Al2Klimov removed their assignment Jan 27, 2023
@Al2Klimov Al2Klimov requested a review from julianbrost January 27, 2023 16:52
@lippserd lippserd added this to the 2.14.0 milestone Mar 30, 2023
@lippserd lippserd changed the title Always use Configuration#Concurrency, not std::thread::hardware_concurrency() Always use Configuration#Concurrency, not std::thread::hardware_concurrency() Mar 30, 2023
@julianbrost
Copy link
Contributor

Making Post() implicitly start the threads compared to an explicit Start() looks fragile. You wouldn't really notice a call to Post() that happens too early (before Configuration::Concurrency was set) which would mean the pool could potentially still be initialized with the incorrect size.

Or worse, a Stop(); Post(); fork(); sequence would accidentally start the threads again. Given that you could change the return type of Post() from bool to void, it looks like no caller is checking the return value, which should correspond to expecting this call to happen in between Stop(); and Start();.

Anyways, if this PR is correct, it would also be correct to delay the initialization of the thread pool until after Configuration::Concurrency was set. And this sounds more robust to me (maybe even make Post() throw an exception on uninitialized pools, at least in debug builds) as you can't get a surprise pool initialization with that approach. How thoroughly did you go to Post() calls to make sure none of the existing ones may happen too early?

@Al2Klimov
Copy link
Member Author

What about the following?

  1. Initial concurrency: 0 which is effectively 1
  2. This way Application::InitializeBase() starts a minimal TP not running into Icinga 2 hangs w/o error message when running into cgroup limits #9099
  3. Once Configuration#Concurrency may have been set (before config commit?) if it's still 0:
    • set a sane default (std::thread::hardware_concurrency() for now)
    • restart TP

@julianbrost
Copy link
Contributor

With that suggestion, would you have an explicit Start() method? And this somewhat sounds like you would start a temporary thread pool that shouldn't be used, just to tear it down again later.

@Al2Klimov
Copy link
Member Author

  1. Yes, an explicit Start() method.
  2. One can use it. But yes, it will be reset on concurrency change.

@Al2Klimov Al2Klimov removed the request for review from julianbrost May 4, 2023 08:14
@Al2Klimov Al2Klimov self-assigned this May 4, 2023
@Al2Klimov
Copy link
Member Author

Re: #9643 (comment)

Just saw that the I/O engine already has the same "problem". So this is fine.

@Al2Klimov Al2Klimov removed their assignment May 4, 2023
@Al2Klimov Al2Klimov requested a review from julianbrost May 4, 2023 08:53
@julianbrost
Copy link
Contributor

Just saw that the I/O engine already has the same "problem".

You mean it starts the threads automatically on demand?

@Al2Klimov
Copy link
Member Author

Exactly, that's why it's lazy-inited.

Copy link
Contributor

@julianbrost julianbrost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, I just noticed that you haven't changed anything at all here.

Just saw that the I/O engine already has the same "problem".

If component A has an undesired property and component B doesn't, obviously the way forward is to also introduce that undesired property in component B. Well, actually, it isn't.

lib/base/threadpool.hpp Show resolved Hide resolved
lib/base/threadpool.hpp Outdated Show resolved Hide resolved
@Al2Klimov
Copy link
Member Author

you would start a temporary thread pool that shouldn't be used, just to tear it down again later.

Not exactly. ThreadPool#Post() is called by Utility::QueueAsyncCallback() which is called by Timer::TimerThreadProc() due to including but not limited to WorkQueue#WorkQueue() used by ConfigItem::RunWithActivationContext() which is called pre-commit. Therefore I'll do #9643 (comment) .

@Al2Klimov Al2Klimov force-pushed the hardware_concurrency branch from 0e415b0 to 6e2b687 Compare May 11, 2023 15:07
@Al2Klimov
Copy link
Member Author

$ cat hardware_concurrency.conf
log(Concurrency)
$ prefix/sbin/icinga2 daemon -c hardware_concurrency.conf
[2023-05-11 17:06:24 +0200] information/cli: Icinga application loader (version: v2.11.0-527-g6e2b68789; debug)
[2023-05-11 17:06:24 +0200] information/cli: Loading configuration file(s).
[2023-05-11 17:06:24 +0200] information/config: 8
[2023-05-11 17:06:24 +0200] information/ConfigItem: Committing config item(s).
[2023-05-11 17:06:24 +0200] information/ConfigItem: Instantiated 1 IcingaApplication.
[2023-05-11 17:06:24 +0200] information/ScriptGlobal: Dumping variables to file '/Users/aklimov/NET/WS/icinga2/prefix/var/cache/icinga2/icinga2.vars'
[2023-05-11 17:06:24 +0200] information/ConfigItem: Triggering Start signal for config items
[2023-05-11 17:06:24 +0200] information/ConfigItem: Activated all objects.
^C[2023-05-11 17:06:29 +0200] information/Application: Received request to shut down.
[2023-05-11 17:06:29 +0200] information/Application: Shutting down...
[2023-05-11 17:06:29 +0200] information/ConfigObject: Dumping program state to file '/Users/aklimov/NET/WS/icinga2/prefix/var/lib/icinga2/icinga2.state'
[2023-05-11 17:06:29 +0200] information/IcingaApplication: Icinga has shut down.
$ prefix/sbin/icinga2 daemon -DConfiguration.Concurrency=9 -c hardware_concurrency.conf
[2023-05-11 17:06:53 +0200] information/cli: Icinga application loader (version: v2.11.0-527-g6e2b68789; debug)
[2023-05-11 17:06:53 +0200] information/cli: Loading configuration file(s).
[2023-05-11 17:06:53 +0200] information/config: 9
[2023-05-11 17:06:53 +0200] information/ConfigItem: Committing config item(s).
[2023-05-11 17:06:53 +0200] information/ConfigItem: Instantiated 1 IcingaApplication.
[2023-05-11 17:06:53 +0200] information/ScriptGlobal: Dumping variables to file '/Users/aklimov/NET/WS/icinga2/prefix/var/cache/icinga2/icinga2.vars'
[2023-05-11 17:06:53 +0200] information/ConfigObject: Restoring program state from file '/Users/aklimov/NET/WS/icinga2/prefix/var/lib/icinga2/icinga2.state'
[2023-05-11 17:06:53 +0200] information/ConfigObject: Restored 1 objects. Loaded 0 new objects without state.
[2023-05-11 17:06:53 +0200] information/ConfigItem: Triggering Start signal for config items
[2023-05-11 17:06:53 +0200] information/ConfigItem: Activated all objects.
^C[2023-05-11 17:06:57 +0200] information/Application: Received request to shut down.
[2023-05-11 17:06:57 +0200] information/Application: Shutting down...
[2023-05-11 17:06:57 +0200] information/ConfigObject: Dumping program state to file '/Users/aklimov/NET/WS/icinga2/prefix/var/lib/icinga2/icinga2.state'
[2023-05-11 17:06:57 +0200] information/IcingaApplication: Icinga has shut down.
$

@Al2Klimov
Copy link
Member Author

Also Activity Monitor confirms different thread amounts for -DConfiguration.Concurrency=1, nothing and -DConfiguration.Concurrency=42.

lib/base/threadpool.cpp Outdated Show resolved Hide resolved
lib/base/threadpool.cpp Outdated Show resolved Hide resolved
@Al2Klimov Al2Klimov force-pushed the hardware_concurrency branch from 6e2b687 to 6824ad8 Compare May 22, 2023 15:36
Copy link
Contributor

@julianbrost julianbrost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested it and should be fine apart from that small spelling detail below that doesn't have an effect on the functionality.

lib/base/threadpool.hpp Outdated Show resolved Hide resolved
The user (-D) or we could have changed Configuration.Concurrency,
so correct the thread pool's thread amount.
@Al2Klimov Al2Klimov force-pushed the hardware_concurrency branch from 6824ad8 to 3fae41e Compare May 23, 2023 12:41
@julianbrost julianbrost enabled auto-merge May 23, 2023 13:37
@julianbrost julianbrost merged commit 2470e93 into master May 23, 2023
@icinga-probot icinga-probot bot deleted the hardware_concurrency branch May 23, 2023 17:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla/signed core/quality Improve code, libraries, algorithms, inline docs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Icinga's initial threads are linked to core count in host with no way to adjust or limit
3 participants