
Performance problem caused by tuned service (plugin_cpu.py) #588

Closed
sundyyuan opened this issue Jan 16, 2024 · 10 comments · Fixed by #603

Comments

@sundyyuan

Hi,
I ran into a problem during a LAPACK workload (single-core test) on Ubuntu 20.04.6 LTS (Focal Fossa) with kernel 5.4.0-166. With the tuned service enabled and the intel_idle driver, non-busy cores switch from C6 to C1E; with the tuned service disabled, non-busy cores always stay in C6.
When the switch from C6 to C1E happens, "tuned.plugins.plugin_cpu: setting new cpu latency 100" is printed to /var/log/tuned/tuned.log. The related code (https://github.com/redhat-performance/tuned/blob/master/tuned/plugins/plugin_cpu.py) sets latency_low=100. The C6 exit latency in the intel_idle driver is 170, which is greater than 100, so non-busy cores cannot stay in the C6 state. Is that the intended behavior of tuned.plugins? We expect all non-busy cores to be in C6 to get the best performance.

Thanks
Sundy
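For illustration (an editorial sketch, not TuneD or kernel code): the kernel's cpuidle governor will not select an idle state whose exit latency exceeds the current PM QoS latency constraint, which is why a 100 us cap excludes C6 with its 170 us exit latency. The state names and latencies below are assumed example values, not read from a live system.

```python
def allowed_states(states, pm_qos_latency_us):
    """Return the C-states whose exit latency fits under the PM QoS cap.

    The cpuidle governor skips any state whose exit latency exceeds
    the current PM QoS latency constraint.
    """
    return [name for name, latency in states if latency <= pm_qos_latency_us]

# Assumed exit latencies (us) for an intel_idle-driven machine;
# the C6 value of 170 matches the one quoted in this issue.
states = [("POLL", 0), ("C1", 2), ("C1E", 10), ("C6", 170)]

# TuneD's dynamic tuning sets latency_low = 100 on a non-busy system,
# which excludes C6 (170 > 100) -- the behavior reported here.
print(allowed_states(states, 100))   # -> ['POLL', 'C1', 'C1E']  (C6 filtered out)
print(allowed_states(states, 2000))  # -> all four states allowed
```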

@sundyyuan
Author

intel_idle driver

@yarda
Contributor

yarda commented Jan 17, 2024

Which TuneD profile? There are some low-latency profiles which intentionally prevent the CPU from entering deeper C-states (the force_latency and pm_qos_resume_latency_us cpu plugin options).
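For reference, such a low-latency profile would set these cpu plugin options in its tuned.conf. A minimal sketch; the values are illustrative, not copied from any shipped profile:

```ini
# Illustrative [cpu] section of a custom tuned.conf profile.
# force_latency caps the global PM QoS latency (/dev/cpu_dma_latency);
# pm_qos_resume_latency_us caps the per-CPU resume latency via sysfs.
[cpu]
force_latency=10
pm_qos_resume_latency_us=100
```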

@sundyyuan
Author

We set the profile to throughput-performance, and I have tried changing the profile to powersave and balanced with the same behaviour. And the issue of non-busy cores not entering C6 goes away if the tuned service is disabled.

@yarda
Contributor

yarda commented Jan 20, 2024

throughput-performance shouldn't touch it. Could you provide debug output?

# systemctl stop tuned
# tuned -D

@sundyyuan
Author

[tuned-D.txt](url)

@sundyyuan
Author

The tuned -D output has been uploaded.

@yarda
Contributor

yarda commented Jan 22, 2024

Please disable dynamic tuning in /etc/tuned/tuned-main.conf:

dynamic_tuning = 0

This is the default for RHEL/CentOS; we will probably reconsider the defaults for Fedora/upstream. With dynamic_tuning = 1, TuneD controls the PM QoS latency according to the machine load, which on modern platforms may not be the best thing to do.
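To see which PM QoS latency cap is currently in effect while tuned runs, one can read the global constraint from /dev/cpu_dma_latency, which yields a 4-byte little-endian integer in microseconds. A minimal sketch; the decoding helper below is hypothetical, not part of TuneD:

```python
import struct

def parse_cpu_dma_latency(raw):
    """Decode the 4-byte little-endian value (microseconds) read from
    /dev/cpu_dma_latency."""
    return struct.unpack("<i", raw)[0]

# On a live Linux system (run as root), the current global PM QoS
# latency constraint could be inspected like this:
#   with open("/dev/cpu_dma_latency", "rb") as f:
#       print(parse_cpu_dma_latency(f.read(4)))
#
# Demonstration with a synthetic buffer instead of the device node:
print(parse_cpu_dma_latency(struct.pack("<i", 100)))  # -> 100
```

A value of 100 here would match the "setting new cpu latency 100" line from tuned.log; with dynamic_tuning = 0 no such cap is applied by TuneD.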

@sundyyuan
Author

Non-busy cores now always stay in the C6 state after setting dynamic_tuning = 0 in /etc/tuned/tuned-main.conf and restarting the tuned service. So dynamic_tuning = 0 will become the default setting in the newest tuned, right?

@yarda
Contributor

yarda commented Jan 25, 2024

> Non-busy cores now always stay in the C6 state after setting dynamic_tuning = 0 in /etc/tuned/tuned-main.conf and restarting the tuned service. So dynamic_tuning = 0 will become the default setting in the newest tuned, right?

We will probably unify the settings with RHEL/CentOS, i.e. we will switch to dynamic_tuning = 0 in upstream, but there is no ETA for this change. In Fedora, if you change it yourself, the package manager will not overwrite your settings during updates.

@sundyyuan
Author

Thanks for your comment. We will manually set dynamic_tuning = 0 on Ubuntu for now.

yarda added a commit to yarda/tuned that referenced this issue Feb 13, 2024
Dynamic tuning is a PoC implementation and it can cause many problems,
especially with some network drivers, where it can interrupt network
connections, and also with modern CPUs, where it can worsen power
consumption by preventing CPUs from entering deeper C-states. Now that
TuneD is going to replace power-profiles-daemon, these problems can
accumulate and cause a bad user experience.

RHEL has disabled dynamic tuning downstream for a long time, so follow
suit and disable it by default upstream as well. People who know what
they are doing can still enable it.

Fixes redhat-performance#588

Signed-off-by: Jaroslav Škarvada <jskarvad@redhat.com>