🐛 0.10.x introduced `sysinfo` crate, resulting in very slow speed under Linux #839
I've just tested and verified the same problem on another Linux computer, 6-core CPU: a simple delta diff costs about 100ms (with 6 logical cores), so each `get_cpu_frequency()` call costs about 15ms per core. if your CPU has fewer than 10 cores, this problem is not easy to find, because the delay gap is hard to feel. basically, I think this affects all Linux users.
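The arithmetic above (one frequency read per logical core, ~15ms each) can be illustrated with a self-contained sketch. This is not the real `sysinfo` code: `fake_get_cpu_frequency` is a hypothetical stand-in, and the 15ms cost is the figure observed above, simulated with a sleep.

```rust
use std::time::{Duration, Instant};

// Hypothetical stand-in for the per-core frequency read; the ~15 ms cost
// is the figure observed in this issue, simulated here with a sleep.
fn fake_get_cpu_frequency() {
    std::thread::sleep(Duration::from_millis(15));
}

fn main() {
    let logical_cores = 6; // the 6-core test machine above
    let start = Instant::now();
    for _ in 0..logical_cores {
        fake_get_cpu_frequency(); // one read per logical core
    }
    // Total grows linearly with core count: ~90 ms here; the 16-core
    // machine above measured ~265 ms per delta invocation.
    println!("total: {:?}", start.elapsed());
}
```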
a simple demo to reproduce the issue

I also made a simple demo to show this: https://github.com/ttys3/get-cpu-frequency-slow/releases/tag/v0.1.0

if you need to build it yourself:

```shell
git clone https://github.com/ttys3/get-cpu-frequency-slow
make build
make run
```

demo result: most of the time it will look like this (for a 16-logical-core CPU):

```shell
❯ make run
./target/release/get-cpu-frequency-slow
total time elapsed in get_cpu_frequency() is: 264.490103ms
❯ make run
./target/release/get-cpu-frequency-slow
total time elapsed in get_cpu_frequency() is: 264.542344ms
❯ make run
./target/release/get-cpu-frequency-slow
total time elapsed in get_cpu_frequency() is: 265.281557ms
❯ make run
./target/release/get-cpu-frequency-slow
total time elapsed in get_cpu_frequency() is: 264.696121ms
```

this result is very similar to my previous test results. there are also rare runs with a very good result, but I think those can just be ignored.
using a benchmark tool like hyperfine will NOT show the problem. the reason is that it reports results like this (6-core machine):

```shell
❯ hyperfine -r 1000 'delta Cargo.toml Cargo.toml.bak'
Benchmark 1: delta Cargo.toml Cargo.toml.bak
  Time (mean ± σ):      55.3 ms ±  8.7 ms    [User: 47.1 ms, System: 5.8 ms]
  Range (min … max):    50.7 ms … 149.7 ms    1000 runs
```

but in the real world, users do not run delta back-to-back like a benchmark does. real usage is more like this simple loop:

```shell
# this test runs on a 6-core CPU machine
❯ for (( c=1; c<=10; c++ ))
do
sleep 1
time delta Cargo.toml Cargo.toml.bak > /dev/null
done
delta Cargo.toml Cargo.toml.bak > /dev/null  0.05s user 0.00s system 62% cpu 0.091 total
delta Cargo.toml Cargo.toml.bak > /dev/null  0.05s user 0.00s system 45% cpu 0.122 total
delta Cargo.toml Cargo.toml.bak > /dev/null  0.05s user 0.00s system 41% cpu 0.136 total
delta Cargo.toml Cargo.toml.bak > /dev/null  0.04s user 0.01s system 52% cpu 0.110 total
delta Cargo.toml Cargo.toml.bak > /dev/null  0.05s user 0.01s system 39% cpu 0.142 total
delta Cargo.toml Cargo.toml.bak > /dev/null  0.05s user 0.01s system 45% cpu 0.122 total
delta Cargo.toml Cargo.toml.bak > /dev/null  0.05s user 0.00s system 39% cpu 0.143 total
delta Cargo.toml Cargo.toml.bak > /dev/null  0.05s user 0.01s system 46% cpu 0.121 total
delta Cargo.toml Cargo.toml.bak > /dev/null  0.05s user 0.01s system 101% cpu 0.055 total
delta Cargo.toml Cargo.toml.bak > /dev/null  0.05s user 0.00s system 40% cpu 0.141 total
```

on this 6-core CPU, most of the executions cost >= 0.1s. in a benchmark situation (without the `sleep 1`), it is more like this:

```shell
# this test runs on a 6-core CPU machine
❯ for (( c=1; c<=10; c++ ))
do
time delta Cargo.toml Cargo.toml.bak > /dev/null
done
delta Cargo.toml Cargo.toml.bak > /dev/null  0.05s user 0.01s system 100% cpu 0.055 total
delta Cargo.toml Cargo.toml.bak > /dev/null  0.05s user 0.01s system 101% cpu 0.054 total
delta Cargo.toml Cargo.toml.bak > /dev/null  0.05s user 0.00s system 101% cpu 0.055 total
delta Cargo.toml Cargo.toml.bak > /dev/null  0.05s user 0.01s system 100% cpu 0.056 total
delta Cargo.toml Cargo.toml.bak > /dev/null  0.05s user 0.01s system 100% cpu 0.055 total
delta Cargo.toml Cargo.toml.bak > /dev/null  0.04s user 0.02s system 101% cpu 0.055 total
delta Cargo.toml Cargo.toml.bak > /dev/null  0.05s user 0.01s system 90% cpu 0.063 total
delta Cargo.toml Cargo.toml.bak > /dev/null  0.06s user 0.00s system 85% cpu 0.068 total
delta Cargo.toml Cargo.toml.bak > /dev/null  0.05s user 0.00s system 99% cpu 0.058 total
delta Cargo.toml Cargo.toml.bak > /dev/null  0.05s user 0.01s system 100% cpu 0.055 total
```
@ttys3 and @th1000s do you understand why `new_with_specifics` is written this way?

```rust
fn new_with_specifics(refreshes: RefreshKind) -> System {
    // ...
    if !refreshes.cpu() {
        s.refresh_processors(None); // We need the processors to be filled.
    }
    s.refresh_specifics(refreshes);
    // ...
}
```
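Whatever the intent, the net effect of that control flow is that the expensive processor refresh runs exactly once no matter which `RefreshKind` is passed. A self-contained model makes that visible; the type and method names mirror sysinfo's, but this is a sketch of the quoted logic, not the real crate:

```rust
// Self-contained model of the control flow quoted above. The names mirror
// sysinfo's API, but this is a sketch, not the real crate.

#[derive(Default)]
struct RefreshKind {
    cpu: bool,
}

impl RefreshKind {
    fn new() -> Self {
        Self::default()
    }
    fn with_cpu(mut self) -> Self {
        self.cpu = true;
        self
    }
    fn cpu(&self) -> bool {
        self.cpu
    }
}

#[derive(Default)]
struct System {
    processor_refreshes: u32, // counts the expensive per-core reads
}

impl System {
    fn refresh_processors(&mut self) {
        // In the real crate this is the slow per-core frequency read.
        self.processor_refreshes += 1;
    }

    fn refresh_specifics(&mut self, refreshes: &RefreshKind) {
        if refreshes.cpu() {
            self.refresh_processors();
        }
    }

    fn new_with_specifics(refreshes: RefreshKind) -> System {
        let mut s = System::default();
        if !refreshes.cpu() {
            s.refresh_processors(); // runs when CPU refresh was NOT requested...
        }
        s.refresh_specifics(&refreshes); // ...and runs here when it WAS
        s
    }
}

fn main() {
    let without_cpu = System::new_with_specifics(RefreshKind::new());
    let with_cpu = System::new_with_specifics(RefreshKind::new().with_cpu());
    // Both construction paths pay for the processor refresh exactly once.
    println!("{} {}", without_cpu.processor_refreshes, with_cpu.processor_refreshes);
}
```

In this model, passing `with_cpu()` only moves the call from the `if` branch into `refresh_specifics`; neither choice of `RefreshKind` avoids it.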
Would one possibility be to use one of the competing Rust crates, for example https://docs.rs/heim/0.0.11/heim/process/struct.Process.html? (Or is there no attractive / performant way to call those functions from sync code?)
I've opened GuillaumeGomez/sysinfo#632 to ask whether there's any scope for avoiding the CPU data querying.
* Do not query CPU data when querying process data. Fixes #839. Ref GuillaumeGomez/sysinfo#632
* Update branch of sysinfo
* Update upstream sysinfo commit. Ref GuillaumeGomez/sysinfo#636
* Point sysinfo at an explicit commit rather than a symbolic branch name:

```
commit d647acfbf216848a8237e1f9251b2c48860a547f
Merge: 989ac6c 67a586c
Author: Guillaume Gomez <guillaume1.gomez@gmail.com>
Date:   2 hours ago

    Merge pull request #636 from GuillaumeGomez/update-if-needed

    Only update processors if needed
```
brief description

0.10.x introduced the `sysinfo` crate, resulting in very slow speed under Linux. a simple delta diff costs about 300ms (with a 16-logical-core cpu).

Append: just tested on another Linux computer, 6-core CPU. a simple delta diff costs about 100ms (with 6 logical cores), so each `get_cpu_frequency()` call costs about 15ms per core. if your CPU has fewer than 10 cores, this problem is not easy to find, because the delay gap is hard to feel.

for any >= 0.10.x version, the more (logical) CPU cores you have, the slower it gets.

this issue is also mentioned in #841 (comment)

I also tried delta-0.9.2-x86_64-unknown-linux-gnu (https://github.com/dandavison/delta/releases/download/0.9.2/delta-0.9.2-x86_64-unknown-linux-gnu.tar.gz), which works without the slowdown problem on my machine.
the problem seems to be the `get_cpu_frequency()` call in the `sysinfo` crate.

my patched version resolves the slowdown problem: https://github.com/ttys3/delta/releases/tag/0.11.2

related commits:
ttys3@8f5a835
ttys3/sysinfo@53cf3dd
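For anyone who wants to try such a patched build locally, Cargo's `[patch]` table can swap the crates.io `sysinfo` for a fork. This is a sketch: the fork URL and rev come from the commits linked above, and the short rev would normally be written out as the full hash.

```toml
# In delta's Cargo.toml: replace the crates.io sysinfo with the patched fork.
# Fork and commit are the ones linked above; pinning an explicit rev keeps
# the build reproducible.
[patch.crates-io]
sysinfo = { git = "https://github.com/ttys3/sysinfo", rev = "53cf3dd" }
```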
since this needs patching the `sysinfo` crate, it is hard to make a normal PR -_-

my investigation
here's my env:
my git config related to delta:
1. test with git diff

with delta 0.11.1, `time git diff Cargo.toml` costs about 0.5s each time (repeated 3 times):

with delta 0.11.2, `time git diff Cargo.toml` costs about 0.3s each time (repeated 3 times):

2. without git, run delta directly

`time delta Cargo.toml.old Cargo.toml`

delta 0.11.1:

delta 0.11.2:

the results stay the same, so it is not related to git.

the normal speed should be near 0.001s, just like this (for example `time diff Cargo.toml.old Cargo.toml`):

I have found the root cause.
it is `delta::utils::process::determine_calling_process`.

expand the tower top:

so the reason is clear: oh NO NO NO NO, it tries to detect the cpu frequency !!!!

just disable the call, and the speed is up:
the reason is: delta calls `sysinfo::System::new()` in src/utils/process.rs:

delta/src/utils/process.rs, Line 230 in a55f021

and `sysinfo::System::new()` is: `new_with_specifics`, which will always try to `refresh_processors`:

by default, `!refreshes.cpu()` results in `!false`, so it is `true`:

there seems to be no way to forbid the `refresh_processors` call.

if we change the `new` to `sysinfo::System::new_with_specifics(sysinfo::RefreshKind::new().with_cpu())`, the first `s.refresh_processors(None);` will not get called. but in `s.refresh_specifics(refreshes);`, it gets called again.