[question] Use cycles instead of ref-cycles #1470
Comments
/area koordlet |
@Rouzip Sorry for the late answer. When the CPI collector was implemented in koordlet, the reference was CPI2: CPU performance isolation for shared compute clusters. Chapter 3.1 of that paper clarifies that CPI data is derived from hardware counters. Besides, personally speaking, CPU_CLK_UNHALTED.REF is not affected by thread frequency changes (the CPU's dynamic frequency scaling mechanism). So I prefer this one, but I also want to hear more ideas from you. |
But the number of instructions retired in constant time does not reflect the performance of the application. Under the influence of technologies such as Intel SST, the existing algorithm will calculate different CPI values at different frequencies on the same machine, yet this difference does not reflect a corresponding change in the performance of the application. |
Can you explain the difference between these two counters in detail? What is the calculation logic of these two indicators inside the CPU when the frequency changes? |
As mentioned by @songtao98 , ref-cycles do not vary based on CPU frequency and can be considered a constant value within a certain period of time. On the other hand, cycles do vary with CPU frequency. Assuming a change in CPU frequency while the program itself remains unchanged, this variation in CPU frequency within the original calculation formula would cause CPI to change. Consequently, it would fail to accurately reflect the scenario where the program itself has not changed. If I have made any mistakes, please help me identify them. Thank you. |
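The frequency argument above can be sketched with a toy model. This is not koordlet's collector; it assumes a purely compute-bound thread that stays busy for the whole sampling window, that `cycles` tick at the actual core frequency, and that `ref-cycles` tick at a fixed reference frequency while the thread runs. The names `REF_HZ`, `CORE_CPI`, and `sample_window`, as well as all numbers, are hypothetical.

```python
REF_HZ = 2.0e9   # assumed fixed reference frequency (hypothetical value)
CORE_CPI = 0.8   # assumed core cycles per instruction of the program (hypothetical value)


def sample_window(actual_hz: float, busy_seconds: float) -> dict:
    """Model the three counters for a thread that is busy for busy_seconds."""
    cycles = actual_hz * busy_seconds    # ticks at the actual core frequency
    ref_cycles = REF_HZ * busy_seconds   # ticks at the fixed reference frequency
    instructions = cycles / CORE_CPI     # compute-bound: fixed core cycles per instruction
    return {"cycles": cycles, "ref_cycles": ref_cycles, "instructions": instructions}


def cpi(counter: float, instructions: float) -> float:
    """CPI is simply the chosen cycle counter divided by retired instructions."""
    return counter / instructions


# Same program, same 10 s window, two different core frequencies.
for hz in (2.0e9, 3.0e9):
    s = sample_window(hz, busy_seconds=10.0)
    print(
        f"{hz / 1e9:.1f} GHz: "
        f"CPI(cycles)={cpi(s['cycles'], s['instructions']):.3f} "
        f"CPI(ref-cycles)={cpi(s['ref_cycles'], s['instructions']):.3f}"
    )
```

Under these assumptions the cycles-based CPI stays at `CORE_CPI`, while the ref-cycles-based CPI scales with `REF_HZ / actual_hz` even though the program has not changed. Workloads with memory stalls will not follow this simple model exactly.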
@Rouzip Actually, there are two types of workloads. Let's call them Web Service and Batch Job. They have different load characteristics within a fixed time period. Web Service has a stable QPS, which means an (almost) fixed number of instructions to execute in 10s. If we just limit the scope to CPU frequency. So, maybe |
Besides, can we use the ref-cycles counter for calculating the normalized CPU utilization? |
Sorry for the late response. After conducting some experiments, I have found that both |
I think |
This is a very interesting question worth discussing. Assume a scenario where Pod A executes instructions that consume memory bandwidth (for example, occupying 50% of the memory access bandwidth), which reduces the memory access efficiency of Pod B and causes the latency of Pod B to degrade by 10%. If the CPU where Pod B runs raises its operating frequency by 15% due to the turbo mechanism, then, considering the combined impact of frequency and memory access, the performance of Pod B remains the same. Now observe Pod B in this scenario. Cycles has nothing to do with frequency, so we will see a significant increase (due to the impact of Pod A). Whether ref-cycles increases or decreases is not clear, because it is affected by the combined impact of memory access efficiency and the frequency increase. |
/reopen |
@hormes: Reopened this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
From the experimental point of view, ref-cycles are not affected by frequency, and cycles are affected by frequency. The larger the turbo frequency, the larger the cycles, which is anti-correlated with the semantics of the CPI response. It seems that ref-cycles is more suitable here. @Rouzip |
/reopen |
@hormes: Reopened this issue. In response to this:
Good job! However, the CPI collected by koordinator may not be accurate at present, and perf needs to be used to collect it.
In this paper, CPI is used as a symptom, and the authors reach this conclusion by statistical methods. So if we want to carefully analyze the underlying causes reflected by CPI changes (whether based on cycles or ref-cycles), more research is needed. |
Sorry for the wrong answer, we should use |
Thanks for the great job! @hormes
So the conclusion is that your PR #1489 can fix this problem by solving PMU multiplexing.
And #1482 should be abandoned because it uses Cycles?
As for this, we will do more work on better analysis. |
I made a mistake here. In fact, the relationship between the two is accurately described by this equation: Cycles are positively correlated with frequency, and ultimately CPI is positively correlated with frequency, not anti-correlated. Assuming the frequency remains unchanged, the direct effect of interference on a program is that it slows down, that is, it runs for a longer time, and both cycles and ref-cycles can express this result. Therefore, the main concern in this discussion is the case where the frequency changes. For an online process with the same QPS, when a machine runs at a 2.0 GHz clock frequency its cycles are X, and on the same CPU model at 3.0 GHz the cycles are still X. That is, if CPI is calculated from cycles, the CPI in the two cases is the same, but obviously the latency seen by the service on the two nodes is different. |
I will use another PR to fix the perf group problem; #1489 is not enough. |
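The 2.0 GHz versus 3.0 GHz case can be written out as a small sketch. It is an illustration only, assuming each request needs a fixed number of instructions and a fixed number of core cycles X (no memory stalls), and that ref-cycles advance at a constant reference rate while the request runs. `REF_HZ`, `RequestSample`, `serve_request`, and the numbers are hypothetical and are not taken from koordlet.

```python
from dataclasses import dataclass

REF_HZ = 2.0e9  # assumed constant reference frequency (hypothetical value)


@dataclass
class RequestSample:
    latency_s: float       # wall-clock time spent executing the request
    cpi_cycles: float      # CPI computed from core cycles
    cpi_ref_cycles: float  # CPI computed from reference cycles


def serve_request(core_hz: float, instructions: float, core_cycles: float) -> RequestSample:
    """Model one request that needs core_cycles cycles at any core frequency."""
    latency = core_cycles / core_hz   # the same X cycles finish sooner at a higher clock
    ref_cycles = latency * REF_HZ     # reference cycles track elapsed busy time
    return RequestSample(latency, core_cycles / instructions, ref_cycles / instructions)


# Same request (X = 1.2e6 cycles, 1e6 instructions) on a 2.0 GHz and a 3.0 GHz node.
slow = serve_request(2.0e9, instructions=1.0e6, core_cycles=1.2e6)
fast = serve_request(3.0e9, instructions=1.0e6, core_cycles=1.2e6)
print(slow)
print(fast)
```

In this model `cpi_cycles` is identical on both nodes although the latency differs by a factor of 1.5, whereas `cpi_ref_cycles` changes by the same factor as the latency.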
What happened:
Use ref-cycles as CPI factor.
What you expected to happen:
Use cycles as CPI factor.
Environment:
Anything else we need to know:
CPI is typically measured in cycles rather than ref-cycles as a performance evaluation metric.
References: