Stackprof: weight on-cpu samples by period rather than timestamp delta #425
Hi @manuelfelipe! This seems reasonable to me! Can you please include tests which would catch changes to this behaviour were it to break. I think this would mean including a new profile with the |
Alright, added a sample profile along with a test asserting the desired behaviour in 7ad10a1. On master, it fails (as expected) with:
Also added a couple of changes to the simple Ruby app to allow changing the mode via an arg value, plus some sleep time to showcase how weight is influenced when capturing CPU profiles. After |
Thanks for adding the test and ensuring that it fails on the main branch! Looks like there are failing test runs & a merge conflict that needs resolving before this can land. |
… on sample * period rather than wall timestamp deltas
7ad10a1 to 2b92d51
Thanks @jlfwong. Rebased on the latest changes on main and re-ran. I'm not sure I fully understand what is actually failing in https://github.com/jlfwong/speedscope/actions/runs/5201005116/jobs/9543115054. Locally the tests are passing, so I'm hoping the outdated snapshots were what caused the issue. |
FYI @jlfwong I enabled actions on my fork, Manuel pushed to it, and it passes there https://github.com/dalehamel/speedscope/actions/runs/5280170594 (since we cannot run workflows here, this seemed the most expedient way). |
Thanks! This will go out with the next release |
This is now live on https://speedscope.app and published to npm as part of v1.15.2. Thanks for your contribution! |
jlfwong#425)

This attempts to improve the quality of the on-CPU profiles stackprof provides. Rather than weighting samples by their timestamp deltas, which, in our opinion, are only valid in wall-clock mode, this weights callchains by:

```
S = number of samples
P = sample period in nanoseconds
W = S * P
```

The difference after this change is quite substantial, especially in profiles that previously showed heavy IO frames:

* Total profile weight is down by almost 90%, which makes sense for an on-CPU profile if the app is relatively idle
* Certain callchains that blocked in syscalls / IO now have much lower weight. This was what I was expecting to find.

Here is an example of the latter point. In delta mode, we see an io select taking a long time; it is a significant portion of the profile:

<img width="1100" alt="236936508-709bee01-d616-4246-ba74-ab004331dcd3" src="https://github.com/dalehamel/speedscope/assets/4398256/39140f1e-50a9-4f33-8a61-ec98b6273fd4">

But in period scaling mode, it is ultimately only a couple of sample periods:

<img width="206" alt="236936693-9d44304e-a1c2-4906-b3c8-50e19e6f9f27" src="https://github.com/dalehamel/speedscope/assets/4398256/7d19077f-ef25-4d79-980b-cfa1775d928d">
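To make the `W = S * P` weighting concrete, here is a minimal TypeScript sketch comparing the two schemes. It is not the importer code from this PR: the `SampleRun` shape, the function names, and all the numbers are hypothetical and chosen only for illustration. It assumes each run records how many consecutive samples hit a stack and how much wall time elapsed across them.

```typescript
// Hypothetical shape: one entry per run of consecutive samples on the same stack.
interface SampleRun {
  stack: string[]
  sampleCount: number // S: number of samples that hit this stack
  wallDeltaNanos: number // wall-clock time elapsed across the run
}

// Delta mode: weight is the wall-clock time between timestamps.
function weightByDelta(run: SampleRun): number {
  return run.wallDeltaNanos
}

// Period mode: W = S * P, independent of how long the thread was blocked.
function weightByPeriod(run: SampleRun, periodNanos: number): number {
  return run.sampleCount * periodNanos
}

function totalWeight(runs: SampleRun[], weigh: (run: SampleRun) => number): number {
  return runs.reduce((sum, run) => sum + weigh(run), 0)
}

// Made-up data: a 1 ms sample period, a busy frame, and a frame blocked in IO.
const periodNanos = 1_000_000
const runs: SampleRun[] = [
  {stack: ['main', 'compute'], sampleCount: 50, wallDeltaNanos: 50_000_000},
  {stack: ['main', 'IO.select'], sampleCount: 2, wallDeltaNanos: 450_000_000},
]

const deltaTotal = totalWeight(runs, weightByDelta) // 500_000_000
const periodTotal = totalWeight(runs, run => weightByPeriod(run, periodNanos)) // 52_000_000

console.log(`delta total:  ${deltaTotal} ns`)
console.log(`period total: ${periodTotal} ns`)
```

In this made-up data the blocked `IO.select` run dominates under delta weighting (450 ms of 500 ms) but contributes only two sample periods (2 ms of 52 ms) under period weighting, which is the kind of shift the description above reports for IO-heavy callchains.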