-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performance reports #81
Conversation
AlexWaygood
commented
May 2, 2023
- Differences less than 10 seconds seem to happen quite regularly just due to random variations in GitHub Actions, so don't report performance changes if the difference is less than 10 seconds. We're only really interested in major improvements or regressions here (the variations will be too big for anything more precise), and major improvements or regressions will generally manifest themselves in differences >10 seconds.
- Clarify that it's mypy that got faster/slower, not the package we're running mypy on :-)
Thanks, this looks good! Agreed that we're only going to be able to detect really large shifts. That said, I'd like to hold off a week or two before merging, I'm finding it interesting to see how much noise there is. |
This has some big merge conflicts now -- are you still interested in this/should I rebase? :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, what do you think?
Biased pro: I liked that all the primer messages on python/mypy#15128 showed a speed up. These were all less than 10s wins (but I think they were real because results were consistent on different commits in the PR).
Con: I've seen some noise specifically on the vision
project. I don't remember a lot of other noise, but I've also been a little short of time so haven't been looking too closely.
I feel like I see noisy output from the performance report fairly regularly and I basically trained myself to ignore it. |
Yeah, I agree that there have been PRs where we can be confident that <10s speedups were real, because they were very consistent in the mypy_primer reports. However I feel like I've seen more PRs with random noise from the performance reports, and I think those can be pretty confusing to new contributors (especially at typeshed, where it's pretty unlikely that something will have a significant performance impact unless somebody's adding a massive Union somewhere). So I guess I can see it both ways? Maybe the ideal would be if the artifact contained a complete performance report, but the comment posted on PRs filtered out any differences <10s. That would be more complex to implement, of course. Whatever the case, I do think it would be good to clarify that it's the typechecking of these projects that's getting faster/slower — not sure the wording is ideal at the moment 😄 |
IMO we should disable performance reports in typeshed, maybe with a command-line flag to mypy_primer? We don't really care if one package becomes slightly slower, as long as it is reasonably fast overall. This is different in mypy, because changes in mypy affect all/most packages, and also pile up unlike improvements to different packages.
Mypy handles big unions reasonably fast in most practical cases. There is a fast path that skips the bottleneck, and it has been expanded to cover pretty much anything that comes up in practice. |
Hmm, well, every typeshed sync over at mypy at the moment, we religiously cherry-pick a commit to revert a change I made to typeshed a while back that allegedly caused a large performance regression for mypy due to the use of a big union: https://github.com/python/mypy/blob/e14cddbe31c9437502acef022dca376a25d0b50d/misc/sync-typeshed.py#L183 See e.g. the third commit in python/mypy#15165 If mypy really has no performance problems with big unions, we should probably stop doing that :D |
Consensus seems it's too noisy, so let's up the threshold to 10s (like this PR does), and see how we feel about noise? I've definitely caused huge perf regressions with typeshed changes, so I think there's value to it in typeshed if the number can be believable. I agree that it makes sense to target wording at infrequent contributors, maybe something like "in a single noisy sample, checking was 1.14x slower" re the sum literal change: Could be worth trying out again? I've lost track of what it was trying to solve though and I haven't yet spotted what is undesirable in mypy's current behaviour |
Apparently I was wrong about big unions :) They used to be really broken. Specifically, there was a nested for loop that did |
Yeah, I improved two more union perf things in mypy recently, so it might be better now |
Cool I'll update this PR based on the consensus here! |