Add performance tracking machinery #1509
base: main
Conversation
This commit adds the basic functionality needed for measuring performance. This includes:
- classes to measure performance in different sessions using context managers
- a context manager that can be used to instrument certain methods with performance tracking machinery
- test fixtures for automatically generating reports using these classes
This commit adds `Comparison` classes which contain comparisons between performance measurements taken at two different points in time. These classes are used in a new script that generates a markdown report of how performance changed.
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.
Performance comparison

Worst performance hit: -3.93% in `global`

| context | CPU avg | CPU median | CPU max | Wall avg | Wall median | Wall max |
|---|---|---|---|---|---|---|
| DataflowPlanBuilder.build_plan | -0.0318ms (-7.27%) | -0.0318ms (-7.27%) | -0.0318ms (-7.27%) | -0.0383ms (-8.76%) | -0.0383ms (-8.76%) | -0.0383ms (-8.76%) |
| global | -0.0583ms (-5.97%) | -0.0583ms (-5.97%) | -0.0583ms (-5.97%) | -0.0642ms (-6.56%) | -0.0642ms (-6.56%) | -0.0642ms (-6.56%) |

test_simple_manifest.py::test_simple_query 🚀

| context | CPU avg | CPU median | CPU max | Wall avg | Wall median | Wall max |
|---|---|---|---|---|---|---|
| DataflowPlanBuilder.build_plan | -0.0377ms (-7.41%) | -0.0377ms (-7.41%) | -0.0377ms (-7.41%) | -0.0281ms (-5.40%) | -0.0281ms (-5.40%) | -0.0281ms (-5.40%) |
| global | -0.0572ms (-5.63%) | -0.0572ms (-5.63%) | -0.0572ms (-5.63%) | -0.0406ms (-3.93%) | -0.0406ms (-3.93%) | -0.0406ms (-3.93%) |
```python
start_wall = time.time_ns()
start_cpu = time.process_time_ns()

yield self
```
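The lines above are the opening of a timing context manager. Below is a minimal sketch of the general pattern they suggest; everything except the two `time` calls and `yield self` is an assumption, not the PR's actual code:

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class MeasurementSession:
    """Hypothetical container for raw wall/CPU durations in nanoseconds."""

    wall_ns: List[int] = field(default_factory=list)
    cpu_ns: List[int] = field(default_factory=list)

    @contextmanager
    def measure(self) -> Iterator["MeasurementSession"]:
        # Take timestamps before handing control back to the measured block.
        start_wall = time.time_ns()
        start_cpu = time.process_time_ns()
        try:
            yield self
        finally:
            # Record elapsed wall-clock and CPU time once the block exits.
            self.wall_ns.append(time.time_ns() - start_wall)
            self.cpu_ns.append(time.process_time_ns() - start_cpu)
```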
High level question - what's the advantage of doing this ourselves instead of using the output of `cProfile`?
Honestly, I just didn't really know how `cProfile` worked and this looked easier. Happy to change how `PerformanceTracker` works to use `cProfile` under the hood. We can still reuse the report-comparison logic, though.
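For reference, a rough sketch of what the `cProfile`-based alternative mentioned above could look like (not part of this PR; the function and its arguments are illustrative):

```python
import cProfile
import pstats
from io import StringIO


def profile_build_plan(builder, query_spec) -> str:  # illustrative signature
    """Profile a single call with cProfile and return cumulative stats text."""
    profiler = cProfile.Profile()
    profiler.enable()
    builder.build_plan(query_spec)  # assumed method under measurement
    profiler.disable()

    buffer = StringIO()
    stats = pstats.Stats(profiler, stream=buffer).sort_stats("cumulative")
    stats.print_stats(20)  # top 20 entries by cumulative time
    return buffer.getvalue()
```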
Summary
This PR introduces some new testing functionality that we can use to avoid performance regressions. From a user perspective, the performance benchmarking workflow is:

1. Add a test that uses the fixtures from `performance_helpers.py` in `tests_metricflow/performance/` and does something we want to track performance for (see the sketch after this list). You can use the following fixtures:
   - `measure_compilation_performance`: measures the compilation of some SQL based on a query spec and a manifest.
   - `perf_tracker`: a "lower level" fixture that you can use to create measurements and sessions (more on that later). `measure_compilation_performance` builds on the `perf_tracker` base thing.
2. Run `make perf` to generate a `performance-report.json`.
3. Run `make perf-compare A=base-report.json B=other-report.json` to generate a Markdown file with a report on changes.
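For illustration, a performance test might look roughly like the sketch below. The fixture names `measure_compilation_performance` and `perf_tracker` come from this PR, but their exact signatures and methods here are assumptions:

```python
# tests_metricflow/performance/test_example_performance.py (illustrative path)


def test_simple_query_performance(measure_compilation_performance) -> None:
    # Assumed signature: the fixture accepts a manifest and a query spec and
    # records compilation CPU/wall time under the test's session.
    measure_compilation_performance(
        manifest=...,    # a semantic manifest fixture from the test suite
        query_spec=...,  # the query spec whose compilation we want to track
    )


def test_manual_measurement(perf_tracker) -> None:
    # Assumed lower-level API: open a session, then measure a named context.
    with perf_tracker.session("test_manual_measurement"):
        with perf_tracker.measure("DataflowPlanBuilder.build_plan"):
            ...  # code whose performance we want to record
```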
The performance report

The `performance-report.json` file contains the "raw" data for a given run. Each "report" contains a list of "sessions". A session is a run in which multiple measurements have been taken. Each session contains a number of "contexts", which are aggregations (avg, median, max) of CPU and wall time measurements based on a `context_id` such as `DataflowPlanBuilder.build_plan`. We can add memory tracking later using psutil.
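As an illustration of the aggregation step (avg/median/max over the raw samples for one `context_id`), with class and field names assumed rather than taken from the PR:

```python
import statistics
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ContextAggregates:
    """Hypothetical per-context aggregates, mirroring the report columns."""

    avg_ns: float
    median_ns: float
    max_ns: float


def aggregate(samples_ns: Sequence[int]) -> ContextAggregates:
    """Collapse raw nanosecond samples for one context_id into aggregates."""
    return ContextAggregates(
        avg_ns=statistics.mean(samples_ns),
        median_ns=statistics.median(samples_ns),
        max_ns=max(samples_ns),
    )
```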
Report comparison

I created a script called `compare_reports.py` which takes two report files and calculates the difference between them (in absolute values and also in percent). It then generates a markdown file called `performance-comparison.md` with a nice little table like the one in my first comment.

How it works
When you run `make perf`, the `perf_tracker` fixture uses the `track_performance` context manager, which sets up a `PerformanceTracker` and patches every method listed in `_tracking_class_methods`, including all of its occurrences if it has already been imported by other modules. I used `unittest.mock.patch` for this.

The tests then all run, and at the end the fixture compiles all reports by calling `get_report_set()`. This generates the `ContextReport`s, `SessionReport`s and `SessionReportSet` based on the raw list of `Call`s to the tracked methods. After this, it writes the report to the JSON file.
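A simplified sketch of the patching idea described above, wrapping one tracked class method so each call records wall and CPU time. It uses `unittest.mock.patch.object` and assumed names throughout; the actual PR also re-patches occurrences that were already imported elsewhere, which this sketch skips:

```python
import importlib
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List, Tuple
from unittest import mock


@contextmanager
def track_method(
    module_name: str,
    qualname: str,
    calls: Dict[str, List[Tuple[int, int]]],
) -> Iterator[None]:
    """Wrap `ClassName.method` from `module_name` so each call appends
    (wall_ns, cpu_ns) to `calls`. Purely illustrative, not the PR's code."""
    module = importlib.import_module(module_name)
    class_name, method_name = qualname.split(".")
    cls = getattr(module, class_name)
    original = getattr(cls, method_name)

    def wrapper(self, *args, **kwargs):
        start_wall = time.time_ns()
        start_cpu = time.process_time_ns()
        try:
            return original(self, *args, **kwargs)
        finally:
            calls.setdefault(qualname, []).append(
                (time.time_ns() - start_wall, time.process_time_ns() - start_cpu)
            )

    # mock.patch.object swaps in the wrapper and restores the original on exit.
    with mock.patch.object(cls, method_name, new=wrapper):
        yield
```

In the PR, the set of targets comes from `_tracking_class_methods`, and the recorded calls feed the `get_report_set()` aggregation described above.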
Then, when you run `make perf-compare`, the script loads both JSON files and uses the `compare()` methods on each of the report classes. These generate `Comparison` classes with the percent and absolute differences. The script then renders these classes to markdown and writes the result to a file.
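Conceptually, each comparison is just an absolute and a percent delta per metric, rendered into the markdown table; a minimal sketch with assumed names:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricComparison:
    """Hypothetical comparison of one metric (e.g. CPU avg) between reports."""

    base_ns: float
    other_ns: float

    @property
    def absolute_diff_ns(self) -> float:
        return self.other_ns - self.base_ns

    @property
    def percent_diff(self) -> float:
        return 100.0 * (self.other_ns - self.base_ns) / self.base_ns


def format_delta(cmp: MetricComparison) -> str:
    """Format a table cell like `-0.0318ms (-7.27%)` (inputs in nanoseconds)."""
    return f"{cmp.absolute_diff_ns / 1e6:+.4f}ms ({cmp.percent_diff:+.2f}%)"
```

The real `compare()` methods presumably apply this field by field across contexts and sessions, producing rows like the ones in the tables above.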
Next steps

Make a GitHub Actions bot that runs this on PRs to compare performance against `main`.