See also:

- https://github.com/johnkerl/miller/blob/readme-profiling/README-dev.md#performance-optimizations
- https://miller.readthedocs.io/en/latest/new-in-miller-6/#performance-benchmarks
Use `make bench` to run Go benchmarks for Miller.
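Since these are standard Go benchmarks, a subset can also be invoked directly with `go test`. Here is a minimal sketch, assuming the benchmarks live somewhere under the module; the `./...` pattern and the flags are my assumptions, and `make bench` remains the supported entry point:

```
# Run only benchmarks (skip unit tests), with allocation stats:
go test -run=NONE -bench=. -benchmem ./...
```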
Run the profiler:

```
mlr --cpuprofile cpu.pprof --csv put -f scripts/chain-1.mlr ~/tmp/big.csv > /dev/null
```

(or with whatever command-line flags you're passing to Miller).
Text mode:

```
go tool pprof mlr cpu.pprof
top10
```
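At the `(pprof)` prompt there are more commands beyond `top10`. These are standard pprof commands, not Miller-specific; the regex arguments are placeholders:

```
top -cum          # rank by cumulative rather than flat time
list <regex>      # annotated source for functions matching the regex
peek <regex>      # callers and callees of matching functions
web               # render a call graph in the browser (needs graphviz)
```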
Graphical mode:

```
go tool pprof -http=:8080 cpu.pprof
```

and let it pop open a browser window, then navigate there -- I personally find View -> Flame Graph most useful. Note that you can drill into subcomponents of the flame graph.
Scripts:
- `./scripts/make-big-files` -- Create million-record data files in various formats.
- `./scripts/chain-cmps.sh` -- Run a few processing scenarios on the million-record CSV file.
- `./scripts/chain-1.mlr` -- An example `mlr put` expression used by the previous script.
- `./scripts/time-big-files` -- Runs `mlr cat` on million-record files of various file formats. Catting files isn't intrinsically interesting, but it shows how input and output processing vary over file formats.
- `./scripts/time-big-file` -- Helper script for the former.
- `./scripts/chain-lengths.sh` -- Runs longer and longer chains of `scripts/chain-1.mlr`, showing how Miller handles multicore and concurrency.
- `./scripts/make-data-stream` -- Creates an endless stream of data to be piped into Miller for steady-state load-testing: e.g. `scripts/make-data-stream | mlr ...`, then look at `htop` in another window (see the sketch after this list).
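For example, a steady-state load test might look like the following. This is a sketch: I'm assuming `make-data-stream` emits a format Miller's defaults can read, and `mlr cat` stands in for whatever verb chain you want to stress:

```
# Terminal 1: endless data stream through a Miller chain
scripts/make-data-stream | mlr cat > /dev/null

# Terminal 2: watch CPU and memory while it runs
htop
```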
Notes:
- Any of the above can be run using the profiler. I find Flame Graph mode particularly informative for drill-down.
- The above refer to `mlr5` and `~/tmp/miller/mlr` as well as `./mlr`. The idea is that I have a copy of Miller 5.10.3 (the C implementation) saved off in my path as `mlr5`; I keep `~/tmp/miller` on recent HEAD; and I have `.` on a dev branch. Comparing `mlr5` to `./mlr` shows relative performance of the C and Go implementations. Comparing `~/tmp/miller/mlr` to `./mlr` shows relative performance of whatever optimization I'm currently working on (see the sketch after these notes).
- Several of the above scripts use just `time` to get one-line timing information.
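A minimal timing-comparison loop over the three binaries might look like this (a sketch: the input file and chain script are reused from the profiler example above, and it assumes `chain-1.mlr` is valid for both the C and Go implementations):

```
for exe in mlr5 ~/tmp/miller/mlr ./mlr; do
  echo "== $exe"
  time $exe --csv put -f scripts/chain-1.mlr ~/tmp/big.csv > /dev/null
done
```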
For comparing across Go compiler versions (a rough sketch of the manual approach follows this list):

- `./scripts/compiler-versions-install`
- `./scripts/compiler-versions-build`
- `./scripts/compiler-versions-time`
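I haven't documented those scripts' internals here, but the general manual shape of the comparison is roughly the following sketch. Whether the scripts use the `golang.org/dl` wrappers, the specific Go version, and the main-package path are my assumptions; the scripts above are the supported path:

```
# Install and download an alternate Go toolchain:
go install golang.org/dl/go1.21.0@latest
go1.21.0 download

# Build Miller with that toolchain and time a representative workload:
go1.21.0 build -o ./mlr-go1.21.0 ./cmd/mlr   # main-package path is an assumption
time ./mlr-go1.21.0 --csv cat ~/tmp/big.csv > /dev/null
```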
GC tuning: `GOGC` controls the garbage-collection threshold (100 is the default) and `GODEBUG=gctrace=1` prints a trace line for each collection.

```
# Raise the bar for GC threshold:
GOGC=200 GODEBUG=gctrace=1 mlr -n put -q -f u/mand.mlr 1> /dev/null

# Raise the bar higher for GC threshold:
GOGC=1000 GODEBUG=gctrace=1 mlr -n put -q -f u/mand.mlr 1> /dev/null

# Turn off GC entirely and see where time is spent:
GOGC=off GODEBUG=gctrace=1 mlr -n put -q -f u/mand.mlr 1> /dev/null
```
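A quick wall-clock sweep across those settings, using plain `time` as in the notes above (the GOGC values and the `u/mand.mlr` workload are taken from the examples above):

```
for gogc in 100 200 1000 off; do
  echo "GOGC=$gogc"
  time env GOGC=$gogc mlr -n put -q -f u/mand.mlr 1> /dev/null
done
```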