Use sys.monitoring on supported versions (#520)

gaogaotiantian · Nov 17, 2024
1 parent 99b09bf
commit 1275073

Showing 6 changed files with 576 additions and 141 deletions.
diff --git a/README.md b/README.md
@@ -14,6 +14,7 @@ More help can be found in "Support - Controls".
 
 * Detailed function entry/exit information on timeline with source code
 * Super easy to use, no source code change for most features, no package dependency
+* Low overhead, probably the fastest tracer in the market
 * Supports threading, multiprocessing, subprocess, async and PyTorch
 * Powerful front-end, able to render GB-level trace smoothly
 * Works on Linux/MacOS/Windows
@@ -193,7 +194,9 @@ VizTracer supports inserting custom events while the program is running. This wo
 
 ### Multi Thread Support
 
-VizTracer supports python native ```threading``` module without the need to do any modification to your code. Just start ```VizTracer``` before you create threads and it will just work.
+For Python3.12+, VizTracer supports Python-level multi-thread tracing without the need to do any modification to your code.
+
+For versions before 3.12, VizTracer supports python native ```threading``` module. Just start ```VizTracer``` before you create threads and it will just work.
 
 For other multi-thread scenarios, you can use ``enable_thread_tracing()`` to notice VizTracer about the thread to trace it.
 
@@ -238,39 +241,32 @@ VizTracer needs to dump the internal data to json format. It is recommended for
 
 ## Performance
 
-VizTracer will introduce 2x to 3x overhead in the worst case. The overhead is much better if there are less function calls or if filters are applied correctly.
+VizTracer puts in a lot of effort to achieve low overhead. The actual performance impact largely depends on your application.
+For typical codebases, the overhead is expected to be below 1x. If your code has infrequent function calls,
+the overhead could be minimal.
 
 <details>
 
 <summary>
-An example run for test_performance with Python 3.8 / Ubuntu 18.04.4 on Github VM
+Detailed explanation
 </summary>
 
-```sh
-fib:
-0.000678067(1.00)[origin]
-0.019880272(29.32)[py] 0.011103901(16.38)[parse] 0.021165599(31.21)[json]
-0.001344933(1.98)[c] 0.008181911(12.07)[parse] 0.015789866(23.29)[json]
-0.001472846(2.17)[cProfile]
-
-hanoi     (6148, 4100):
-0.000550255(1.00)[origin]
-0.016343521(29.70)[py] 0.007299123(13.26)[parse] 0.016779364(30.49)[json]
-0.001062505(1.93)[c] 0.006416136(11.66)[parse] 0.011463236(20.83)[json]
-0.001144914(2.08)[cProfile]
-
-qsort     (8289, 5377):
-0.002817679(1.00)[origin]
-0.052747431(18.72)[py] 0.011339725(4.02)[parse] 0.023644345(8.39)[json]
-0.004767673(1.69)[c] 0.008735166(3.10)[parse] 0.017173703(6.09)[json]
-0.007248019(2.57)[cProfile]
-
-slow_fib  (1135, 758):
-0.028759652(1.00)[origin]
-0.033994071(1.18)[py] 0.001630461(0.06)[parse] 0.003386635(0.12)[json]
-0.029481623(1.03)[c] 0.001152415(0.04)[parse] 0.002191417(0.08)[json]
-0.028289305(0.98)[cProfile]
-```
+The overhead introduced by VizTracer is basically a fixed amount of time during function entry and exit, so the more time spent on
+function entries and exits, the more overhead will be observed. A pure recursive ```fib``` function could suffer 3x-4x overhead
+on Python3.11+ (when the Python call is optimized, before that Python call was slower so the overhead ratio would be less).
+
+In the real life scenario, your code should not spend too much time on function calls (they don't really do anything useful), so
+the overhead would be much smaller.
+
+Many techniques are applied to minimize the overall overhead during code execution to reduce the inevitable skew introduced by
+VizTracer (the report saving part is not as critical). For example, VizTracer tries to use the CPU timestamp counter instead of
+a syscall to get the time when available. On Python 3.12+, VizTracer uses ```sys.monitoring``` which has less overhead than
+```sys.setprofile```. All of the efforts made it observably faster than ```cProfile```, the Python stdlib profiler.
+
+However, VizTracer is a tracer, which means it has to record every single function entry and exit, so it can't be as fast as
+the sampling profilers - they are not the same thing. With the extra overhead, VizTracer provides a lot more information than
+normal sampling profilers.
+
 </details>
 
 ## Documentation

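The README hunk above states that `sys.monitoring` (Python 3.12+) has less overhead than `sys.setprofile`, and that the cost is roughly a fixed amount per function entry and exit. The sketch below is not VizTracer's implementation, which this part of the diff does not show. It only illustrates the two stdlib hooks by counting Python function entries in a recursive `fib`. The helper names `count_calls_setprofile` and `count_calls_monitoring` and the tool name `"call-counter"` are made up for illustration, and the second helper assumes `PROFILER_ID` has not already been claimed by another tool in the process.

```python
import sys


def fib(n):
    """Pure recursive Fibonacci: almost all of its time is call overhead."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def count_calls_setprofile(func, *args):
    """Count Python function entries with the legacy sys.setprofile hook."""
    calls = 0

    def hook(frame, event, arg):
        nonlocal calls
        # "call" fires for Python functions; C functions report "c_call".
        if event == "call":
            calls += 1

    sys.setprofile(hook)
    try:
        result = func(*args)
    finally:
        sys.setprofile(None)
    return result, calls


def count_calls_monitoring(func, *args):
    """Count Python function entries with sys.monitoring (Python 3.12+ only)."""
    mon = sys.monitoring
    tool_id = mon.PROFILER_ID  # assumption: no other tool holds this id
    calls = 0

    def on_py_start(code, instruction_offset):
        nonlocal calls
        calls += 1

    mon.use_tool_id(tool_id, "call-counter")  # raises ValueError if in use
    try:
        mon.register_callback(tool_id, mon.events.PY_START, on_py_start)
        mon.set_events(tool_id, mon.events.PY_START)
        try:
            result = func(*args)
        finally:
            # Turn events off and drop the callback before releasing the id.
            mon.set_events(tool_id, mon.events.NO_EVENTS)
            mon.register_callback(tool_id, mon.events.PY_START, None)
    finally:
        mon.free_tool_id(tool_id)
    return result, calls


if __name__ == "__main__":
    print(count_calls_setprofile(fib, 10))
    if hasattr(sys, "monitoring"):
        print(count_calls_monitoring(fib, 10))
```

Either hook runs once per `fib` entry, which is why call-heavy code like this shows the largest overhead ratio, while code that spends most of its time inside each function is affected far less.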
diff --git a/docs/source/concurrency.rst b/docs/source/concurrency.rst
@@ -25,7 +25,9 @@ Just start ``VizTracer`` before you create threads and it will just work.
 other multi-thread
 ------------------
 
-If you are using multi-thread via other mechanism, for example, PyQt thread, VizTracer can't support it out of the box.
+For python3.12+, VizTracer supports all Python level multi-thread. You don't need to do anything.
+
+For the version before 3.12, if you are using multi-thread via other mechanism, for example, PyQt thread, VizTracer can't support it out of the box.
 However, you can notice VizTracer that you are in a separate thread and enable tracing in that thread with ``enable_thread_tracing``
 
 .. code-block:: python