Commit

Use sys.monitoring on supported versions (#520)
gaogaotiantian authored Nov 17, 2024
1 parent 99b09bf commit 1275073
Showing 6 changed files with 576 additions and 141 deletions.
52 changes: 24 additions & 28 deletions README.md
@@ -14,6 +14,7 @@ More help can be found in "Support - Controls".

* Detailed function entry/exit information on timeline with source code
* Super easy to use, no source code change for most features, no package dependency
* Low overhead, probably the fastest tracer in the market
* Supports threading, multiprocessing, subprocess, async and PyTorch
* Powerful front-end, able to render GB-level trace smoothly
* Works on Linux/MacOS/Windows
@@ -193,7 +194,9 @@ VizTracer supports inserting custom events while the program is running. This wo

### Multi Thread Support

VizTracer supports python native ```threading``` module without the need to do any modification to your code. Just start ```VizTracer``` before you create threads and it will just work.
For Python 3.12+, VizTracer supports Python-level multi-thread tracing without any modification to your code.

For versions before 3.12, VizTracer supports Python's native ```threading``` module. Just start ```VizTracer``` before you create threads and it will just work.

For other multi-thread scenarios, you can use ``enable_thread_tracing()`` to notify VizTracer about the thread so that it is traced.
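
The difference between the two version ranges presumably comes from the hook being used: ```sys.monitoring``` events are enabled for the whole interpreter, while a ```threading```-based hook only reaches threads started through ```threading.Thread```. The sketch below is not VizTracer code. It only uses the standard library to show a function being observed in a thread started with ```_thread.start_new_thread```. The names ```work```, ```count_starts_in_raw_thread``` and the tool name ```raw-thread-demo``` are made up for this illustration.

```python
import _thread
import sys
import threading


def work():
    """Function we expect to see from a thread not created by threading.Thread."""
    return sum(range(10))


def count_starts_in_raw_thread():
    """Return how often `work` started in a raw thread, or None before Python 3.12."""
    if not hasattr(sys, "monitoring"):
        return None  # sys.monitoring was added in Python 3.12

    mon = sys.monitoring
    # Pick an unused tool id (valid ids are 0-5); the choice is illustrative.
    tool_id = next(i for i in range(6) if mon.get_tool(i) is None)
    starts = []
    done = threading.Event()

    def on_start(code, instruction_offset):
        # Called for every Python function start; keep only `work`.
        if code is work.__code__:
            starts.append(threading.get_ident())

    def runner():
        try:
            work()
        finally:
            done.set()

    mon.use_tool_id(tool_id, "raw-thread-demo")
    mon.register_callback(tool_id, mon.events.PY_START, on_start)
    mon.set_events(tool_id, mon.events.PY_START)  # enabled for all threads
    try:
        _thread.start_new_thread(runner, ())  # bypasses threading.Thread
        done.wait(timeout=5)
    finally:
        # Turn the events off and release the tool id again.
        mon.set_events(tool_id, mon.events.NO_EVENTS)
        mon.register_callback(tool_id, mon.events.PY_START, None)
        mon.free_tool_id(tool_id)
    return len(starts)
```

On Python 3.12+ the callback fires for ```work``` even though nothing was registered in the new thread. On older versions the function returns ```None```, and a thread like this one would need ``enable_thread_tracing()``.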

@@ -238,39 +241,32 @@ VizTracer needs to dump the internal data to json format. It is recommended for

## Performance

VizTracer will introduce 2x to 3x overhead in the worst case. The overhead is much better if there are less function calls or if filters are applied correctly.
VizTracer puts in a lot of effort to achieve low overhead. The actual performance impact largely depends on your application.
For typical codebases, the overhead is expected to be below 1x. If your code has infrequent function calls,
the overhead could be minimal.

<details>

<summary>
An example run for test_performance with Python 3.8 / Ubuntu 18.04.4 on Github VM
Detailed explanation
</summary>

```sh
fib:
0.000678067(1.00)[origin]
0.019880272(29.32)[py] 0.011103901(16.38)[parse] 0.021165599(31.21)[json]
0.001344933(1.98)[c] 0.008181911(12.07)[parse] 0.015789866(23.29)[json]
0.001472846(2.17)[cProfile]

hanoi (6148, 4100):
0.000550255(1.00)[origin]
0.016343521(29.70)[py] 0.007299123(13.26)[parse] 0.016779364(30.49)[json]
0.001062505(1.93)[c] 0.006416136(11.66)[parse] 0.011463236(20.83)[json]
0.001144914(2.08)[cProfile]

qsort (8289, 5377):
0.002817679(1.00)[origin]
0.052747431(18.72)[py] 0.011339725(4.02)[parse] 0.023644345(8.39)[json]
0.004767673(1.69)[c] 0.008735166(3.10)[parse] 0.017173703(6.09)[json]
0.007248019(2.57)[cProfile]

slow_fib (1135, 758):
0.028759652(1.00)[origin]
0.033994071(1.18)[py] 0.001630461(0.06)[parse] 0.003386635(0.12)[json]
0.029481623(1.03)[c] 0.001152415(0.04)[parse] 0.002191417(0.08)[json]
0.028289305(0.98)[cProfile]
```
The overhead introduced by VizTracer is basically a fixed amount of time during function entry and exit, so the more time spent on
function entries and exits, the more overhead will be observed. A pure recursive ```fib``` function could suffer 3x-4x overhead
on Python 3.11+ (where Python calls are optimized; before that, Python calls were slower, so the overhead ratio was lower).

In real-life scenarios, your code should not spend too much time on function calls (they don't really do anything useful), so
the overhead would be much smaller.

Many techniques are applied to minimize the overall overhead during code execution to reduce the inevitable skew introduced by
VizTracer (the report saving part is not as critical). For example, VizTracer tries to use the CPU timestamp counter instead of
a syscall to get the time when available. On Python 3.12+, VizTracer uses ```sys.monitoring``` which has less overhead than
```sys.setprofile```. All of these efforts make it observably faster than ```cProfile```, the Python stdlib profiler.

However, VizTracer is a tracer, which means it has to record every single function entry and exit, so it can't be as fast as
the sampling profilers - they are not the same thing. With the extra overhead, VizTracer provides a lot more information than
normal sampling profilers.

</details>
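
To see why call-heavy code pays the most, it helps to count how often a per-call hook would fire. The following is a minimal sketch that uses the standard ```sys.setprofile``` hook as a stand-in for a tracer (it is not how VizTracer itself is implemented). The helpers ```fib_loop``` and ```count_entries``` are hypothetical names for this example.

```python
import sys


def fib(n):
    """Call-heavy: almost all of the work is function entry and exit."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def fib_loop(n):
    """Call-light: the same result from a single Python call."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def count_entries(func, *args):
    """Run func(*args) under a profile hook and count entries into func's code."""
    entries = 0

    def hook(frame, event, arg):
        nonlocal entries
        # Only count Python-level entries into the function under test.
        if event == "call" and frame.f_code is func.__code__:
            entries += 1

    previous = sys.getprofile()
    sys.setprofile(hook)
    try:
        result = func(*args)
    finally:
        sys.setprofile(previous)  # restore whatever hook was active before
    return result, entries
```

```count_entries(fib, 10)``` returns ```(55, 177)``` and ```count_entries(fib_loop, 10)``` returns ```(55, 1)```: both compute the same value, but the recursive version pays the fixed per-call cost 177 times.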

## Documentation
4 changes: 3 additions & 1 deletion docs/source/concurrency.rst
@@ -25,7 +25,9 @@ Just start ``VizTracer`` before you create threads and it will just work.
other multi-thread
------------------

If you are using multi-thread via other mechanism, for example, PyQt thread, VizTracer can't support it out of the box.
For Python 3.12+, VizTracer supports all Python-level multi-threading. You don't need to do anything.

For versions before 3.12, if you are using multi-threading via another mechanism, for example, PyQt threads, VizTracer can't support it out of the box.
However, you can notify VizTracer that you are in a separate thread and enable tracing in that thread with ``enable_thread_tracing``

.. code-block:: python