python function tracing support #436

namhyung · 2018-08-08T15:29:06Z

This is a RFC for tracing python function using python's sys.settrace() hook. I implemented C extension module to be called from python and it calls __cyg_profile_func_enter/exit() like a normal C program. But python functions don't have fixed address so I just used a sequence counter to give it the address. uftrace will treat any functions that have address under 4K as python functions.

As it's just a PoC, I don't expect it can run serious programs and it'd contain many rough edges. Also filters don't work on the python functions. The --python option was added tentatively to work around issues on the python script as an executable. Maybe it'd be nice if it could detect and do the right things automatically.

I only tested it on my Archlinux machine but it doesn't have PLT in the python interpreter. So it only shows python functions but I expect it can also show (library) functions in the interpreter too.

I pushed the code to review/python-tracing-v1 branch. Have fun :)

Here's my test program (test.py):

#!/usr/bin/env python2
def b(arg):
    return arg+1
def a():
    return b(0)
print a()

You can run it like below:

$ uftrace --python test.py
1
# DURATION     TID     FUNCTION
            [   754] | <module>() {
            [   754] |   a() {
   0.911 us [   754] |     b();
   3.768 us [   754] |   } /* a */
  11.222 us [   754] | } /* <module> */
   6.015 us [   754] | _remove();
   2.397 us [   754] | _remove();

The text was updated successfully, but these errors were encountered:

namhyung · 2018-08-09T00:49:00Z

tested on an Ubuntu 16.04 box:

$ uftrace record --python test.py

# uftrace replay -t 1us -F '<module>'
# DURATION     TID     FUNCTION
            [  1236] | <module>() {
            [  1236] |   a() {
   4.963 us [  1236] |     b();
  16.306 us [  1236] |   } /* a */
  11.269 us [  1236] |   __fprintf_chk();
  17.109 us [  1236] |   fputs();
  81.222 us [  1236] | } /* <module> */

It seems filter is needed for replay..

honggyukim · 2018-08-09T02:43:33Z

Looks great! It works fine in my environment but it shows too many library calls.

$ uftrace --python hello.py
1
# DURATION     TID     FUNCTION
   1.710 us [188685] | strcmp();
   0.337 us [188685] | strcmp();
   1.610 us [188685] | strchr();
   2.457 us [188685] | getenv();
   0.187 us [188685] | strcmp();
   0.150 us [188685] | strcmp();
   0.220 us [188685] | strchr();
   1.513 us [188685] | malloc();
   1.507 us [188685] | memcpy();
   0.560 us [188685] | getenv();
   0.423 us [188685] | getenv();
   0.296 us [188685] | getenv();
   0.330 us [188685] | getenv();
   1.200 us [188685] | fileno();
   3.964 us [188685] | isatty();
   0.480 us [188685] | getenv();
   0.434 us [188685] | getenv();
   0.317 us [188685] | getenv();
   0.370 us [188685] | getenv();
   0.317 us [188685] | getenv();
        ...

So I need to use --no-libcall to see the same result.

$ uftrace --no-libcall --python hello.py
1
# DURATION     TID     FUNCTION
            [188643] | <module>() {
            [188643] |   a() {
   2.354 us [188643] |     b();
   7.917 us [188643] |   } /* a */
  28.287 us [188643] | } /* <module> */
  12.330 us [188643] | _remove();
   5.257 us [188643] | _remove();

namhyung · 2018-08-09T02:56:59Z

Yes, that's why I added -F <module> -t 1us filters. But --no-libcall almost removes values of using uftrace for python IMO.

DanielTimLee · 2018-10-08T13:58:52Z

Just an idea,
Are there any chances to make use of cProfile which is python's native profiler?

Well, it might be unpleasant having a dependency on an external module,
but using a native profiler might be having fewer problems with tracing, I guess.

namhyung · 2018-10-10T04:14:40Z

I don't know about python profiling internals. But if there's a better way for uftrace, why not? :)

claudiajkang · 2018-10-14T07:15:47Z

I tested test.py in my environment and I found two issues.

First, when make, if kernel-debug-devel package is not installed, the error occurred.

uftrace detected system features:
...         prefix: /usr/local
...         libelf: [ on  ] - more flexible ELF data handling
...          libdw: [ on  ] - DWARF debug info support
...   libpython2.7: [ on  ] - python scripting support
...    libncursesw: [ on  ] - TUI support
...   cxa_demangle: [ on  ] - full demangler support with libstdc++

......

  GEN      version.h
  CC       uftrace.o
  LINK     uftrace
/home/claudia/uftrace/cmds/info.o: In function `fill_exe_build_id':
/home/claudia/uftrace/cmds/info.c:98: undefined reference to `elf_getshdrstrndx'
/home/claudia/uftrace/cmds/info.c:98: undefined reference to `elf_nextscn'
/home/claudia/uftrace/cmds/info.c:98: undefined reference to `elf_nextscn'
/home/claudia/uftrace/cmds/info.c:98: undefined reference to `gelf_getshdr'
/home/claudia/uftrace/cmds/info.c:105: undefined reference to `elf_strptr'
/home/claudia/uftrace/cmds/info.c:117: undefined reference to `elf_getdata'
/home/claudia/uftrace/cmds/info.c:117: undefined reference to `gelf_getnote'
/home/claudia/uftrace/cmds/tui.o: In function `tui_window_move_down':
/home/claudia/uftrace/cmds/tui.c:1785: undefined reference to `LINES'
/home/claudia/uftrace/cmds/tui.c:1785: undefined reference to `LINES'
/home/claudia/uftrace/cmds/tui.o: In function `win_header_report':
/home/claudia/uftrace/cmds/tui.c:1362: undefined reference to `printw'
/home/claudia/uftrace/cmds/tui.c:1363: undefined reference to `COLS'
/home/claudia/uftrace/cmds/tui.o: In function `win_display_info':
/home/claudia/uftrace/cmds/tui.c:1557: undefined reference to `COLS'
/home/claudia/uftrace/cmds/tui.o: In function `win_header_info':
/home/claudia/uftrace/cmds/tui.c:1514: undefined reference to `COLS'
/home/claudia/uftrace/cmds/tui.o: In function `win_header_session':
/home/claudia/uftrace/cmds/tui.c:1621: undefined reference to `COLS'
....
collect2: error: ld returned 1 exit status
make: *** [/home/claudia/uftrace/uftrace] 오류 1

Second, when I ran test.py from uftrace, the error occurred because of elf format issue.

$ uftrace --python ../test.py 
WARN: Segmentation fault
uftrace: /home/claudia/uftrace/cmds/record.c:387:fill_file_header
 ERROR: cannot open info file: No such file or directory
child terminated by signal: 11: Segmentation fault

I think that the reason is [vdso] section in test.py's process map.

(gdb) !cat proc/<pid>/maps
...
7ffe75d87000-7ffe75d89000 r-xp 00000000 00:00 0                          [vdso]

So I suggest that ELF parsing logic in this PR needs to be updated.

DanielTimLee · 2018-10-14T08:30:49Z

(+ It keeps search uftrace python module from the compiled directory.)

ex)

# trying /home/manjaro/git/uf2/misc/uftrace.so
# trying /home/manjaro/git/uf2/misc/uftracemodule.so
# trying /home/manjaro/git/uf2/misc/uftrace.py
# trying trace_python.so
# trying trace_pythonmodule.so
# trying trace_python.py
# trying trace_python.pyc
# trying /home/manjaro/git/uf2/misc/trace_python.so
dlopen("/home/manjaro/git/uf2/misc/trace_python.so", 2);
import trace_python # dynamically loaded from /home/manjaro/git/uf2/misc/trace_python.so

which means, When removing the compiled directory, it won't work at all.

DanielTimLee · 2018-10-14T10:18:20Z

(+ Due to python's module import logic, Maybe the trace result of import should be hidden with options.)

Example code

from sqlalchemy import create_engine
engine = create_engine('sqlite:///:memory:', echo=True)

This example imports sqlalchemy ORM and provisions SQLite memory DB.

$ uftrace --python test.py
# DURATION     TID     FUNCTION
            [ 31701] | <module>() {
            [ 31701] |   <module>() {
            [ 31701] |     <module>() {
            [ 31701] |       <module>() {
            [ 31701] |         <module>() {
            [ 31701] |           <module>() {
  25.046 us [ 31701] |             <module>();
 764.568 us [ 31701] |             <module>();
  21.142 us [ 31701] |             OrderedDict();
   9.359 us [ 31701] |             Counter();
   2.261 ms [ 31701] |           } /* <module> */
            [ 31701] |           <module>() {
            [ 31701] |             <module>() {
            [ 31701] |               <module>() {
 356.349 us [ 31701] |                 <module>();
   2.810 us [ 31701] |                 GeneratorContextManager();
            [ 31701] |                 contextmanager() {
   4.267 us [ 31701] |                   wraps();
  18.103 us [ 31701] |                   update_wrapper();
  33.232 us [ 31701] |                 } /* contextmanager */
   2.664 us [ 31701] |                 closing();
 563.707 us [ 31701] |               } /* <module> */
            [ 31701] |               <module>() {
            [ 31701] |                 filterwarnings() {
            [ 31701] |                   compile() {
            [ 31701] |                     _compile() {
   3.774 us [ 31701] |                       isstring();
            [ 31701] |                       compile() {
   2.633 us [ 31701] |                         isstring();
            [ 31701] |                         parse() {
            [ 31701] |                           __init__() {
   7.184 us [ 31701] |                             __next();
  12.869 us [ 31701] |                           } /* __init__ */
   3.479 us [ 31701] |                           __init__();
            [ 31701] |                           _parse_sub() {
            [ 31701] |                             _parse() {
   3.547 us [ 31701] |                               __init__();
            [ 31701] |                               get() {
   5.965 us [ 31701] |                                 __next();
  10.091 us [ 31701] |                               } /* get */
   2.023 us [ 31701] |                               append();
            [ 31701] |                               get() {
   4.505 us [ 31701] |                                 __next();
   7.912 us [ 31701] |                               } /* get */
   1.372 us [ 31701] |                               append();
            [ 31701] |                               get() {
   4.787 us [ 31701] |                                 __next();
   7.777 us [ 31701] |                               } /* get */
   1.154 us [ 31701] |                               append();

$ uftrace -D 2 --python test.py
# DURATION     TID     FUNCTION
            [  4991] | <module>() {
 179.633 ms [  4991] |   <module>();
  12.627 ms [  4991] |   create_engine();
 192.395 ms [  4991] | } /* <module> */
            [  4991] | _exitfunc() {
  14.307 us [  4991] |   __stop();
  13.025 us [  4991] |   _pickSomeNonDaemonThread();
   0.967 us [  4991] |   _note();
   1.976 us [  4991] |   __delete();
  37.923 us [  4991] | } /* _exitfunc */
            [  4991] | _run_exitfuncs() {
  47.101 us [  4991] |   shutdown();
  53.613 us [  4991] | } /* _run_exitfuncs */
   5.729 us [  4991] | _remove();
   1.544 us [  4991] | _remove();

Most of the trace results are module import logic, and I think import logic looks aren't that necessary by default with tracing.

(Ps. tracing should provide much more info than <module>, how about showing with module path?)

namhyung · 2018-12-16T15:25:06Z

Pushed review/python-tracing-v2

Changelog:

fix segfault due to broken debug info setup
enable debug logging
get module name for <module> functions

$ uftrace --python test2.py | head
# DURATION     TID     FUNCTION
            [  8982] | <module:__main__>() {
            [  8982] |   <module:sqlalchemy>() {
            [  8982] |     <module:sqlalchemy.sql>() {
            [  8982] |       <module:sqlalchemy.sql.expression>() {
            [  8982] |         <module:sqlalchemy.sql.visitors>() {
            [  8982] |           <module:collections>() {
  13.548 us [  8982] |             <module:keyword>();
 600.704 us [  8982] |             <module:heapq>();
  14.237 us [  8982] |             OrderedDict();

honggyukim · 2018-12-16T23:42:28Z

Thanks for the update. I just tested it with a python version of t-fork, but it needs to fix the depth for os.fork.

$ uftrace --python --no-libcall p-fork.py
# DURATION     TID     FUNCTION
            [148263] | <module:__main__>() {
            [148263] |   main() {
            [148263] |     /* linux:sched-out */
            [148274] | a() {
            [148274] |   b() {
  45.707 us [148274] |     c();
  69.408 us [148274] |   } /* b */
 106.975 us [148274] | } /* a */
 109.289 us [148274] | } /* main */
 117.109 us [148274] | } /* <module:__main__> */
  12.327 us [148274] | _remove();
   5.727 us [148274] | _remove();
   7.730 ms [148263] |     /* linux:sched-in */
            [148263] |     a() {
            [148263] |       b() {
  10.984 us [148263] |         c();
  24.301 us [148263] |       } /* b */
  34.991 us [148263] |     } /* a */
   8.435 ms [148263] |   } /* main */
   8.456 ms [148263] | } /* <module:__main__> */
  12.163 us [148263] | _remove();
   5.090 us [148263] | _remove();

Here is the test program, p-fork.py.

#!/usr/bin/env python

import os

def c():
    return os.getpid() % 100000

def b():
    return c() + 1

def a():
    return b() - 1

def main():
    ret = 0
    pid = os.fork()
    if pid:
        os.wait()

    ret += a()

    return ret

if __name__=='__main__':
    main()

quark-zju · 2018-12-29T21:45:34Z

Regarding on cProfile, it uses the same PyEval_SetProfile API under the hood.

One disadvantage of cProfile that does not exist in uftrace is, cProfile prints a flat list of functions, and it's hard to see the relationship between functions. For example, if one code path calls read(1) and another code path calls calls read(1e10), it's hard to find the slow code path just from the cProfile output. I also had a quick-and-dirty profiler to workaround cProfile limitation.

(EDITED: Removed unrelated content)

quark-zju · 2018-12-29T22:15:46Z

Some comments on review/python-tracing-v2. Mostly about reducing overhead:

Use PyEval_SetProfile to reduce overhead (and avoid PyObject_GetAttrString accordingly by accessing Python structs directly like ((PyFrameObject *)frame)->f_code).
Use the address of PyCodeObject as the function address. I think this is what cProfile uses and I have been using the same approach in traceprof (mentioned above) - it worked pretty well and is much faster than manually maintaining the function address mapping. You'll need to Py_INCREF the code object to keep them alive.
Since libpython is already a required dependency, consider "inline" the Python extension, by calling Py_Initialize, followed by PyEval_SetProfile, then PyRun_SimpleFile to run a script.

namhyung · 2018-12-30T02:13:09Z

@quark-zju thanks for detailed comments, I'll take a look at them to reduce overheads. Note that libpython is not a required dependency of uftrace (and I don't want to make it is) and we load it with dlopen() if needed. But it's a different situaltion so ok to use PyEval_SetProfile() IMO.

namhyung · 2018-12-30T02:15:19Z

@honggyukim yes it doesn't consider multi-process fully yet - but I guess it's mostly for works during replay.

quark-zju · 2018-12-30T06:28:48Z

Thanks! Making libpython optional was a wise decision.

It seems the next missing thing would be exception handling. It can probably be done by following f_back from PyFrameObject recursively to keep track the Python stack and emulate those missed "exit"s like mcount_rstack_reset_exception.

namhyung · 2018-12-31T01:05:37Z

Probably, but is there a way for python interpreter to let us know about the exception info?

quark-zju · 2018-12-31T02:17:26Z

Actually, I was wrong. According to https://github.com/python/cpython/blob/62be74290aca26d16f3f55ece7ff6dad14e60e8d/Modules/_lsprof.c#L454, PyTrace_RETURN will be generated, so there is no need to specially handling exceptions. But arg should also provide the exception information in the PyTrace_EXCEPTION case. The profiler I wrote was a quick-and-dirty work and I didn't pay close attention to exception handling.

honggyukim · 2019-02-13T00:57:24Z

I was trying to test it for OE bitbake script, but it doesn't accept additional arguments for python script.

Here is a test script.

$ cat arg-test.py
#!/usr/bin/env python
import sys
print(sys.argv)

$ ./arg-test.py
['./arg-test.py']

$ ./arg-test.py --arg1
['./arg-test.py', '--arg1']

$ ./arg-test.py --arg1 --arg2
['./arg-test.py', '--arg1', '--arg2']

But the result is not same as original execution.

$ uftrace record --no-libcall --python ./arg-test.py
['/home/honggyu/work/uftrace/misc/uftrace.py', './arg-test.py']

$ uftrace record --no-libcall --python ./arg-test.py --arg1
['/home/honggyu/work/uftrace/misc/uftrace.py', './arg-test.py']

$ uftrace record --no-libcall --python ./arg-test.py --arg1 --arg2
['/home/honggyu/work/uftrace/misc/uftrace.py', './arg-test.py']

The result doesn't show --arg1 and --arg2 in the output.

namhyung · 2019-02-16T03:15:21Z

Pushed review/python-tracing-v3

Changelog:

handle command line argument

It was easy thanks to recent script file support but I think we need to consider supporting python3 as well.

$ uftrace record --python ./arg-test.py 1 2 3
['/home/namhyung/project/uftrace/misc/uftrace.py', './arg-test.py', '1', '2', '3']

honggyukim · 2019-02-17T04:53:29Z

Thanks for the update! I just updated to make both python2 and python3 work based on review/python-tracing-v3.

I pushed it into honggyukim/review/python3-tracing-v1.

Here are test scripts.

$ cat p-abc.py
#!/usr/bin/env python
import os
def c():
    return os.getpid()
def b():
    return c()
def a():
    return b()
print(a())

The below script is same as above, but just uses python3.

$ cat p-abc3.py
#!/usr/bin/env python3
import os
def c():
    return os.getpid()
def b():
    return c()
def a():
    return b()
print(a())

It works both python2 and python3 scripts as follows.

$ uftrace --no-libcall --python p-abc.py
171054
# DURATION     TID     FUNCTION
            [171054] | <module:__main__>() {
            [171054] |   a() {
            [171054] |     b() {
   3.380 us [171054] |       c();
   7.413 us [171054] |     } /* b */
  10.427 us [171054] |   } /* a */
  49.587 us [171054] | } /* <module:__main__> */
  10.977 us [171054] | _remove();
   4.677 us [171054] | _remove();

$ uftrace --no-libcall --python p-abc3.py
171068
# DURATION     TID     FUNCTION
  13.120 us [171068] | getpreferredencoding();
            [171068] | __init__() {
   3.143 us [171068] |   __init__();
  11.880 us [171068] | } /* __init__ */
  15.984 us [171068] | decode();
            [171068] | <module:__main__>() {
            [171068] |   a() {
            [171068] |     b() {
   3.194 us [171068] |       c();
   8.030 us [171068] |     } /* b */
  12.003 us [171068] |   } /* a */
  60.443 us [171068] | } /* <module:__main__> */

Please take a look. Thanks!

honggyukim · 2019-02-17T04:58:43Z

@chbae I also tested it for poky bitbake script, which is written in python3 and the tracing result looks fine although it needs to fix some symbol names.

$ uftrace record --no-libcall --python ./bitbake/bin/bitbake zip-native
   ...
$ uftrace replay -t 2s
# DURATION     TID     FUNCTION
            [171177] | <module:__main__>() {
            [171177] |   _find_and_load() {
            [171177] |     _find_and_load_unlocked() {
            [171177] |       _load_unlocked() {
            [171177] |         exec_module() {
            [171177] |           _call_with_frames_removed() {
            [171177] |             <module:bb>() {
            [171177] |               _handle_fromlist() {
            [171177] |                 _call_with_frames_removed() {
            [171177] |                   _find_and_load() {
            [171177] |                     _find_and_load_unlocked() {
            [171177] |                       _load_unlocked() {
            [171177] |                         exec_module() {
            [171177] |                           _call_with_frames_removed() {
   2.526  s [171177] |                             <module:bb.fetch2>();
   2.526  s [171177] |                           } /* _call_with_frames_removed */
   2.527  s [171177] |                         } /* exec_module */
   2.527  s [171177] |                       } /* _load_unlocked */
   2.528  s [171177] |                     } /* _find_and_load_unlocked */
   2.528  s [171177] |                   } /* _find_and_load */
   2.528  s [171177] |                 } /* _call_with_frames_removed */
   2.528  s [171177] |               } /* _handle_fromlist */
   2.757  s [171177] |             } /* <module:bb> */
   2.757  s [171177] |           } /* _call_with_frames_removed */
   2.757  s [171177] |         } /* exec_module */
   2.757  s [171177] |       } /* _load_unlocked */
   2.758  s [171177] |     } /* _find_and_load_unlocked */
   2.758  s [171177] |   } /* _find_and_load */
            [171177] |   bitbake_main() {
            [171177] |     setup_bitbake() {
   2.004  s [171177] |       __init__();
            [171197] | <6e9>() {
            [171197] |   <6bb>() {
            [171197] |     <6ed>() {
            [171197] |       /* linux:task-name (name=Cooker) */
   2.040  s [171177] |     } /* setup_bitbake */
            [171177] |     main() {
            [171177] |       /* linux:task-name (name=KnottyUI) */
            [171197] |       <6f4>() {
            [171197] |         <721>() {
            [171197] |           <722>() {
            [171197] |             <723>() {
   2.592  s [171197] |               __init__();
   3.065  s [171197] |             } /* <723> */
   3.066  s [171197] |           } /* <722> */
   3.066  s [171197] |         } /* <721> */
   3.068  s [171197] |       } /* <6f4> */
   7.695  s [171177] |     } /* main */
   9.735  s [171177] |   } /* bitbake_main */
  12.691  s [171177] | } /* <module:__main__> */
   7.847  s [171197] |     } /* <6ed> */
   7.849  s [171197] |   } /* <6bb> */
   7.850  s [171197] | } /* <6e9> */

honggyukim · 2019-02-18T05:42:08Z

It'd be really great if we can also add source location info for python functions.

honggyukim · 2019-02-23T00:38:25Z

@namhyung I just created a PR at #680 maybe for easier review. You don't have to make comment on the PR, but might be simpler to leave inline review comments.

honggyukim · 2019-05-08T10:45:17Z

It seems that we can also provide Python C API functions as built-in functions that can be processed by --auto-args. At least, const char* typed functions can be seen a lot better than no args.

honggyukim · 2023-02-05T13:13:57Z

Thanks. I don't see the problem WARN: call depth beyond 1024 is not recorded with v11. The nesting depth looks good now.

defer sys.setprofile() to skip unrelated functions

It looks good but can we add an option not to defer sys.setprofile() just like we have --kernel-full.

I still see difference when running pip install pre-commit without giving full path of pip.

$ uftrace record pip install pre-commit
pip: can't open file '/home/honggyu/work/uftrace/git/python/install': [Errno 2] No such file or directory

I still need to give a full path with v11 as follows.

$ uftrace record `which pip` install pre-commit
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: pre-commit in /home/honggyu/.local/lib/python3.10/site-packages (2.21.0)
Requirement already satisfied: cfgv>=2.0.0 in /home/honggyu/.local/lib/python3.10/site-packages (from pre-commit) (3.3.1)
Requirement already satisfied: identify>=1.0.0 in /home/honggyu/.local/lib/python3.10/site-packages (from pre-commit) (2.5.11)
Requirement already satisfied: pyyaml>=5.1 in /usr/lib/python3/dist-packages (from pre-commit) (5.4.1)
Requirement already satisfied: virtualenv>=20.10.0 in /home/honggyu/.local/lib/python3.10/site-packages (from pre-commit) (20.17.1)
Requirement already satisfied: nodeenv>=0.11.1 in /home/honggyu/.local/lib/python3.10/site-packages (from pre-commit) (1.7.0)
Requirement already satisfied: setuptools in /usr/lib/python3/dist-packages (from nodeenv>=0.11.1->pre-commit) (59.6.0)
Requirement already satisfied: filelock<4,>=3.4.1 in /home/honggyu/.local/lib/python3.10/site-packages (from virtualenv>=20.10.0->pre-commit) (3.9.0)
Requirement already satisfied: distlib<1,>=0.3.6 in /home/honggyu/.local/lib/python3.10/site-packages (from virtualenv>=20.10.0->pre-commit) (0.3.6)
Requirement already satisfied: platformdirs<3,>=2.4 in /home/honggyu/.local/lib/python3.10/site-packages (from virtualenv>=20.10.0->pre-commit) (2.6.2)

honggyukim · 2023-02-05T13:16:29Z

In addition, it would be really great if we can provide source location info for python code as well. Then we can jump to the actual python source code in tui command. Would it be possible?

namhyung · 2023-02-06T02:28:53Z

Yep, I think it's doable. I'll leave it as the next step.

namhyung · 2023-02-06T02:30:48Z

It looks good but can we add an option not to defer sys.setprofile() just like we have --kernel-full.

I'm not sure about it. Anyway what it shows is the implementation of our loader script uftrace.py, not the general python startup code.

namhyung · 2023-02-06T08:40:02Z

I still see difference when running pip install pre-commit without giving full path of pip.

Ok, I found a problem dealing with script files. It should check PATH env to get the full pathname when accessing the file.

namhyung · 2023-02-06T09:05:07Z

Pushed review/python-tracing-v12

Changes:

handle pathname of script using PATH environ variable
do not skip the first frame for Python2

Now it can run pip script without the full pathname!!

$ uftrace record pip install pre-commit
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: pre-commit in /usr/lib/python3/dist-packages (2.20.0)

honggyukim · 2023-02-06T14:08:47Z

Pushed review/python-tracing-v12

It looks great. Thanks.

I'm not sure about it. Anyway what it shows is the implementation of our loader script uftrace.py, not the general python startup code.

Then it's okay to skip it.

One more thing is that we added trace-python.c and uftrace.py in misc directory, but I feel we're adding too many different stuffs under misc. How about create a new directory python for python tracing?

honggyukim · 2023-02-06T14:16:52Z

One more thing is if #1605 is merged before this, then we should use uftrace_shm_open instead of direct shm_open. Likewise, shm_unlink has to be changed as well.

namhyung · 2023-02-07T06:58:17Z

One more thing is that we added trace-python.c and uftrace.py in misc directory, but I feel we're adding too many different stuffs under misc. How about create a new directory python for python tracing?

Good idea. Will move.

One more thing is if #1605 is merged before this, then we should use uftrace_shm_open instead of direct shm_open. Likewise, shm_unlink has to be changed as well.

Yep, I plan to merge it before this work. I'll take care of the change.

honggyukim · 2023-02-07T23:54:06Z

It works fine with pre-commit run -a as follows.

$ uftrace record pre-commit run -a
Check Yaml...............................................................Passed
Fix End of Files.........................................................Passed
Trim Trailing Whitespace.................................................Passed
isort....................................................................Passed
clang-format.............................................................Passed
codespell................................................................Passed

If I run uftrace tui, it shows multiple graph sessions.

Key uftrace command
 G  call Graph for session #1: python3.10
    call Graph for session #2: python3.10
    call Graph for session #3: python3.10
    call Graph for session #4: python3.10
    call Graph for session #5: python3.10
    call Graph for session #6: python3.10
    call Graph for session #7: python3.10
    call Graph for session #8: python3.10
    call Graph for session #9: python3.10
    call Graph for session #10: python3.10
    call Graph for session #11: python3.10
    call Graph for session #12: python3.10
    call Graph for session #13: python3.10
    call Graph for session #14: python3.10
    call Graph for session #15: python3.10
    call Graph for session #16: python3.10
    call Graph for session #17: python3.10
    call Graph for session #18: python3.10
    call Graph for session #19: python3.10
    call Graph for session #20: python3.10
    call Graph for session #21: python3.10
    call Graph for session #22: python3.10
 R  Report functions
 I  uftrace Info
 h  Help message
 q  quit

But most of the graph sessions are empty except for the first session. I'm just wondering if it correctly follows the underline python scripts that run each pre-commit hooks.

honggyukim · 2023-02-07T23:58:19Z

Another test with b4 works fine.

$ uftrace record b4 am https://lore.kernel.org/all/20221219201732.460111-7-namhyung@kernel.org/
Analyzing 10 messages in the thread                                         
Checking attestation on all messages, may take a moment...
---                                                                         
  ✓ [PATCH 1/6] perf lock contention: Factor out lock_type_table
  ✓ [PATCH 2/6] perf lock contention: Add -Y/--type-filter option
    + Acked-by: Namhyung Kim <namhyung@kernel.org>                  
  ✓ [PATCH 3/6] perf lock contention: Support lock type filtering for BPF
  ✓ [PATCH 4/6] perf lock contention: Add -L/--lock-filter option
  ✓ [PATCH 5/6] perf lock contention: Support lock addr/name filtering for BPF
  ✓ [PATCH 6/6] perf test: Update perf lock contention test       
  ---                                                                       
  ✓ Signed: DKIM/gmail.com (From: namhyung@kernel.org)         
  ---                                                                       
  NOTE: install patatt for end-to-end signature verification
---                                  
Total patches: 6  
---                                                                         
Cover: ./20221219_namhyung_perf_lock_contention_add_more_filter_options_v1.cover
 Link: https://lore.kernel.org/r/20221219201732.460111-1-namhyung@kernel.org 
 Base: 51c4f2bf5397b34b79a6712221606e0ab2e6f7ed
       git checkout -b 20221219_namhyung_kernel_org 51c4f2bf5397b34b79a6712221606e0ab2e6f7ed
       git am ./20221219_namhyung_perf_lock_contention_add_more_filter_options_v1.mbx

$ uftrace tui -t 10ms
  TOTAL TIME : FUNCTION
  838.369 ms : (1) python3.10
  838.369 ms : (1) __main__.<module>
   67.895 ms :  ├▶(1) importlib._bootstrap._find_and_load
             :  │
  344.847 ms :  ├▶(1) importlib_load_entry_point
             :  │
  418.234 ms :  └─(1) b4.command.cmd
  394.371 ms :    (1) b4.command.cmd_am
  388.535 ms :    (1) b4.mbox.main
   70.639 ms :     ├▶(1) b4.mbox.get_msgs
             :     │
  317.877 ms :     └─(1) b4.mbox.make_am
   25.043 ms :        ├▶(2) b4.add_message
             :        │
  214.915 ms :        ├─(1) b4.get_am_ready
  121.689 ms :        │  ├─(6) b4.get_attestation_trailers
  121.118 ms :        │  │ (6) b4.attestors
  119.869 ms :        │  │ (6) b4._load_dkim_attestors
   70.559 ms :        │  │  ├─(6) email.message.as_bytes
   70.481 ms :        │  │  │ (6) email.generator.flatten
   70.407 ms :        │  │  │ (6) email.generator._write
   40.920 ms :        │  │  │ (4) email.generator._write_headers
             :        │  │  │
   13.018 ms :        │  │  └▶(1) dkim.verify
             :        │  │
   83.517 ms :        │  └▶(6) b4.get_am_message
             :        │
   18.807 ms :        ├─(1) b4.save_git_am_mbox
             :        │
   13.855 ms :        └▶(1) b4.save_cover

namhyung · 2023-02-08T02:08:00Z

I'm just wondering if it correctly follows the underline python scripts that run each pre-commit hooks.

Probably not. It needs to add -m uftrace_python option to the Python interpreter (if it's called from a command line). We do it in the record command for a script we invoke. But I'm not sure how to handle if it calls other external scripts.

honggyukim · 2023-02-08T12:28:10Z

Probably not. It needs to add -m uftrace_python option to the Python interpreter (if it's called from a command line). We do it in the record command for a script we invoke. But I'm not sure how to handle if it calls other external scripts.

It would be useful if there is an environment variable to add a module by default but I don't have an idea how to handle it.

I see another bugs when running runtest.py for multiple tests as follows.

$ uftrace record ./runtest.py 00[1-5]
Start 5 tests with 5 worker
Compiler                  gcc                                           clang                                       
Test case                 pg             finstrument-fu fpatchable-fun  pg             finstrument-fu fpatchable-fun
------------------------: O0 O1 O2 O3 Os O0 O1 O2 O3 Os O0 O1 O2 O3 Os  O0 O1 O2 O3 Os O0 O1 O2 O3 Os O0 O1 O2 O3 Os
001 basic               : OK OK OK OK OK OK OK OK OK OK SG SG SG SG SG  OK OK OK OK OK OK OK OK OK OK SG SG SG SG SG
002 argument            : OK OK OK OK OK OK OK OK OK OK SG SG SG SG SG  OK OK OK OK OK OK OK OK OK OK SG SG SG SG SG
003 thread              : OK OK OK OK OK OK OK OK OK OK SG SG SG SG SG  OK OK OK OK OK OK OK OK OK OK SG SG SG SG SG
004 filter_F            : NG NG NG NG NG NG NG NG NG NG SG SG SG SG SG  NG NG NG NG NG NG NG NG NG NG SG SG SG SG SG
005 filter_N            : NG NG NG NG NG NG NG NG NG NG SG SG SG SG SG  NG NG NG NG NG NG NG NG NG NG SG SG SG SG SG


runtime test stats
====================
total   150  Tests executed (success: 40.00%)
  OK:    60  Test succeeded
  OK:     0  Test succeeded (with some fixup)
  NG:    40  Different test result
  NZ:     0  Non-zero return value
  SG:    50  Abnormal exit by signal
  TM:     0  Test ran too long
  BI:     0  Build failed
  LA:     0  Unsupported Language
  SK:     0  Skipped
WARN: Segmentation fault: address not mapped (addr: 0x7fd5252c8154)
WARN:  if this happens only with uftrace, please consider -e/--estimate-return option.

WARN: Backtrace from uftrace v0.13.1-23-gb8b4 ( x86_64 dwarf python3 luajit tui perf sched dynamic )
WARN: =====================================
WARN: [14] (<2ca9>[2ca9] <= <0>[0])
WARN: [13] (<2d6b>[2d6b] <= <0>[0])
WARN: [12] (<318f>[318f] <= <0>[0])
WARN: [11] (<3189>[3189] <= <0>[0])
WARN: [10] (<3188>[3188] <= <0>[0])
WARN: [9] (<30fc>[30fc] <= <0>[0])
WARN: [8] (<2c9d>[2c9d] <= <0>[0])
WARN: [7] (<2c99>[2c99] <= <0>[0])
WARN: [6] (<2c6a>[2c6a] <= <0>[0])
WARN: [5] (<2c66>[2c66] <= <0>[0])
WARN: [4] (<2e74>[2e74] <= <0>[0])
WARN: [3] (<2e73>[2e73] <= <0>[0])
WARN: [2] (<2de3>[2de3] <= <0>[0])
WARN: [1] (<2de1>[2de1] <= <0>[0])
WARN: [0] (<1>[1] <= <0>[0])

Please report this bug to https://github.com/namhyung/uftrace/issues.

It didn't see the problem when running 1 or 2 tests but increasing the number of tests makes this problem.

honggyukim · 2023-02-09T23:11:19Z

And the following test shows NG.

$ uftrace record ./runtest.py -p -c gcc -O0 004
Start 1 tests without worker pool
Compiler                  gc
Test case                 pg
------------------------: O0
004 filter_F            : NG

namhyung · 2023-02-10T06:55:35Z

The runtest can be tricky since it runs uftrace under uftrace. The options are passed by env so it can be affected by outer uftrace.

honggyukim · 2023-02-13T21:48:33Z

OK. Then please ignore the result of runtest.py.

namhyung · 2023-02-23T08:33:23Z

Pushed review/python-tracing-v13

Changes:

move the python support code to python/ directory
use the new shmem APIs
add a unit test case

honggyukim · 2023-02-25T07:20:10Z

Thanks for the update. It looks good but it would be better if we can run some python feature tests.

It looks cpython internally has some tests and one of the test shows some failures when recording with uftrace.

$ git clone https://github.com/python/cpython.git
$ cd cpython/Lib/test

And apply the following change to make it running by uftrace.

diff --git a/Lib/test/test_array.py b/Lib/test/test_array.py
index 5b2c107a60..eb080d1e06 100755
--- a/Lib/test/test_array.py
+++ b/Lib/test/test_array.py
@@ -1,3 +1,4 @@
+#!/usr/bin/env python
 """Test the arraymodule.
    Roger E. Masse
 """

Then the test works fine showing OK at the end.

$ ./test_array.py
.....................................................................................................................................................................................
.....................................................................................................................................................................................
.....................................................................................................................................................................................
.....................................................................................................................................................................................
.........................................................................................
----------------------------------------------------------------------
Ran 813 tests in 0.242s

OK

I tried to run it with uftrace and it shows some failures at the end of execution as follows.

$ uftrace record ./test_array.py
...............................................................F............................................................F........................................................
...F............................................................F..........................................................................F.........................................
...................F.................................................................F......................................................F........................................
.....................F.............................................................F.............................................................F...................................
..........................F.............................................................F
======================================================================
FAIL: test_weakref (__main__.ByteTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<string>", line 1103, in test_weakref
AssertionError: ReferenceError not raised by len

======================================================================
FAIL: test_weakref (__main__.DoubleTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<string>", line 1103, in test_weakref
AssertionError: ReferenceError not raised by len

======================================================================
FAIL: test_weakref (__main__.FloatTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<string>", line 1103, in test_weakref
AssertionError: ReferenceError not raised by len
    ...
======================================================================
FAIL: test_weakref (__main__.UnsignedShortTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<string>", line 1103, in test_weakref
AssertionError: ReferenceError not raised by len

----------------------------------------------------------------------
Ran 813 tests in 1.333s

FAILED (failures=13)

honggyukim · 2023-02-25T07:32:37Z

I see a segfault this time in cpython test.

But this test only works fine when I checkout cpython repo to the same version of my running python v3.10.6, then applied the following change.

diff --git a/Lib/test/test_uuid.py b/Lib/test/test_uuid.py
index d6a8333427..11ee1e9c6d 100755
--- a/Lib/test/test_uuid.py
+++ b/Lib/test/test_uuid.py
@@ -1,3 +1,4 @@
+#!/usr/bin/env python
 import unittest
 from test import support
 from test.support import import_helper

The test runs fine.

$ ./test_uuid.py
s...s.ss....ss...s.ss............................s.sss.s....
----------------------------------------------------------------------
Ran 60 tests in 0.037s

OK (skipped=14)

But it gets crashed as follows when running with uftrace record.

$ uftrace record ./test_uuid.py
s...s.ss....ss...s.ss............................s.sss.s....
----------------------------------------------------------------------
Ran 60 tests in 0.236s

OK (skipped=14)
WARN: child terminated by signal: 11: Segmentation fault

namhyung · 2023-02-25T16:51:28Z

Some reference count errors are expected as it increases refcnt of code object for the symbol table. I guess it won't be a problem in practice unless the python code is extremely sensitive to the refcnt behavior.

namhyung · 2023-02-26T06:17:45Z

Let's talk about them in separate issues.

namhyung added the review label Aug 8, 2018

namhyung added the enhancement label Aug 27, 2018

claudiajkang mentioned this issue Oct 25, 2018

[18' 강주희] 최종보고서 kosslab-kr/uftrace#83

Open

honggyukim mentioned this issue Feb 23, 2019

python function tracing for both python2 and python3 #680

Closed

honggyukim added the language support label May 7, 2019

DanielTimLee mentioned this issue Sep 18, 2019

컨트리뷰션 시작을 위한 레퍼런스 kosslab-kr/uftrace#89

Open

namhyung changed the title ~~PoC: python function tracing support~~ python function tracing support Feb 26, 2023

namhyung closed this as completed in 0c4ce3e Feb 26, 2023

namhyung unpinned this issue Feb 26, 2023

This was referenced Feb 26, 2023

uftrace cannot record when .py script doesn't have shebang #1639

Open

python tracing result with source location info #1640

Closed

namhyung modified the milestones: v0.13, v0.14 Jun 11, 2023

honggyukim mentioned this issue Jul 23, 2023

uftrace 주요 개발 사항 링크들 kosslab-kr/uftrace#107

Open

python function tracing support #436

python function tracing support #436

Comments

namhyung commented Aug 8, 2018

namhyung commented Aug 9, 2018

honggyukim commented Aug 9, 2018

namhyung commented Aug 9, 2018

DanielTimLee commented Oct 8, 2018 • edited Loading

namhyung commented Oct 10, 2018

claudiajkang commented Oct 14, 2018 • edited Loading

DanielTimLee commented Oct 14, 2018 • edited Loading

DanielTimLee commented Oct 14, 2018

namhyung commented Dec 16, 2018 • edited Loading

honggyukim commented Dec 16, 2018

quark-zju commented Dec 29, 2018 • edited Loading

quark-zju commented Dec 29, 2018 • edited Loading

namhyung commented Dec 30, 2018

namhyung commented Dec 30, 2018

quark-zju commented Dec 30, 2018

namhyung commented Dec 31, 2018

quark-zju commented Dec 31, 2018 • edited Loading

honggyukim commented Feb 13, 2019 • edited Loading

namhyung commented Feb 16, 2019

honggyukim commented Feb 17, 2019 • edited Loading

honggyukim commented Feb 17, 2019 • edited Loading

honggyukim commented Feb 18, 2019

honggyukim commented Feb 23, 2019 • edited Loading

honggyukim commented May 8, 2019

honggyukim commented Feb 5, 2023

honggyukim commented Feb 5, 2023

namhyung commented Feb 6, 2023

namhyung commented Feb 6, 2023

namhyung commented Feb 6, 2023

namhyung commented Feb 6, 2023

honggyukim commented Feb 6, 2023

honggyukim commented Feb 6, 2023

namhyung commented Feb 7, 2023

honggyukim commented Feb 7, 2023

honggyukim commented Feb 7, 2023 • edited Loading

namhyung commented Feb 8, 2023

honggyukim commented Feb 8, 2023

honggyukim commented Feb 9, 2023

namhyung commented Feb 10, 2023

honggyukim commented Feb 13, 2023

namhyung commented Feb 23, 2023

honggyukim commented Feb 25, 2023

honggyukim commented Feb 25, 2023

namhyung commented Feb 25, 2023

namhyung commented Feb 26, 2023

DanielTimLee commented Oct 8, 2018 •

edited

Loading

claudiajkang commented Oct 14, 2018 •

edited

Loading

DanielTimLee commented Oct 14, 2018 •

edited

Loading

namhyung commented Dec 16, 2018 •

edited

Loading

quark-zju commented Dec 29, 2018 •

edited

Loading

quark-zju commented Dec 29, 2018 •

edited

Loading

quark-zju commented Dec 31, 2018 •

edited

Loading

honggyukim commented Feb 13, 2019 •

edited

Loading

honggyukim commented Feb 17, 2019 •

edited

Loading

honggyukim commented Feb 17, 2019 •

edited

Loading

honggyukim commented Feb 23, 2019 •

edited

Loading

honggyukim commented Feb 7, 2023 •

edited

Loading