Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

python function tracing support #436

Closed
namhyung opened this issue Aug 8, 2018 · 81 comments
Closed

python function tracing support #436

namhyung opened this issue Aug 8, 2018 · 81 comments

Comments

@namhyung
Copy link
Owner

namhyung commented Aug 8, 2018

This is a RFC for tracing python function using python's sys.settrace() hook. I implemented C extension module to be called from python and it calls __cyg_profile_func_enter/exit() like a normal C program. But python functions don't have fixed address so I just used a sequence counter to give it the address. uftrace will treat any functions that have address under 4K as python functions.

As it's just a PoC, I don't expect it can run serious programs and it'd contain many rough edges. Also filters don't work on the python functions. The --python option was added tentatively to work around issues on the python script as an executable. Maybe it'd be nice if it could detect and do the right things automatically.

I only tested it on my Archlinux machine but it doesn't have PLT in the python interpreter. So it only shows python functions but I expect it can also show (library) functions in the interpreter too.

I pushed the code to review/python-tracing-v1 branch. Have fun :)

Here's my test program (test.py):

#!/usr/bin/env python2
def b(arg):
    return arg+1
def a():
    return b(0)
print a()

You can run it like below:

$ uftrace --python test.py
1
# DURATION     TID     FUNCTION
            [   754] | <module>() {
            [   754] |   a() {
   0.911 us [   754] |     b();
   3.768 us [   754] |   } /* a */
  11.222 us [   754] | } /* <module> */
   6.015 us [   754] | _remove();
   2.397 us [   754] | _remove();
@namhyung namhyung added the review label Aug 8, 2018
@namhyung
Copy link
Owner Author

namhyung commented Aug 9, 2018

tested on an Ubuntu 16.04 box:

$ uftrace record --python test.py

# uftrace replay -t 1us -F '<module>'
# DURATION     TID     FUNCTION
            [  1236] | <module>() {
            [  1236] |   a() {
   4.963 us [  1236] |     b();
  16.306 us [  1236] |   } /* a */
  11.269 us [  1236] |   __fprintf_chk();
  17.109 us [  1236] |   fputs();
  81.222 us [  1236] | } /* <module> */

It seems filter is needed for replay..

@honggyukim
Copy link
Collaborator

Looks great! It works fine in my environment but it shows too many library calls.

$ uftrace --python hello.py
1
# DURATION     TID     FUNCTION
   1.710 us [188685] | strcmp();
   0.337 us [188685] | strcmp();
   1.610 us [188685] | strchr();
   2.457 us [188685] | getenv();
   0.187 us [188685] | strcmp();
   0.150 us [188685] | strcmp();
   0.220 us [188685] | strchr();
   1.513 us [188685] | malloc();
   1.507 us [188685] | memcpy();
   0.560 us [188685] | getenv();
   0.423 us [188685] | getenv();
   0.296 us [188685] | getenv();
   0.330 us [188685] | getenv();
   1.200 us [188685] | fileno();
   3.964 us [188685] | isatty();
   0.480 us [188685] | getenv();
   0.434 us [188685] | getenv();
   0.317 us [188685] | getenv();
   0.370 us [188685] | getenv();
   0.317 us [188685] | getenv();
        ...

So I need to use --no-libcall to see the same result.

$ uftrace --no-libcall --python hello.py
1
# DURATION     TID     FUNCTION
            [188643] | <module>() {
            [188643] |   a() {
   2.354 us [188643] |     b();
   7.917 us [188643] |   } /* a */
  28.287 us [188643] | } /* <module> */
  12.330 us [188643] | _remove();
   5.257 us [188643] | _remove();

@namhyung
Copy link
Owner Author

namhyung commented Aug 9, 2018

Yes, that's why I added -F <module> -t 1us filters. But --no-libcall almost removes values of using uftrace for python IMO.

@DanielTimLee
Copy link
Contributor

DanielTimLee commented Oct 8, 2018

Just an idea,
Are there any chances to make use of cProfile which is python's native profiler?

Well, it might be unpleasant having a dependency on an external module,
but using a native profiler might be having fewer problems with tracing, I guess.

@namhyung
Copy link
Owner Author

I don't know about python profiling internals. But if there's a better way for uftrace, why not? :)

@claudiajkang
Copy link
Contributor

claudiajkang commented Oct 14, 2018

I tested test.py in my environment and I found two issues.

First, when make, if kernel-debug-devel package is not installed, the error occurred.

uftrace detected system features:
...         prefix: /usr/local
...         libelf: [ on  ] - more flexible ELF data handling
...          libdw: [ on  ] - DWARF debug info support
...   libpython2.7: [ on  ] - python scripting support
...    libncursesw: [ on  ] - TUI support
...   cxa_demangle: [ on  ] - full demangler support with libstdc++

......

  GEN      version.h
  CC       uftrace.o
  LINK     uftrace
/home/claudia/uftrace/cmds/info.o: In function `fill_exe_build_id':
/home/claudia/uftrace/cmds/info.c:98: undefined reference to `elf_getshdrstrndx'
/home/claudia/uftrace/cmds/info.c:98: undefined reference to `elf_nextscn'
/home/claudia/uftrace/cmds/info.c:98: undefined reference to `elf_nextscn'
/home/claudia/uftrace/cmds/info.c:98: undefined reference to `gelf_getshdr'
/home/claudia/uftrace/cmds/info.c:105: undefined reference to `elf_strptr'
/home/claudia/uftrace/cmds/info.c:117: undefined reference to `elf_getdata'
/home/claudia/uftrace/cmds/info.c:117: undefined reference to `gelf_getnote'
/home/claudia/uftrace/cmds/tui.o: In function `tui_window_move_down':
/home/claudia/uftrace/cmds/tui.c:1785: undefined reference to `LINES'
/home/claudia/uftrace/cmds/tui.c:1785: undefined reference to `LINES'
/home/claudia/uftrace/cmds/tui.o: In function `win_header_report':
/home/claudia/uftrace/cmds/tui.c:1362: undefined reference to `printw'
/home/claudia/uftrace/cmds/tui.c:1363: undefined reference to `COLS'
/home/claudia/uftrace/cmds/tui.o: In function `win_display_info':
/home/claudia/uftrace/cmds/tui.c:1557: undefined reference to `COLS'
/home/claudia/uftrace/cmds/tui.o: In function `win_header_info':
/home/claudia/uftrace/cmds/tui.c:1514: undefined reference to `COLS'
/home/claudia/uftrace/cmds/tui.o: In function `win_header_session':
/home/claudia/uftrace/cmds/tui.c:1621: undefined reference to `COLS'
....
collect2: error: ld returned 1 exit status
make: *** [/home/claudia/uftrace/uftrace] 오류 1

Second, when I ran test.py from uftrace, the error occurred because of elf format issue.

$ uftrace --python ../test.py 
WARN: Segmentation fault
uftrace: /home/claudia/uftrace/cmds/record.c:387:fill_file_header
 ERROR: cannot open info file: No such file or directory
child terminated by signal: 11: Segmentation fault

I think that the reason is [vdso] section in test.py's process map.

(gdb) !cat proc/<pid>/maps
...
7ffe75d87000-7ffe75d89000 r-xp 00000000 00:00 0                          [vdso]

So I suggest that ELF parsing logic in this PR needs to be updated.

@DanielTimLee
Copy link
Contributor

DanielTimLee commented Oct 14, 2018

(+ It keeps search uftrace python module from the compiled directory.)

ex)

# trying /home/manjaro/git/uf2/misc/uftrace.so
# trying /home/manjaro/git/uf2/misc/uftracemodule.so
# trying /home/manjaro/git/uf2/misc/uftrace.py
# trying trace_python.so
# trying trace_pythonmodule.so
# trying trace_python.py
# trying trace_python.pyc
# trying /home/manjaro/git/uf2/misc/trace_python.so
dlopen("/home/manjaro/git/uf2/misc/trace_python.so", 2);
import trace_python # dynamically loaded from /home/manjaro/git/uf2/misc/trace_python.so

which means, When removing the compiled directory, it won't work at all.

@DanielTimLee
Copy link
Contributor

(+ Due to python's module import logic, Maybe the trace result of import should be hidden with options.)

Example code

from sqlalchemy import create_engine
engine = create_engine('sqlite:///:memory:', echo=True)

This example imports sqlalchemy ORM and provisions SQLite memory DB.

$ uftrace --python test.py
# DURATION     TID     FUNCTION
            [ 31701] | <module>() {
            [ 31701] |   <module>() {
            [ 31701] |     <module>() {
            [ 31701] |       <module>() {
            [ 31701] |         <module>() {
            [ 31701] |           <module>() {
  25.046 us [ 31701] |             <module>();
 764.568 us [ 31701] |             <module>();
  21.142 us [ 31701] |             OrderedDict();
   9.359 us [ 31701] |             Counter();
   2.261 ms [ 31701] |           } /* <module> */
            [ 31701] |           <module>() {
            [ 31701] |             <module>() {
            [ 31701] |               <module>() {
 356.349 us [ 31701] |                 <module>();
   2.810 us [ 31701] |                 GeneratorContextManager();
            [ 31701] |                 contextmanager() {
   4.267 us [ 31701] |                   wraps();
  18.103 us [ 31701] |                   update_wrapper();
  33.232 us [ 31701] |                 } /* contextmanager */
   2.664 us [ 31701] |                 closing();
 563.707 us [ 31701] |               } /* <module> */
            [ 31701] |               <module>() {
            [ 31701] |                 filterwarnings() {
            [ 31701] |                   compile() {
            [ 31701] |                     _compile() {
   3.774 us [ 31701] |                       isstring();
            [ 31701] |                       compile() {
   2.633 us [ 31701] |                         isstring();
            [ 31701] |                         parse() {
            [ 31701] |                           __init__() {
   7.184 us [ 31701] |                             __next();
  12.869 us [ 31701] |                           } /* __init__ */
   3.479 us [ 31701] |                           __init__();
            [ 31701] |                           _parse_sub() {
            [ 31701] |                             _parse() {
   3.547 us [ 31701] |                               __init__();
            [ 31701] |                               get() {
   5.965 us [ 31701] |                                 __next();
  10.091 us [ 31701] |                               } /* get */
   2.023 us [ 31701] |                               append();
            [ 31701] |                               get() {
   4.505 us [ 31701] |                                 __next();
   7.912 us [ 31701] |                               } /* get */
   1.372 us [ 31701] |                               append();
            [ 31701] |                               get() {
   4.787 us [ 31701] |                                 __next();
   7.777 us [ 31701] |                               } /* get */
   1.154 us [ 31701] |                               append();
$ uftrace -D 2 --python test.py
# DURATION     TID     FUNCTION
            [  4991] | <module>() {
 179.633 ms [  4991] |   <module>();
  12.627 ms [  4991] |   create_engine();
 192.395 ms [  4991] | } /* <module> */
            [  4991] | _exitfunc() {
  14.307 us [  4991] |   __stop();
  13.025 us [  4991] |   _pickSomeNonDaemonThread();
   0.967 us [  4991] |   _note();
   1.976 us [  4991] |   __delete();
  37.923 us [  4991] | } /* _exitfunc */
            [  4991] | _run_exitfuncs() {
  47.101 us [  4991] |   shutdown();
  53.613 us [  4991] | } /* _run_exitfuncs */
   5.729 us [  4991] | _remove();
   1.544 us [  4991] | _remove();

Most of the trace results are module import logic, and I think import logic looks aren't that necessary by default with tracing.

(Ps. tracing should provide much more info than <module>, how about showing with module path?)

@namhyung
Copy link
Owner Author

namhyung commented Dec 16, 2018

Pushed review/python-tracing-v2

Changelog:

  • fix segfault due to broken debug info setup
  • enable debug logging
  • get module name for <module> functions
$ uftrace --python test2.py | head
# DURATION     TID     FUNCTION
            [  8982] | <module:__main__>() {
            [  8982] |   <module:sqlalchemy>() {
            [  8982] |     <module:sqlalchemy.sql>() {
            [  8982] |       <module:sqlalchemy.sql.expression>() {
            [  8982] |         <module:sqlalchemy.sql.visitors>() {
            [  8982] |           <module:collections>() {
  13.548 us [  8982] |             <module:keyword>();
 600.704 us [  8982] |             <module:heapq>();
  14.237 us [  8982] |             OrderedDict();

@honggyukim
Copy link
Collaborator

Thanks for the update. I just tested it with a python version of t-fork, but it needs to fix the depth for os.fork.

$ uftrace --python --no-libcall p-fork.py
# DURATION     TID     FUNCTION
            [148263] | <module:__main__>() {
            [148263] |   main() {
            [148263] |     /* linux:sched-out */
            [148274] | a() {
            [148274] |   b() {
  45.707 us [148274] |     c();
  69.408 us [148274] |   } /* b */
 106.975 us [148274] | } /* a */
 109.289 us [148274] | } /* main */
 117.109 us [148274] | } /* <module:__main__> */
  12.327 us [148274] | _remove();
   5.727 us [148274] | _remove();
   7.730 ms [148263] |     /* linux:sched-in */
            [148263] |     a() {
            [148263] |       b() {
  10.984 us [148263] |         c();
  24.301 us [148263] |       } /* b */
  34.991 us [148263] |     } /* a */
   8.435 ms [148263] |   } /* main */
   8.456 ms [148263] | } /* <module:__main__> */
  12.163 us [148263] | _remove();
   5.090 us [148263] | _remove();

Here is the test program, p-fork.py.

#!/usr/bin/env python

import os

def c():
    return os.getpid() % 100000

def b():
    return c() + 1

def a():
    return b() - 1

def main():
    ret = 0
    pid = os.fork()
    if pid:
        os.wait()

    ret += a()

    return ret

if __name__=='__main__':
    main()

@quark-zju
Copy link

quark-zju commented Dec 29, 2018

Regarding on cProfile, it uses the same PyEval_SetProfile API under the hood.

One disadvantage of cProfile that does not exist in uftrace is, cProfile prints a flat list of functions, and it's hard to see the relationship between functions. For example, if one code path calls read(1) and another code path calls calls read(1e10), it's hard to find the slow code path just from the cProfile output. I also had a quick-and-dirty profiler to workaround cProfile limitation.

(EDITED: Removed unrelated content)

@quark-zju
Copy link

quark-zju commented Dec 29, 2018

Some comments on review/python-tracing-v2. Mostly about reducing overhead:

  • Use PyEval_SetProfile to reduce overhead (and avoid PyObject_GetAttrString accordingly by accessing Python structs directly like ((PyFrameObject *)frame)->f_code).
  • Use the address of PyCodeObject as the function address. I think this is what cProfile uses and I have been using the same approach in traceprof (mentioned above) - it worked pretty well and is much faster than manually maintaining the function address mapping. You'll need to Py_INCREF the code object to keep them alive.
  • Since libpython is already a required dependency, consider "inline" the Python extension, by calling Py_Initialize, followed by PyEval_SetProfile, then PyRun_SimpleFile to run a script.

@namhyung
Copy link
Owner Author

@quark-zju thanks for detailed comments, I'll take a look at them to reduce overheads. Note that libpython is not a required dependency of uftrace (and I don't want to make it is) and we load it with dlopen() if needed. But it's a different situaltion so ok to use PyEval_SetProfile() IMO.

@namhyung
Copy link
Owner Author

@honggyukim yes it doesn't consider multi-process fully yet - but I guess it's mostly for works during replay.

@quark-zju
Copy link

Thanks! Making libpython optional was a wise decision.

It seems the next missing thing would be exception handling. It can probably be done by following f_back from PyFrameObject recursively to keep track the Python stack and emulate those missed "exit"s like mcount_rstack_reset_exception.

@namhyung
Copy link
Owner Author

Probably, but is there a way for python interpreter to let us know about the exception info?

@quark-zju
Copy link

quark-zju commented Dec 31, 2018

Actually, I was wrong. According to https://github.com/python/cpython/blob/62be74290aca26d16f3f55ece7ff6dad14e60e8d/Modules/_lsprof.c#L454, PyTrace_RETURN will be generated, so there is no need to specially handling exceptions. But arg should also provide the exception information in the PyTrace_EXCEPTION case. The profiler I wrote was a quick-and-dirty work and I didn't pay close attention to exception handling.

@honggyukim
Copy link
Collaborator

honggyukim commented Feb 13, 2019

I was trying to test it for OE bitbake script, but it doesn't accept additional arguments for python script.

Here is a test script.

$ cat arg-test.py
#!/usr/bin/env python
import sys
print(sys.argv)
$ ./arg-test.py
['./arg-test.py']

$ ./arg-test.py --arg1
['./arg-test.py', '--arg1']

$ ./arg-test.py --arg1 --arg2
['./arg-test.py', '--arg1', '--arg2']

But the result is not same as original execution.

$ uftrace record --no-libcall --python ./arg-test.py
['/home/honggyu/work/uftrace/misc/uftrace.py', './arg-test.py']

$ uftrace record --no-libcall --python ./arg-test.py --arg1
['/home/honggyu/work/uftrace/misc/uftrace.py', './arg-test.py']

$ uftrace record --no-libcall --python ./arg-test.py --arg1 --arg2
['/home/honggyu/work/uftrace/misc/uftrace.py', './arg-test.py']

The result doesn't show --arg1 and --arg2 in the output.

@namhyung
Copy link
Owner Author

Pushed review/python-tracing-v3

Changelog:

  • handle command line argument

It was easy thanks to recent script file support but I think we need to consider supporting python3 as well.

$ uftrace record --python ./arg-test.py 1 2 3
['/home/namhyung/project/uftrace/misc/uftrace.py', './arg-test.py', '1', '2', '3']

@honggyukim
Copy link
Collaborator

honggyukim commented Feb 17, 2019

Thanks for the update! I just updated to make both python2 and python3 work based on review/python-tracing-v3.

I pushed it into honggyukim/review/python3-tracing-v1.

Here are test scripts.

$ cat p-abc.py
#!/usr/bin/env python
import os
def c():
    return os.getpid()
def b():
    return c()
def a():
    return b()
print(a())

The below script is same as above, but just uses python3.

$ cat p-abc3.py
#!/usr/bin/env python3
import os
def c():
    return os.getpid()
def b():
    return c()
def a():
    return b()
print(a())

It works both python2 and python3 scripts as follows.

$ uftrace --no-libcall --python p-abc.py
171054
# DURATION     TID     FUNCTION
            [171054] | <module:__main__>() {
            [171054] |   a() {
            [171054] |     b() {
   3.380 us [171054] |       c();
   7.413 us [171054] |     } /* b */
  10.427 us [171054] |   } /* a */
  49.587 us [171054] | } /* <module:__main__> */
  10.977 us [171054] | _remove();
   4.677 us [171054] | _remove();
$ uftrace --no-libcall --python p-abc3.py
171068
# DURATION     TID     FUNCTION
  13.120 us [171068] | getpreferredencoding();
            [171068] | __init__() {
   3.143 us [171068] |   __init__();
  11.880 us [171068] | } /* __init__ */
  15.984 us [171068] | decode();
            [171068] | <module:__main__>() {
            [171068] |   a() {
            [171068] |     b() {
   3.194 us [171068] |       c();
   8.030 us [171068] |     } /* b */
  12.003 us [171068] |   } /* a */
  60.443 us [171068] | } /* <module:__main__> */

Please take a look. Thanks!

@honggyukim
Copy link
Collaborator

honggyukim commented Feb 17, 2019

@chbae I also tested it for poky bitbake script, which is written in python3 and the tracing result looks fine although it needs to fix some symbol names.

$ uftrace record --no-libcall --python ./bitbake/bin/bitbake zip-native
   ...
$ uftrace replay -t 2s
# DURATION     TID     FUNCTION
            [171177] | <module:__main__>() {
            [171177] |   _find_and_load() {
            [171177] |     _find_and_load_unlocked() {
            [171177] |       _load_unlocked() {
            [171177] |         exec_module() {
            [171177] |           _call_with_frames_removed() {
            [171177] |             <module:bb>() {
            [171177] |               _handle_fromlist() {
            [171177] |                 _call_with_frames_removed() {
            [171177] |                   _find_and_load() {
            [171177] |                     _find_and_load_unlocked() {
            [171177] |                       _load_unlocked() {
            [171177] |                         exec_module() {
            [171177] |                           _call_with_frames_removed() {
   2.526  s [171177] |                             <module:bb.fetch2>();
   2.526  s [171177] |                           } /* _call_with_frames_removed */
   2.527  s [171177] |                         } /* exec_module */
   2.527  s [171177] |                       } /* _load_unlocked */
   2.528  s [171177] |                     } /* _find_and_load_unlocked */
   2.528  s [171177] |                   } /* _find_and_load */
   2.528  s [171177] |                 } /* _call_with_frames_removed */
   2.528  s [171177] |               } /* _handle_fromlist */
   2.757  s [171177] |             } /* <module:bb> */
   2.757  s [171177] |           } /* _call_with_frames_removed */
   2.757  s [171177] |         } /* exec_module */
   2.757  s [171177] |       } /* _load_unlocked */
   2.758  s [171177] |     } /* _find_and_load_unlocked */
   2.758  s [171177] |   } /* _find_and_load */
            [171177] |   bitbake_main() {
            [171177] |     setup_bitbake() {
   2.004  s [171177] |       __init__();
            [171197] | <6e9>() {
            [171197] |   <6bb>() {
            [171197] |     <6ed>() {
            [171197] |       /* linux:task-name (name=Cooker) */
   2.040  s [171177] |     } /* setup_bitbake */
            [171177] |     main() {
            [171177] |       /* linux:task-name (name=KnottyUI) */
            [171197] |       <6f4>() {
            [171197] |         <721>() {
            [171197] |           <722>() {
            [171197] |             <723>() {
   2.592  s [171197] |               __init__();
   3.065  s [171197] |             } /* <723> */
   3.066  s [171197] |           } /* <722> */
   3.066  s [171197] |         } /* <721> */
   3.068  s [171197] |       } /* <6f4> */
   7.695  s [171177] |     } /* main */
   9.735  s [171177] |   } /* bitbake_main */
  12.691  s [171177] | } /* <module:__main__> */
   7.847  s [171197] |     } /* <6ed> */
   7.849  s [171197] |   } /* <6bb> */
   7.850  s [171197] | } /* <6e9> */

@honggyukim
Copy link
Collaborator

It'd be really great if we can also add source location info for python functions.

@honggyukim
Copy link
Collaborator

honggyukim commented Feb 23, 2019

@namhyung I just created a PR at #680 maybe for easier review. You don't have to make comment on the PR, but might be simpler to leave inline review comments.

@honggyukim
Copy link
Collaborator

It seems that we can also provide Python C API functions as built-in functions that can be processed by --auto-args. At least, const char* typed functions can be seen a lot better than no args.

@honggyukim
Copy link
Collaborator

Thanks. I don't see the problem WARN: call depth beyond 1024 is not recorded with v11. The nesting depth looks good now.

defer sys.setprofile() to skip unrelated functions

It looks good but can we add an option not to defer sys.setprofile() just like we have --kernel-full.

I still see difference when running pip install pre-commit without giving full path of pip.

$ uftrace record pip install pre-commit
pip: can't open file '/home/honggyu/work/uftrace/git/python/install': [Errno 2] No such file or directory

I still need to give a full path with v11 as follows.

$ uftrace record `which pip` install pre-commit
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: pre-commit in /home/honggyu/.local/lib/python3.10/site-packages (2.21.0)
Requirement already satisfied: cfgv>=2.0.0 in /home/honggyu/.local/lib/python3.10/site-packages (from pre-commit) (3.3.1)
Requirement already satisfied: identify>=1.0.0 in /home/honggyu/.local/lib/python3.10/site-packages (from pre-commit) (2.5.11)
Requirement already satisfied: pyyaml>=5.1 in /usr/lib/python3/dist-packages (from pre-commit) (5.4.1)
Requirement already satisfied: virtualenv>=20.10.0 in /home/honggyu/.local/lib/python3.10/site-packages (from pre-commit) (20.17.1)
Requirement already satisfied: nodeenv>=0.11.1 in /home/honggyu/.local/lib/python3.10/site-packages (from pre-commit) (1.7.0)
Requirement already satisfied: setuptools in /usr/lib/python3/dist-packages (from nodeenv>=0.11.1->pre-commit) (59.6.0)
Requirement already satisfied: filelock<4,>=3.4.1 in /home/honggyu/.local/lib/python3.10/site-packages (from virtualenv>=20.10.0->pre-commit) (3.9.0)
Requirement already satisfied: distlib<1,>=0.3.6 in /home/honggyu/.local/lib/python3.10/site-packages (from virtualenv>=20.10.0->pre-commit) (0.3.6)
Requirement already satisfied: platformdirs<3,>=2.4 in /home/honggyu/.local/lib/python3.10/site-packages (from virtualenv>=20.10.0->pre-commit) (2.6.2)

@honggyukim
Copy link
Collaborator

In addition, it would be really great if we can provide source location info for python code as well. Then we can jump to the actual python source code in tui command. Would it be possible?

@namhyung
Copy link
Owner Author

namhyung commented Feb 6, 2023

Yep, I think it's doable. I'll leave it as the next step.

@namhyung
Copy link
Owner Author

namhyung commented Feb 6, 2023

It looks good but can we add an option not to defer sys.setprofile() just like we have --kernel-full.

I'm not sure about it. Anyway what it shows is the implementation of our loader script uftrace.py, not the general python startup code.

@namhyung
Copy link
Owner Author

namhyung commented Feb 6, 2023

I still see difference when running pip install pre-commit without giving full path of pip.

Ok, I found a problem dealing with script files. It should check PATH env to get the full pathname when accessing the file.

@namhyung
Copy link
Owner Author

namhyung commented Feb 6, 2023

Pushed review/python-tracing-v12

Changes:

  • handle pathname of script using PATH environ variable
  • do not skip the first frame for Python2

Now it can run pip script without the full pathname!!

$ uftrace record pip install pre-commit
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: pre-commit in /usr/lib/python3/dist-packages (2.20.0)

@honggyukim
Copy link
Collaborator

Pushed review/python-tracing-v12

It looks great. Thanks.

I'm not sure about it. Anyway what it shows is the implementation of our loader script uftrace.py, not the general python startup code.

Then it's okay to skip it.

One more thing is that we added trace-python.c and uftrace.py in misc directory, but I feel we're adding too many different stuffs under misc. How about create a new directory python for python tracing?

@honggyukim
Copy link
Collaborator

One more thing is if #1605 is merged before this, then we should use uftrace_shm_open instead of direct shm_open. Likewise, shm_unlink has to be changed as well.

@namhyung
Copy link
Owner Author

namhyung commented Feb 7, 2023

One more thing is that we added trace-python.c and uftrace.py in misc directory, but I feel we're adding too many different stuffs under misc. How about create a new directory python for python tracing?

Good idea. Will move.

One more thing is if #1605 is merged before this, then we should use uftrace_shm_open instead of direct shm_open. Likewise, shm_unlink has to be changed as well.

Yep, I plan to merge it before this work. I'll take care of the change.

@honggyukim
Copy link
Collaborator

It works fine with pre-commit run -a as follows.

$ uftrace record pre-commit run -a
Check Yaml...............................................................Passed
Fix End of Files.........................................................Passed
Trim Trailing Whitespace.................................................Passed
isort....................................................................Passed
clang-format.............................................................Passed
codespell................................................................Passed

If I run uftrace tui, it shows multiple graph sessions.

Key uftrace command
 G  call Graph for session #1: python3.10
    call Graph for session #2: python3.10
    call Graph for session #3: python3.10
    call Graph for session #4: python3.10
    call Graph for session #5: python3.10
    call Graph for session #6: python3.10
    call Graph for session #7: python3.10
    call Graph for session #8: python3.10
    call Graph for session #9: python3.10
    call Graph for session #10: python3.10
    call Graph for session #11: python3.10
    call Graph for session #12: python3.10
    call Graph for session #13: python3.10
    call Graph for session #14: python3.10
    call Graph for session #15: python3.10
    call Graph for session #16: python3.10
    call Graph for session #17: python3.10
    call Graph for session #18: python3.10
    call Graph for session #19: python3.10
    call Graph for session #20: python3.10
    call Graph for session #21: python3.10
    call Graph for session #22: python3.10
 R  Report functions
 I  uftrace Info
 h  Help message
 q  quit

But most of the graph sessions are empty except for the first session. I'm just wondering if it correctly follows the underline python scripts that run each pre-commit hooks.

@honggyukim
Copy link
Collaborator

honggyukim commented Feb 7, 2023

Another test with b4 works fine.

$ uftrace record b4 am https://lore.kernel.org/all/20221219201732.460111-7-namhyung@kernel.org/
Analyzing 10 messages in the thread                                         
Checking attestation on all messages, may take a moment...
---                                                                         
  ✓ [PATCH 1/6] perf lock contention: Factor out lock_type_table
  ✓ [PATCH 2/6] perf lock contention: Add -Y/--type-filter option
    + Acked-by: Namhyung Kim <namhyung@kernel.org>                  
  ✓ [PATCH 3/6] perf lock contention: Support lock type filtering for BPF
  ✓ [PATCH 4/6] perf lock contention: Add -L/--lock-filter option
  ✓ [PATCH 5/6] perf lock contention: Support lock addr/name filtering for BPF
  ✓ [PATCH 6/6] perf test: Update perf lock contention test       
  ---                                                                       
  ✓ Signed: DKIM/gmail.com (From: namhyung@kernel.org)         
  ---                                                                       
  NOTE: install patatt for end-to-end signature verification
---                                  
Total patches: 6  
---                                                                         
Cover: ./20221219_namhyung_perf_lock_contention_add_more_filter_options_v1.cover
 Link: https://lore.kernel.org/r/20221219201732.460111-1-namhyung@kernel.org 
 Base: 51c4f2bf5397b34b79a6712221606e0ab2e6f7ed
       git checkout -b 20221219_namhyung_kernel_org 51c4f2bf5397b34b79a6712221606e0ab2e6f7ed
       git am ./20221219_namhyung_perf_lock_contention_add_more_filter_options_v1.mbx
$ uftrace tui -t 10ms
  TOTAL TIME : FUNCTION
  838.369 ms : (1) python3.10
  838.369 ms : (1) __main__.<module>
   67.895 ms :  ├▶(1) importlib._bootstrap._find_and_load
             :  │
  344.847 ms :  ├▶(1) importlib_load_entry_point
             :  │
  418.234 ms :  └─(1) b4.command.cmd
  394.371 ms :    (1) b4.command.cmd_am
  388.535 ms :    (1) b4.mbox.main
   70.639 ms :     ├▶(1) b4.mbox.get_msgs
             :     │
  317.877 ms :     └─(1) b4.mbox.make_am
   25.043 ms :        ├▶(2) b4.add_message
             :        │
  214.915 ms :        ├─(1) b4.get_am_ready
  121.689 ms :        │  ├─(6) b4.get_attestation_trailers
  121.118 ms :        │  │ (6) b4.attestors
  119.869 ms :        │  │ (6) b4._load_dkim_attestors
   70.559 ms :        │  │  ├─(6) email.message.as_bytes
   70.481 ms :        │  │  │ (6) email.generator.flatten
   70.407 ms :        │  │  │ (6) email.generator._write
   40.920 ms :        │  │  │ (4) email.generator._write_headers
             :        │  │  │
   13.018 ms :        │  │  └▶(1) dkim.verify
             :        │  │
   83.517 ms :        │  └▶(6) b4.get_am_message
             :        │
   18.807 ms :        ├─(1) b4.save_git_am_mbox
             :        │
   13.855 ms :        └▶(1) b4.save_cover

@namhyung
Copy link
Owner Author

namhyung commented Feb 8, 2023

I'm just wondering if it correctly follows the underline python scripts that run each pre-commit hooks.

Probably not. It needs to add -m uftrace_python option to the Python interpreter (if it's called from a command line). We do it in the record command for a script we invoke. But I'm not sure how to handle if it calls other external scripts.

@honggyukim
Copy link
Collaborator

Probably not. It needs to add -m uftrace_python option to the Python interpreter (if it's called from a command line). We do it in the record command for a script we invoke. But I'm not sure how to handle if it calls other external scripts.

It would be useful if there is an environment variable to add a module by default but I don't have an idea how to handle it.

I see another bugs when running runtest.py for multiple tests as follows.

$ uftrace record ./runtest.py 00[1-5]
Start 5 tests with 5 worker
Compiler                  gcc                                           clang                                       
Test case                 pg             finstrument-fu fpatchable-fun  pg             finstrument-fu fpatchable-fun
------------------------: O0 O1 O2 O3 Os O0 O1 O2 O3 Os O0 O1 O2 O3 Os  O0 O1 O2 O3 Os O0 O1 O2 O3 Os O0 O1 O2 O3 Os
001 basic               : OK OK OK OK OK OK OK OK OK OK SG SG SG SG SG  OK OK OK OK OK OK OK OK OK OK SG SG SG SG SG
002 argument            : OK OK OK OK OK OK OK OK OK OK SG SG SG SG SG  OK OK OK OK OK OK OK OK OK OK SG SG SG SG SG
003 thread              : OK OK OK OK OK OK OK OK OK OK SG SG SG SG SG  OK OK OK OK OK OK OK OK OK OK SG SG SG SG SG
004 filter_F            : NG NG NG NG NG NG NG NG NG NG SG SG SG SG SG  NG NG NG NG NG NG NG NG NG NG SG SG SG SG SG
005 filter_N            : NG NG NG NG NG NG NG NG NG NG SG SG SG SG SG  NG NG NG NG NG NG NG NG NG NG SG SG SG SG SG


runtime test stats
====================
total   150  Tests executed (success: 40.00%)
  OK:    60  Test succeeded
  OK:     0  Test succeeded (with some fixup)
  NG:    40  Different test result
  NZ:     0  Non-zero return value
  SG:    50  Abnormal exit by signal
  TM:     0  Test ran too long
  BI:     0  Build failed
  LA:     0  Unsupported Language
  SK:     0  Skipped
WARN: Segmentation fault: address not mapped (addr: 0x7fd5252c8154)
WARN:  if this happens only with uftrace, please consider -e/--estimate-return option.

WARN: Backtrace from uftrace v0.13.1-23-gb8b4 ( x86_64 dwarf python3 luajit tui perf sched dynamic )
WARN: =====================================
WARN: [14] (<2ca9>[2ca9] <= <0>[0])
WARN: [13] (<2d6b>[2d6b] <= <0>[0])
WARN: [12] (<318f>[318f] <= <0>[0])
WARN: [11] (<3189>[3189] <= <0>[0])
WARN: [10] (<3188>[3188] <= <0>[0])
WARN: [9] (<30fc>[30fc] <= <0>[0])
WARN: [8] (<2c9d>[2c9d] <= <0>[0])
WARN: [7] (<2c99>[2c99] <= <0>[0])
WARN: [6] (<2c6a>[2c6a] <= <0>[0])
WARN: [5] (<2c66>[2c66] <= <0>[0])
WARN: [4] (<2e74>[2e74] <= <0>[0])
WARN: [3] (<2e73>[2e73] <= <0>[0])
WARN: [2] (<2de3>[2de3] <= <0>[0])
WARN: [1] (<2de1>[2de1] <= <0>[0])
WARN: [0] (<1>[1] <= <0>[0])

Please report this bug to https://github.com/namhyung/uftrace/issues.

It didn't see the problem when running 1 or 2 tests but increasing the number of tests makes this problem.

@honggyukim
Copy link
Collaborator

And the following test shows NG.

$ uftrace record ./runtest.py -p -c gcc -O0 004
Start 1 tests without worker pool
Compiler                  gc
Test case                 pg
------------------------: O0
004 filter_F            : NG

@namhyung
Copy link
Owner Author

The runtest can be tricky since it runs uftrace under uftrace. The options are passed by env so it can be affected by outer uftrace.

@honggyukim
Copy link
Collaborator

OK. Then please ignore the result of runtest.py.

@namhyung
Copy link
Owner Author

Pushed review/python-tracing-v13

Changes:

  • move the python support code to python/ directory
  • use the new shmem APIs
  • add a unit test case

@honggyukim
Copy link
Collaborator

Thanks for the update. It looks good but it would be better if we can run some python feature tests.

It looks cpython internally has some tests and one of the test shows some failures when recording with uftrace.

$ git clone https://github.com/python/cpython.git
$ cd cpython/Lib/test

And apply the following change to make it running by uftrace.

diff --git a/Lib/test/test_array.py b/Lib/test/test_array.py
index 5b2c107a60..eb080d1e06 100755
--- a/Lib/test/test_array.py
+++ b/Lib/test/test_array.py
@@ -1,3 +1,4 @@
+#!/usr/bin/env python
 """Test the arraymodule.
    Roger E. Masse
 """

Then the test works fine showing OK at the end.

$ ./test_array.py
.....................................................................................................................................................................................
.....................................................................................................................................................................................
.....................................................................................................................................................................................
.....................................................................................................................................................................................
.........................................................................................
----------------------------------------------------------------------
Ran 813 tests in 0.242s

OK

I tried to run it with uftrace and it shows some failures at the end of execution as follows.

$ uftrace record ./test_array.py
...............................................................F............................................................F........................................................
...F............................................................F..........................................................................F.........................................
...................F.................................................................F......................................................F........................................
.....................F.............................................................F.............................................................F...................................
..........................F.............................................................F
======================================================================
FAIL: test_weakref (__main__.ByteTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<string>", line 1103, in test_weakref
AssertionError: ReferenceError not raised by len

======================================================================
FAIL: test_weakref (__main__.DoubleTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<string>", line 1103, in test_weakref
AssertionError: ReferenceError not raised by len

======================================================================
FAIL: test_weakref (__main__.FloatTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<string>", line 1103, in test_weakref
AssertionError: ReferenceError not raised by len
    ...
======================================================================
FAIL: test_weakref (__main__.UnsignedShortTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<string>", line 1103, in test_weakref
AssertionError: ReferenceError not raised by len

----------------------------------------------------------------------
Ran 813 tests in 1.333s

FAILED (failures=13)

@honggyukim
Copy link
Collaborator

I see a segfault this time in cpython test.

But this test only works fine when I checkout cpython repo to the same version of my running python v3.10.6, then applied the following change.

diff --git a/Lib/test/test_uuid.py b/Lib/test/test_uuid.py
index d6a8333427..11ee1e9c6d 100755
--- a/Lib/test/test_uuid.py
+++ b/Lib/test/test_uuid.py
@@ -1,3 +1,4 @@
+#!/usr/bin/env python
 import unittest
 from test import support
 from test.support import import_helper

The test runs fine.

$ ./test_uuid.py
s...s.ss....ss...s.ss............................s.sss.s....
----------------------------------------------------------------------
Ran 60 tests in 0.037s

OK (skipped=14)

But it gets crashed as follows when running with uftrace record.

$ uftrace record ./test_uuid.py
s...s.ss....ss...s.ss............................s.sss.s....
----------------------------------------------------------------------
Ran 60 tests in 0.236s

OK (skipped=14)
WARN: child terminated by signal: 11: Segmentation fault

@namhyung
Copy link
Owner Author

Some reference count errors are expected as it increases refcnt of code object for the symbol table. I guess it won't be a problem in practice unless the python code is extremely sensitive to the refcnt behavior.

@namhyung namhyung changed the title PoC: python function tracing support python function tracing support Feb 26, 2023
@namhyung
Copy link
Owner Author

Let's talk about them in separate issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants