Mark time spent in 'poll()' as idle #13

Closed
benfred opened this issue Sep 6, 2018 · 8 comments


@benfred
Owner

benfred commented Sep 6, 2018

From HN:

Also, it would be nice to somehow eliminate time spent in poll() from the results. I'm profiling a server process and 99.9% of the time is spent in a poll function. Perhaps there could be an option to disregard time spent in system calls rather than user code. Most of the time I'm interested in profiling only user code.

@benfred
Owner Author

benfred commented Sep 6, 2018

We will want better idle detection eventually, but in the meantime, adding poll() here seems like a good quick fix: https://github.com/benfred/py-spy/blob/master/src/stack_trace.rs#L67

@willstott101

First of all, great project. Secondly, in our framework the vast majority of a process's time is spent in the poll function from zmq: .../dist-packages/zmq/sugar/poll.py:99. I don't know much about this problem space from a technical POV, but I assume it would be near-impossible to get it right by default every time. Perhaps a command-line option to specify functions that are known to be interesting/uninteresting would be nice?

Users would at least know about the possibility of false positives if they're setting the values themselves.

--include mypkg/*
--exclude poll()
--exclude zmq/*/poll()
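
One way such patterns could be interpreted is sketched below. This is only an illustration of the proposal, not py-spy behavior: the functions frame_matches and keep_frame are hypothetical, and the semantics are assumptions (a trailing "name()" selects a function, everything before it is a path glob, and excludes take priority over includes).

```python
from fnmatch import fnmatchcase


def frame_matches(pattern, filename, function):
    """Return True if a (filename, function) frame matches a glob-style pattern.

    Assumed syntax: a trailing "name()" selects a function name; anything
    before it (or the whole pattern, if there is no "()") is a path glob.
    """
    path_glob, func_glob = pattern, None
    if pattern.endswith("()"):
        path_glob, _, func_glob = pattern[:-2].rpartition("/")
    if func_glob is not None and not fnmatchcase(function, func_glob):
        return False
    if not path_glob:
        return True  # e.g. "poll()" matches that function in any file
    if func_glob is not None:
        path_glob += "/*"  # "zmq/*/poll()" -> any file under "zmq/*/"
    # Anchor the glob on a directory boundary anywhere in the path.
    # Note that fnmatch's "*" also matches "/" characters.
    return fnmatchcase("/" + filename, "*/" + path_glob)


def keep_frame(filename, function, include=(), exclude=()):
    """Apply include/exclude pattern lists to one frame (assumed semantics)."""
    if any(frame_matches(p, filename, function) for p in exclude):
        return False
    return not include or any(frame_matches(p, filename, function) for p in include)
```

With these assumptions, keep_frame(path, "poll", exclude=["zmq/*/poll()"]) would drop the zmq poll frame while leaving other functions in the same file untouched.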

@balta2ar

balta2ar commented Sep 10, 2018

@benfred Could you please update the HN link? It requires login now.
Here it goes: https://news.ycombinator.com/item?id=17928883

benfred added a commit that referenced this issue Sep 11, 2018
This removes some common poll calls from polluting the results. Code from
zmq/gevent/tornado/asyncore should be removed now.

#13
@benfred
Owner Author

benfred commented Sep 11, 2018

@willstott101 I think it's a good idea to add some configuration options to exclude certain patterns. It's impossible for me to get this right 100% of the time, and I don't get enough information from just reading the PyInterpreterState to figure out whether the thread is actually idle.

I'm thinking of adding a config file instead of command-line options (so that you don't need to add the options every time). For example, if there is a .pyspyrc file in the current directory or your home directory, exclusions would be loaded from it, so the config persists across sessions.
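
A minimal sketch of that lookup follows. The file format is not settled anywhere in this thread, so everything here is an assumption: one exclusion pattern per line, "#" starting a comment, and the current directory taking priority over the home directory. The function names parse_exclusions and load_exclusions are hypothetical.

```python
from pathlib import Path


def parse_exclusions(text):
    """Parse rc-file text: one pattern per line, '#' starts a comment (assumed format)."""
    patterns = []
    for line in text.splitlines():
        pattern = line.split("#", 1)[0].strip()
        if pattern:
            patterns.append(pattern)
    return patterns


def load_exclusions(search_dirs=None, name=".pyspyrc"):
    """Return patterns from the first rc file found, or [] if there is none."""
    if search_dirs is None:
        # Assumed priority: current directory first, then the home directory.
        search_dirs = [Path.cwd(), Path.home()]
    for directory in search_dirs:
        rc_file = Path(directory) / name
        if rc_file.is_file():
            return parse_exclusions(rc_file.read_text(encoding="utf-8"))
    return []
```

Stopping at the first file found keeps a per-project file from being silently merged with a user-wide one; merging both would be an equally plausible design.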

I'm also working on some new visualization code that should let you filter/zoom in more easily.

In the meantime, I've marked the poll function from zmq/tornado/gevent/asyncore as being idle in 23c0796. It will be in the next release.

@balta2ar I updated the HN link =) I didn't realize it was a reply link.

@willstott101

Great, thanks very much. Although I'm thinking about using this as a subprocess, in which case the command-line options make a lot of sense... perhaps both would be doable <3

I might have a hack about trying to create something for this issue if you're interested - although I wouldn't rely on myself to produce anything releasable from the off.

Do you have any idea what that config might look like? I quite liked the glob-style patterns I showed, with () denoting a function name.

@nziebart

I would like to understand this issue a bit more. I am seeing the following functions being marked as active:

  • time.sleep() (for some reason the sleep function itself doesn't appear in the stacktrace, the last frame is the calling function with the line number corresponding to where sleep is called)
  • this line in tornado (it looks like there is logic here that is supposed to handle this, but again the poll function doesn't show up in the stacktrace, only the calling function does)

I guess my main question is why these functions don't actually appear in the stack trace (thus eluding any attempt to special-case them)?

@EmreAtes

Another function could be added to the default ignore list:

  • The wait function in eventlet here

Also, a configuration file would be very helpful for people without a Rust compiler.

@benfred
Owner Author

benfred commented Sep 28, 2018

@nziebart: we don't track into native code yet, so we show the time spent as being in the line of Python code that called the native extension (rather than in the native extension itself). For both of your cases, I believe the functions are native code, so you're hitting this issue: #2

I've also created an issue to track the config file: #51. I'm still not sure of the best syntax for the exclusions, though the glob style @willstott101 proposed seems like it could work pretty well.
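
The native-code behavior described above can be observed from inside CPython itself. The sketch below is an in-process illustration only, not how py-spy samples (py-spy reads another process's memory): it uses the CPython-specific sys._current_frames() to look at a thread blocked in time.sleep(), and the names worker and sample_top_frame are made up for the example.

```python
import sys
import threading
import time

ready = threading.Event()


def worker():
    ready.set()
    time.sleep(0.5)  # native (C) function: no Python frame is created for it


def sample_top_frame(thread):
    """Return (function name, line number) of the thread's innermost Python frame."""
    frame = sys._current_frames()[thread.ident]
    return frame.f_code.co_name, frame.f_lineno


t = threading.Thread(target=worker)
t.start()
ready.wait()
time.sleep(0.1)  # give the worker time to block inside sleep()
name, lineno = sample_top_frame(t)
print(name, lineno)  # the innermost Python frame is worker(), at the sleep() call
t.join()
```

Because the innermost Python frame belongs to the caller, a name-based rule for sleep() has nothing to match against, which is consistent with the observation earlier in the thread.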

@benfred benfred closed this as completed Sep 28, 2018