ptrace errors on linux kernel older than v4.7 #83
Comments
The above was for python3. With python2, the following error is shown:
|
adding |
What OS are you running? Can you enable logging and paste the output here? (Run export RUST_LOG=info on osx/linux, or set RUST_LOG=info in cmd.exe on windows, then run the command again.) I'm guessing it's failing or hanging while pausing the process to get samples. If you are on linux/osx it might be worth trying to run this again with sudo, just in case it's a permissions problem |
It's an Ubuntu 16.04 docker container, running on a CentOS 7.4 host.
Here is the output of running
|
P.S. I can also reproduce it on the host machine itself (using python2). I don't have Running
|
|
The numpy thing is especially weird. I could see it failing consistently or not, but that shouldn't change depending on whether or not numpy is imported. We're using ptrace to suspend the process to get a consistent snapshot on linux, and it looks like that is what is failing.
Can you use strace or gdb on the process? They both also use ptrace on the process, and I'm hoping that they also fail in the same way. I'm assuming that you've set the docker container up for ptrace by adding the SYS_PTRACE capability, or it wouldn't work without numpy, but you can verify this with
Also, can you profile from the host os? (Rather than profiling inside the container, figure out the PID running on the host os and then pass that to py-spy.)
There are some more suggestions for ptrace permissions errors here if any of these apply: https://github.com/uber/pyflame/blob/master/docs/faq.rst#what-are-these-ptrace-permissions-errors
I'm really not sure why this isn't working right now though =( |
Yes, the docker container is set up with I cannot run the process in the container and py-spy on the host, because I don't have root rights on the host (results in What I can do is run both on the host (no docker involved) and I've already posted results for that above in my second log post (that you might have missed because of the excessive length). If I run the script and py-spy on the host (note: no
I think it's still mysterious why the |
Sorry, didn't mean to close (but you can if you want to) |
Now back in the container: |
And |
For what it's worth, I ran into this issue as well on my Linux box at home running Linux Mint 18, an Ubuntu 16.04 derivative. This is very fun and exciting in my opinion. Using |
I'm looking into this a bit. I've figured out that the reason this is happening is that stopping some of the background threads that numpy starts up is not working. I believe these are threads that are started by whatever linear algebra library numpy uses, but I could be wrong about that. I'm going to continue looking into this to figure out what's going on. The line where it fails is nix::sys::ptrace::attach(tid)?. I've tried running as root and no dice; maybe there's a knob in sysctl I need to mess with. |
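For anyone who wants to see which threads a profiler would have to attach to, here is a minimal sketch that lists the thread IDs of a process by reading /proc/<pid>/task on linux. The helper name list_thread_ids and the proc_root parameter are made up for illustration; this is not how py-spy itself enumerates threads.

```python
import os


def list_thread_ids(pid, proc_root="/proc"):
    """Return the sorted thread IDs (TIDs) of `pid`.

    On linux each thread of a process appears as a numeric directory
    under <proc_root>/<pid>/task. `proc_root` is a hypothetical
    parameter that only exists to make the helper easy to test.
    """
    task_dir = os.path.join(proc_root, str(pid), "task")
    return sorted(int(name) for name in os.listdir(task_dir) if name.isdigit())
```

Calling list_thread_ids(os.getpid()) in an interpreter before and after importing numpy should show whether the import started extra threads on a given machine.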
This happens with the simple multithreaded program below as well:
import threading
import time

def do_nothing():
    while True:
        time.sleep(1)

threading.Thread(target=do_nothing, daemon=True).start()
while True:
    time.sleep(1)
Running with this command fails, whether
|
Hopefully I'll have some time to look at this some more tomorrow. I'm wondering if some flags to the
|
Thanks for looking into this @codypiersall ! Unfortunately - I still can't replicate, even running the same test script as you =(
I don't think it's the sysctl variable (running as root would fix otherwise). Can you check if anything else is attached to the process? Running |
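One way to check whether something else is already attached, sketched under the assumption of a linux /proc filesystem: the TracerPid field in /proc/<pid>/status holds the PID of the tracing process, or 0 if there is none. The function names below are made up for illustration.

```python
def tracer_pid(status_text):
    """Parse the TracerPid field out of the text of /proc/<pid>/status.

    Returns the tracer's PID (0 means nothing is attached), or None if
    the field is missing from the text.
    """
    for line in status_text.splitlines():
        if line.startswith("TracerPid:"):
            return int(line.split(":", 1)[1].strip())
    return None


def read_tracer_pid(pid):
    """Read and parse /proc/<pid>/status for a live process (linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return tracer_pid(status_file.read())
```

A non-zero result from read_tracer_pid(pid) would mean another tracer (gdb, strace, a previous profiler run) already holds the process, and only one tracer can be attached to a thread at a time.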
I'll try this with Python 3.7 when I get home from work today; I think I was using 3.6 when I was doing these tests. It's very interesting that it's working on your computer but not mine! |
Digging into this a bit more, I think this may be a bug in my kernel version. After a circuitous trek through the internet, I found myself looking at the kernel changelogs for Ubuntu. On this page is this:
Anyway, I'm wondering whether this isn't a bug in py-spy at all, but in my kernel. I'm running 4.4.0, and I'm going to upgrade it right now.
|
Well, whadayaknow, it certainly seems like it was a bug in my kernel.
|
Maybe py-spy should detect when running on kernels <= 4.04, and not attempt to stop the threads? Maybe issue a warning? That way we can blame the user when stuff goes wrong, and they'll feel dumb when we say "why didn't you read the warning?" (Just kidding of course! Just pointing out that my idea has its own problems; the warning potentially wouldn't be very effective.) Another approach might be to send If we can come to an approach that sounds good I'd be happy to put together a PR for it. |
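A rough sketch of what such a kernel check could look like, written in Python for brevity rather than in py-spy's own Rust. The 4.7 cutoff is taken from the issue title; the function names are hypothetical, and this is not the approach py-spy ended up shipping.

```python
import platform
import re
import warnings

# Cutoff taken from the issue title ("older than v4.7").
MIN_KERNEL = (4, 7)


def parse_kernel_release(release):
    """Turn a release string like '4.4.0-116-generic' into (4, 4)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def kernel_too_old(release, minimum=MIN_KERNEL):
    """Return True if `release` parses to a version below `minimum`."""
    version = parse_kernel_release(release)
    return version is not None and version < minimum


def warn_if_kernel_too_old():
    """Emit a warning when running on a linux kernel older than the cutoff."""
    release = platform.release()
    if platform.system() == "Linux" and kernel_too_old(release):
        warnings.warn(
            f"kernel {release} is older than "
            f"{MIN_KERNEL[0]}.{MIN_KERNEL[1]}; "
            "attaching to multithreaded processes may fail"
        )
```

Unparseable release strings are treated as "not too old" here so that the check never blocks profiling on an unusual kernel; whether that is the right default is a judgment call.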
I see a similar problem.
|
This fix seems to resolve: benfred/remoteprocess#2 . On linux kernels older than v4.7, py-spy couldn't ptrace attach to multithreaded programs. Importing numpy spawns a thread, explaining the weird behaviour reported here. Fix will be in the next version |
Fix is in 0.3.1 |
Profiling this works fine:
But when I run py-spy on this, it no longer shows any output (clears the screen, then stays empty; no top view):
Any ideas?