Skip to content
This repository has been archived by the owner on Aug 2, 2023. It is now read-only.

Debugging pytests Using Subprocesses Fails #1548

Closed
schanzel opened this issue Jun 28, 2019 · 25 comments
Closed

Debugging pytests Using Subprocesses Fails #1548

schanzel opened this issue Jun 28, 2019 · 25 comments

Comments

@schanzel
Copy link

Environment data

  • VS Code version: 1.35.1
  • Extension version: 1.35.1
  • OS and version: macOS 10.14.5
  • Python version: 3.7.3
  • Relevant/affected Python packages and their versions: pytest 4.6.3
  • Jedi or Language Server? Jedi

Expected behaviour

Test results when running with debugger are identical to the results from a run without debugger.

Actual behaviour

Tests fail because subprocess can not be successfully executed when running tests with the debugger attached.

Steps to reproduce:

  1. Write a simple pytest (haven't tested with others) using subprocesses and it's shell argument set to True
    import subprocess
    def test_subprocess():
        result = subprocess.run(
            ' '.join(['echo', 'hello']),
            shell=True
        )
        assert result.returncode == 0
  2. Run that test using the test explorer without debugging
  3. Result is green
  4. Run the same test again with debugger
  5. Result is red because returncode is 2 (also stdout and stderr are empty)

Logs

Output for Python in the Output panel (ViewOutput, change the drop-down the upper-right of the Output panel to Python)

none

Output from Console under the Developer Tools panel (toggle Developer Tools on under Help; turn on source maps to make any tracebacks be useful by running Enable source map support for extension debugging)

none
@schanzel
Copy link
Author

schanzel commented Jul 1, 2019

Or do you guys think this would better be reported in microsoft/ptvsd?

@DonJayamanne DonJayamanne transferred this issue from microsoft/vscode-python Jul 1, 2019
@karthiknadig
Copy link
Member

@int19h
Copy link
Contributor

int19h commented Jul 1, 2019

Note that this is using subprocess directly, not multiprocessing. So there is no issue with forks etc.

I suspect this might have something to do with shell=True. @fabioz, do you recall if there are any issues propagating the child process exit code in that scenario, when process creation is detoured by pydevd?

@schanzel
Copy link
Author

schanzel commented Jul 2, 2019

@schanzel Try these instructions and see if they help. https://github.com/microsoft/ptvsd/blob/master/TROUBLESHOOTING.md#1-multiprocessing-on-linuxmac

Adding

import multiprocessing
multiprocessing.set_start_method('spawn', True)

did not help out, got the same result.

@fabioz
Copy link
Contributor

fabioz commented Jul 2, 2019

If you don't use the test explorer and just a debug run with the code below, do you still have the error?

import subprocess
assert subprocess.run('echo hello', shell=True).returncode == 0

As a note, on how this happens in mac: it actually does a _posixsubprocess.fork_exec, which should be monke-patched to call pydevd_monkey.create_fork_exec, but it shouldn't do anything in that case because it's not really calling python, only echo (so, if things worked as intended, nothing should really change... and as far as I know, the fork/exec approach should work without any problems even if there were threads in the previous running in the previous executable).

I've tested things here and I can reproduce it using the test runner (with a regular test run and with a debug test run) and I can't reproduce it without the test running (in a regular run), so, it seems like the issue isn't really in the debugger but in the test runner.

@schanzel can you confirm if that's also the case for you?

@fabioz
Copy link
Contributor

fabioz commented Jul 2, 2019

Actually, I'm sorry, I can't reproduce it with a regular test, just with a debug test (but it needs both the debug enabled and the test run).

Also, if shell=True is not passed and it's called as subprocess.run(['echo', 'hello']) it seems to work for me in debug mode (so, it really seems related to shell=True + test run + debugger).

@schanzel
Copy link
Author

schanzel commented Jul 9, 2019

@fabioz yes, this is exactly the combination in which the issue ocures.

@schanzel
Copy link
Author

@karthiknadig any news here yet?

@karthiknadig
Copy link
Member

@fabioz Any ideas on what needs to be done here?

@fabioz
Copy link
Contributor

fabioz commented Jul 11, 2019

@karthiknadig not really... this needs further investigation and I think it will take a while to properly debug why this may be happening -- I think it's more related to shell=True than the debugger or the test runner per se as in this case the monkey-patching related to the debugger isn't really taking place, although it could really be something in the environment or something the system isn't liking when doing the initial fork/exec due to the debugger and the test runner being there being there, but these are just far-fetched ideas, I think this will take a lot of experimentation to discover the actual culprit (my initial investigation just showed that things are apparently working as intended on the debugger and no mokey-patching is taking place, so, this needs to be planned for and would also need a deeper investigation on the test runner side as it just fails when both of those are used and shell=True -- my estimate is around 5 days).

p.s.: I've actually seen some strange interactions when shell=True already, so, my usual recommendation is using shell=False -- also see: https://stackoverflow.com/a/3172488/110451.

@awichmann-mintel
Copy link

awichmann-mintel commented Jul 11, 2019

FWIW, I took out shell=True (and switched around the first arg to be a list), and it solved my problem.

@schanzel
Copy link
Author

schanzel commented Jul 12, 2019

@awichmann-mintel yes that solves the issue, but interesting enough that the test results differ when debugging vs. a non-debug run.

Besides, I found another case in which debugging test runs leads to different results, even when the shell argument is False.
I have a use case here where we are analyzing metrics captured by sysstat and need to process them with sysstat's sadf tool.

apt-get install sysstat

Please consider the following example below.
Running this in a regular, non-debug test run and assuming the /tmp/test file is not existing, sadf reports a return code of 2. If it exists it would get converted and the subprocess would return 0.
While debugging, however, the call to sadf prints it's usage information into stderr and returns with 1 in either case, like no arguments where given to it whatsoever.

def test_sadf_output():
    result = subprocess.run(
        ["sadf", "-c", "/tmp/test"],
    )
    assert result.returncode == 0

Do you think this might be related?

@fabioz
Copy link
Contributor

fabioz commented Jul 12, 2019

I think that yes, it's possible that it's related...

One possibility is that the environment is not correct.

Can you run the following and attach the output in the case that it works and in the case that it doesn't work to see if it matches?

import os
import subprocess
env_copy = os.environ.copy()
for key, val in env_copy.items():
    
    print('%s (%s, %s)' % (key, type(key), type(val)))
    for v in val.split(os.pathsep):
        print('    %s' % (v,))
result = subprocess.run(["sadf", "-c", "/tmp/test"], env=env_copy)
assert result.returncode == 0

Another thing to check: if you pass the full path to sadf, does it work?

@schanzel
Copy link
Author

schanzel commented Jul 12, 2019

Sure, here you go!

I don't see any significant difference here, do you?

Failure (Debug) Case

platform linux -- Python 3.7.3, pytest-4.6.3, py-1.8.0, pluggy-0.12.0 -- /usr/local/bin/python
cachedir: .pytest_cache
rootdir: /workspaces/subprocess-test
collecting ... collected 1 item

tests/test_demo.py:: test_sadf_output FAILED                         [100%]

=================================== FAILURES ===================================
____________________________ test_sadf_output _____________________________

    def test_sadf_output():
        import os
        import subprocess
        env_copy = os.environ.copy()
        for key, val in env_copy.items():

            print('%s (%s, %s)' % (key, type(key), type(val)))
            for v in val.split(os.pathsep):
                print('    %s' % (v,))
        result = subprocess.run(["/usr/bin/sadf", "-c", "/tmp/test"], env=env_copy)
>       assert result.returncode == 0
E       assert 1 == 0
E         -1
E         +0

tests/test_demo.py:21: AssertionError
----------------------------- Captured stdout call -----------------------------
HTTPS_PROXY (<class 'str'>, <class 'str'>)
    http
    //host.docker.internal
    3128
HOSTNAME (<class 'str'>, <class 'str'>)
    f313a1f71768
PYTHON_PIP_VERSION (<class 'str'>, <class 'str'>)
    19.1.1
HOME (<class 'str'>, <class 'str'>)
    /root
GPG_KEY (<class 'str'>, <class 'str'>)
    ...
https_proxy (<class 'str'>, <class 'str'>)
    http
    //host.docker.internal
    3128
http_proxy (<class 'str'>, <class 'str'>)
    http
    //host.docker.internal
    3128
VSCODE_AGENT_FOLDER (<class 'str'>, <class 'str'>)
    /root/.vscode-server
LANG (<class 'str'>, <class 'str'>)
    C.UTF-8
SHELL (<class 'str'>, <class 'str'>)
    /bin/bash
PYTHON_VERSION (<class 'str'>, <class 'str'>)
    3.7.3
PWD (<class 'str'>, <class 'str'>)
    /root/.vscode-server/bin/2213894ea0415ee8c85c5eea0d0ff81ecc191529
HTTP_PROXY (<class 'str'>, <class 'str'>)
    http
    //host.docker.internal
    3128
VSCODE_INJECT_NODE_MODULE_LOOKUP_PATH (<class 'str'>, <class 'str'>)
    /root/.vscode-server/bin/2213894ea0415ee8c85c5eea0d0ff81ecc191529/remote/node_modules
VSCODE_LOGS (<class 'str'>, <class 'str'>)
    /root/.vscode-server/data/logs/20190712T120648
AMD_ENTRYPOINT (<class 'str'>, <class 'str'>)
    vs/server/remoteExtensionHostProcess
PIPE_LOGGING (<class 'str'>, <class 'str'>)
    true
VERBOSE_LOGGING (<class 'str'>, <class 'str'>)
    true
VSCODE_EXTHOST_WILL_SEND_SOCKET (<class 'str'>, <class 'str'>)
    true
VSCODE_HANDLES_UNCAUGHT_ERRORS (<class 'str'>, <class 'str'>)
    true
VSCODE_LOG_STACK (<class 'str'>, <class 'str'>)
    false
VSCODE_NLS_CONFIG (<class 'str'>, <class 'str'>)
    {"locale"
    "en","availableLanguages"
    {}}
VSCODE_IPC_HOOK_CLI (<class 'str'>, <class 'str'>)
    /tmp/vscode-ipc-cc828c52-9580-41d0-91db-cccbfa7516eb.sock
APPLICATION_INSIGHTS_NO_DIAGNOSTIC_CHANNEL (<class 'str'>, <class 'str'>)
    true
PATH (<class 'str'>, <class 'str'>)
    /root/.vscode-server/bin/2213894ea0415ee8c85c5eea0d0ff81ecc191529/bin
    /usr/local/bin
    /usr/local/sbin
    /usr/local/bin
    /usr/sbin
    /usr/bin
    /sbin
    /bin
PYTHONIOENCODING (<class 'str'>, <class 'str'>)
    UTF-8
PYTHONUNBUFFERED (<class 'str'>, <class 'str'>)
    1
PYTEST_CURRENT_TEST (<class 'str'>, <class 'str'>)
    tests/test_demo.py
    
    test_sadf_output (call)
----------------------------- Captured stderr call -----------------------------
Usage: /usr/bin/sadf [ options ] [ <interval> [ <count> ] ] [ <datafile> | -[0-9]+ ]
Options are:
[ -C ] [ -c | -d | -g | -j | -p | -x ] [ -H ] [ -h ] [ -T | -t | -U ] [ -V ]
[ -O <opts> [,...] ] [ -P { <cpu> [,...] | ALL } ]
[ -s [ <hh:mm[:ss]> ] ] [ -e [ <hh:mm[:ss]> ] ]
[ -- <sar_options> ]
--------------- generated xml file: /tmp/tmp-153gsTbmMj748i5.xml ---------------
=========================== 1 failed in 0.21 seconds ===========================

"Success" (non-debug) Case

platform linux -- Python 3.7.3, pytest-4.6.3, py-1.8.0, pluggy-0.12.0
rootdir: /workspaces/subprocess-test
collected 3 items / 2 deselected / 1 selected                                                                                                                                                                   

tests/test_demo.py F                                                                                                                                                                                      [100%]

=================================================================================================== FAILURES ====================================================================================================
_____________________________________________________________________________________________ test_subprocess_check _____________________________________________________________________________________________

    def test_sadf_output():
        import os
        import subprocess
        env_copy = os.environ.copy()
        for key, val in env_copy.items():
    
            print('%s (%s, %s)' % (key, type(key), type(val)))
            for v in val.split(os.pathsep):
                print('    %s' % (v,))
        result = subprocess.run(["/usr/bin/sadf", "-c", "/tmp/test"], env=env_copy)
>       assert result.returncode == 0
E       AssertionError: assert 2 == 0
E        +  where 2 = CompletedProcess(args=['/usr/bin/sadf', '-c', '/tmp/test'], returncode=2).returncode

tests/test_demo.py:21: AssertionError
--------------------------------------------------------------------------------------------- Captured stdout call ----------------------------------------------------------------------------------------------
HTTP_PROXY (<class 'str'>, <class 'str'>)
    http
    //host.docker.internal
    3128
LANG (<class 'str'>, <class 'str'>)
    en_US.UTF-8
HOSTNAME (<class 'str'>, <class 'str'>)
    f313a1f71768
AMD_ENTRYPOINT (<class 'str'>, <class 'str'>)
    vs/server/remoteExtensionHostProcess
GPG_KEY (<class 'str'>, <class 'str'>)
    ...
APPLICATION_INSIGHTS_NO_DIAGNOSTIC_CHANNEL (<class 'str'>, <class 'str'>)
    true
PWD (<class 'str'>, <class 'str'>)
    /workspaces/remote-container-test
HOME (<class 'str'>, <class 'str'>)
    /root
TERM_PROGRAM (<class 'str'>, <class 'str'>)
    vscode
TERM_PROGRAM_VERSION (<class 'str'>, <class 'str'>)
    1.36.1
https_proxy (<class 'str'>, <class 'str'>)
    http
    //host.docker.internal
    3128
http_proxy (<class 'str'>, <class 'str'>)
    http
    //host.docker.internal
    3128
VSCODE_IPC_HOOK_CLI (<class 'str'>, <class 'str'>)
    /tmp/vscode-ipc-cc828c52-9580-41d0-91db-cccbfa7516eb.sock
HTTPS_PROXY (<class 'str'>, <class 'str'>)
    http
    //host.docker.internal
    3128
TERM (<class 'str'>, <class 'str'>)
    xterm-256color
SHELL (<class 'str'>, <class 'str'>)
    /bin/bash
PYTHON_VERSION (<class 'str'>, <class 'str'>)
    3.7.3
SHLVL (<class 'str'>, <class 'str'>)
    1
PIPE_LOGGING (<class 'str'>, <class 'str'>)
    true
PATH (<class 'str'>, <class 'str'>)
    /root/.vscode-server/bin/2213894ea0415ee8c85c5eea0d0ff81ecc191529/bin
    /usr/local/bin
    /usr/local/sbin
    /usr/local/bin
    /usr/sbin
    /usr/bin
    /sbin
    /bin
PYTHON_PIP_VERSION (<class 'str'>, <class 'str'>)
    19.1.1
VERBOSE_LOGGING (<class 'str'>, <class 'str'>)
    true
_ (<class 'str'>, <class 'str'>)
    /usr/local/bin/pytest
LC_CTYPE (<class 'str'>, <class 'str'>)
    C.UTF-8
PYTEST_CURRENT_TEST (<class 'str'>, <class 'str'>)
    tests/test_demo.py
    
    test_sadf_output (call)
--------------------------------------------------------------------------------------------- Captured stderr call ----------------------------------------------------------------------------------------------
Cannot open /tmp/test: No such file or directory

@fabioz
Copy link
Contributor

fabioz commented Jul 12, 2019

Can you also provide it in the case that it does work?

@fabioz
Copy link
Contributor

fabioz commented Jul 12, 2019

Also, if you run that without the test running (in a __main__ module), but with the debugger, does it work?

@schanzel
Copy link
Author

Also, if you run that without the test running (in a __main__ module), but with the debugger, does it work?

Yes, running that with a debugger in a __main__ module, outside of a test context works just fine.

Can you also provide it in the case that it does work?

I attached the output to my previous comment.

@fabioz
Copy link
Contributor

fabioz commented Jul 12, 2019

Well, there are many differences actually... (i.e.: TERM, LANG, PWD, LC_CTYPE, PYTHONIOENCODING, etc.).

It may be interesting to create a copy of the env where it works and use it in the case where it doesn't work to see if it makes a difference.

You can just do a print(repr(environ)) in the case that it works and then pass that as a parameter in the case where it doesn't work.

@schanzel
Copy link
Author

schanzel commented Jul 12, 2019

Tried that but got the exact same result (return code 1, usage information printed to stderr).

env = {
    'HTTP_PROXY': 'http://host.docker.internal:3128',
    'LANG': 'en_US.UTF-8',
    'HOSTNAME': 'b7118041f6b2',
    'AMD_ENTRYPOINT': 'vs/server/remoteExtensionHostProcess',
    'APPLICATION_INSIGHTS_NO_DIAGNOSTIC_CHANNEL': 'true',
    'PWD': '/workspaces/remote-container-test',
    'HOME': '/root',
    'TERM_PROGRAM': 'vscode',
    'TERM_PROGRAM_VERSION': '1.36.1',
[...]
    'PYTHONIOENCODING': 'UTF-8',
    'PYTHONUNBUFFERED': '1',
    'LC_CTYPE': 'C.UTF-8'
}

def test_sadf_output():
    import os
    import subprocess

    result = subprocess.run(["sadf", "-c", "/tmp/test"], env=env)
    assert result.returncode == 0

@fabioz
Copy link
Contributor

fabioz commented Jul 12, 2019

Can you try one more thing there? In pydev_monkey.py, can you put a return as the first statement in the functions: patch_new_process_functions and patch_new_process_functions_with_warning to see if it makes a difference?

To find where that module is in your case you should be able to do:

from _pydev_bundle import pydev_monkey
print(pydev_mokey)

@schanzel
Copy link
Author

That seems to actually help! With these changes I get the correct result using a debug test run.

@fabioz
Copy link
Contributor

fabioz commented Jul 12, 2019

I'll have to investigate the test runner... it's possible that it also has some patching that's conflicting with the debugger patching (as that same code works when the test runner is not being used).

@schanzel
Copy link
Author

As a side note: The shell=True case also works correctly with these early returns.

@fabioz
Copy link
Contributor

fabioz commented Jul 12, 2019

I found the culprit... the problem is that ptvsd did a different patching for quoting args for the multiprocessing support than the one in pydevd and it applied the patching without checking if it was actually spawning a new python process.

The difference from the test running is that the test runner always enables multiprocess in the flags, whereas in the regular run it's not enabled (in which case the faulty code didn't actually execute) and shell=True changes things so that the command line itself was different, which is why it could get to that condition or not (the patching was dependent on what was passed in the command line).

I'll provide a fix shortly.

@schanzel
Copy link
Author

@fabioz Hey thats awesome, thanks for the fast help!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants