Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

“Windows fatal exception: access violation” on Windows using pytest #561

Closed
khaeru opened this issue Jan 11, 2020 · 12 comments
Closed

Comments

@khaeru
Copy link

khaeru commented Jan 11, 2020

At iiasa/ixmp#229 and iiasa/ixmp#247 we are trying to deal with noisy console messages from JPype that show up on Windows when running our pytest suite. See iiasa/ixmp#229 for lots of detail; the short version is that they look like:

Windows fatal exception: access violation

Current thread 0x00000864 (most recent call first):
  File "C:\Miniconda36-x64\lib\site-packages\jpype\_core.py", line 227 in startJVM
  File "C:\projects\ixmp\tests\backend\test_jdbc.py", line 18 in test_gh_229
  File "C:\Miniconda36-x64\lib\site-packages\_pytest\python.py", line 166 in pytest_pyfunc_call
  File "C:\Miniconda36-x64\lib\site-packages\pluggy\callers.py", line 187 in _multicall
  File "C:\Miniconda36-x64\lib\site-packages\pluggy\manager.py", line 81 in <lambda>

To be clear, our Java code and JPype function correctly and our test suite succeeds; the issue is that many (many…) lines of such messages clutter the pytest output, making it impossible to read.

The messages are triggered by many entry points in our Java code, but I have tried to isolate a simplest-possible case in iiasa/ixmp#247 and (in vain) to identify the cause. Details follow.

  1. As of the latest commit, the AppVeyor install script looks like this: https://github.com/iiasa/ixmp/blob/1079c8e08ec47e01819bfe0c05b430ff20a82aa4/.appveyor.yml#L27-L31
    …i.e. JPype1 master is installed by building the source directly from GitHub. Then the script ci/appveyor-gh_229.py is run.
  2. The build log is here; the output from the script is at lines 739–754. Note that the “Windows fatal exception” message does not appear.
  3. Then, the AppVeyor test script runs just one test, a new one that I have added to our pytest suite for debugging this issue: https://github.com/iiasa/ixmp/blob/1079c8e08ec47e01819bfe0c05b430ff20a82aa4/.appveyor.yml#L39-L41
    This test function is identical to appveyor-gh_229.py: https://github.com/iiasa/ixmp/blob/1079c8e08ec47e01819bfe0c05b430ff20a82aa4/tests/backend/test_jdbc.py#L10-L20
  4. The output of this pytest invocation is at lines 972–1063 of the same build log. Note that there is no exception of any sort raised; the test passes. But see lines 981–1020 for the “Windows fatal exception” message. When running the full test suite, there are dozens of similar, ~30-line messages.

At this point, I am stumped; I do not know why the messages appear in step 3/4 but not in 1/2; nor can I find any information about how to further debug this issue.

cc @zikolach

@khaeru khaeru changed the title Windows fatal exception: access violation on Windows using pytest “Windows fatal exception: access violation” on Windows using pytest Jan 11, 2020
@Thrameos
Copy link
Contributor

Thanks for the bug report. I have noticed similar errors but have as of yet no clue how to track them down. To first order I am guessing they are some problem with the shutdown sequence between tests.

I haven't really figured out if pytest is running each test file in a separate JVM with a separate process. Given the access fault I can't believe that it is actually proceeding so it must have finished the test, reported and then crashed in clean up. But it is very unclear.

I know that under nose all tests were run in the same JVM so it would proceed without any shutdowns. Perhaps there is so switch to make those a fail so that it can be captured. Assuming that it the case, then the likely way to replicate it is to write a script that executes a test manually, calls shutdown, and then attempts to access resources. I have code on all the entry points that is supposed to route those calls to exceptions, but it one is missing it could explain what is going on.

@khaeru
Copy link
Author

khaeru commented Jan 12, 2020

Thanks for the quick reply!

I am guessing they are some problem with the shutdown sequence between tests.

I haven't really figured out if pytest is running each test file in a separate JVM with a separate process. Given the access fault I can't believe that it is actually proceeding so it must have finished the test, reported and then crashed in clean up.

Okay, that is something I can investigate in a superficial way by skimming the pytest code that executes the tests/cleans up—see if it forks or anything like that. I will look when I have a moment to do so.

However, per “I can't believe that it is actually proceeding”, see this comment, in particular the messages that occur when test_default_version is run:

Current thread 0x00000a80 (most recent call first):
  File "c:\projects\ixmp\ixmp\backend\jdbc.py", line 313 in get
  File "c:\projects\ixmp\ixmp\backend\base.py", line 25 in __call__
  File "c:\projects\ixmp\ixmp\core.py", line 358 in _backend
  File "c:\projects\ixmp\ixmp\core.py", line 611 in __init__
  File "C:\projects\ixmp\tests\test_core.py", line 33 in test_default_version

That test looks like this: https://github.com/iiasa/ixmp/blob/c443e27c8e4227fef15b3461565d82f965da0561/tests/test_core.py#L32-L34

def test_default_version(test_mp):
    scen = ixmp.Scenario(test_mp, *can_args)
    assert scen.version == 2

The call line 313 in get is somewhere up the call stack from ixmp.Scenario(). But ixmp.Scenario() completes successfully and returns the correct value so that the assertion in the second line of the test passes.

To me, this implies that line 313 in get “proceeds” successfully and the JVM continues running afterwards. But again, I am ignorant of the internals.

@Thrameos
Copy link
Contributor

Sorry didn't mean to confuse you. By saying that I can't believe it is proceeding, I simply mean "Windows fatal exception" implies that something died (otherwise it isn't really fatal). Thus either the process is dying (implies pytest is forking), or a thread is dying (implies pytest is launching threads of some kind), or last possibility someone is eating a fatal error. So lets look at each possibility.

The first one, process dying on a fork, would imply the shutdown procedure as it is the least clean routine in the code and most likely to take a fall. I think this is easy to test if it is happening. Just create two test files. The first creates an object and sets a static field. The second creates the object again and checks the static. If the second gets the value the first set then there is no shutdown issue. Alternatively you can add a RuntimeException in shutdown to see if it triggers, but that may have other side effects.

The second one, thread dying on the exception, is possible but harder imagine. The expected failure pass on threads is for a thread to attempt to use a JVM resource that is outside the scope. We used to get these all the time due to unattached threads, but I fixed that be ensuring all threads attach automatically. In this scenario pytest creates a thread resource, runs the test, captures the output, then kills the thread, the thread triggers a fatal exception but as the output of the test is already captured nothing actually dies as the thread handler catches and eats the exception, the output value gets passed through and we see "PASS". Maybe this one is more likely than I give credit.

The third one, eating a fatal on the main thread, should be possible to confirm by modifying pytest at the report point to see if we can trigger capture it or letting it pass through. It would really violate the spirit of a test for an exception to get eaten so I am putting this as low likelihood.

There is one other possibility which is yanked resources through I don't see how we could trigger that one given what pytest is supposed to be doing. If you attempt to kill a resource that is currently living on the stack then that will give a fault when the return path cross it. For example, if you install a proxy and call it and the proxy calls shutdown then the proxy will hit a fatal error when the returns through the now dead JVM. (Yes I try to put guards such that the shutdown doesn't actually happen until the proxy returns, but these cases are often very hard to replicate.) Similarly if we where to take a proxy and somehow we lost a reference to an object that was on the stack we would get fatal errors on the return. The good news is usually these are really, really fatal as in segfault. I am not sure how pytest could trigger this bad state, but I can't rule it out.

I will provide whatever assistance that I can, though my plate is pretty full in the short term. I have three major pull requests on the burner right now. That is finish classhints so that people can actually test 0.8.0alpha with its 3 times speed increase, finish the coverage so that I can verify all the paths are covered in 0.7 and thus detect bad behavior in 0.8, and last rework the memory buffer transfer scheme to improve numpy integration.

@Thrameos
Copy link
Contributor

I looked at it further. The place that is causing it in the JPype test appears indeed to be something using threads. I moved the Teardown from test_proxy_multithreaded.py and it is clearly coming from threading. It is still a mystery who is eating the fatal error. I will try to decompose the test into pieces so I can isolate it further.

@khaeru
Copy link
Author

khaeru commented Jan 12, 2020

I looked at it further.

Thanks so much for the explanation and investigation! As I guessed, you zeroed in on it much faster than I possibly could.

I'll make time tomorrow to look at how pytest threads, as promised.

@Thrameos
Copy link
Contributor

Thrameos commented Jan 12, 2020

Interesting. If I copy the whole contents of the test into a project it doesn't cause any errors, but if I call the exact same code using pytest it always creates an error. Not sure how pytest triggers it.

try "python -m pytest test.py" vs "python test.py". Not sure what gives here. But if we can find a way to promote it to actually fatal I can trace it.

import jpype
from jpype import *
import time
import unittest

class C:
    def testMethodInt(self):
        return 5

    def testMethodObject(self):
        return JPackage("jpype.proxy").ReturnObject(3)

    def testMethodString(self):
        return "Hallo"

    def testMethodList(self, noOfValues):
        responses = list()
        for i in range(0, noOfValues):
            responses.append(JPackage("jpype.proxy").ReturnObject(i))
        return java.util.Arrays.asList(responses)


class ThreadCallbackImpl:
    def __init__(self):
        self.values = []

    def notifyValue(self, val):
        self.values.append(val)

class TestProxy(unittest.TestCase):
    def setUp(self):
        pass

    def testFail(self):
        jpype.startJVM(classpath="test/classes", convertStrings=False)
        package = JPackage("jpype.proxy")

        proxy = JProxy(package.TestInterface4, inst=C())
        executor = package.ProxyExecutor(1)
        executor.registerProxy(proxy, 10)
        executor.runExecutor()
        result = executor.waitForExecutedTasks()
        expected = executor.getExpectedTasks()
        print(result, expected)
        executor.shutdown()

        proxy = JProxy(package.TestInterface4, inst=C())
        executor = package.ProxyExecutor(10)
        executor.registerProxy(proxy, 100)
        executor.runExecutor()
        result = executor.waitForExecutedTasks()
        expected = executor.getExpectedTasks()
        print(result, expected, "Executed Tasks should be the same.")
        executor.shutdown()

        executor = package.ProxyExecutor(10)
        for i in range(0, 5):
            proxy = JProxy(package.TestInterface4, inst=C())
            executor.registerProxy(proxy, 15)
        executor.runExecutor()
        result = executor.waitForExecutedTasks()
        expected = executor.getExpectedTasks()
        print(result, expected, "Executed Tasks should be the same.")
        executor.shutdown()

        jpype.shutdownJVM()

if __name__ == "__main__":
    tp = TestProxy()
    tp.testFail()

@Thrameos
Copy link
Contributor

Adding this to the offending test removes the error.

       import faulthandler
       faulthandler.enable()
       faulthandler.disable()

It appears the Python sets up a handler of some kind and it can be suppressed. Pytest is enabling logging on this. If we want to resolve it we need to find a way to promote the problem to fatal so that we can capture the location of the fault.

@Thrameos
Copy link
Contributor

@Thrameos
Copy link
Contributor

Okay I have a potential theory for the origin of these messages. As noted in the developers guide Java uses Access Exceptions for its own purposes. The first thing that it does when starting up the JVM is triggering an exception to check if the hook is installed properly. But it appears possible that pytest installs its own hooks which interfere with the Java hooks. The reason nothing happens is that it goes to pytest, dumps the exception and then transfers to the regular Java handler than suppresses the whole thing. That would explain how a fatal error has no effect.

If this is the case then just adding the fault handler lines to the startup routine in the pytest suite will solve the issue.

@Thrameos
Copy link
Contributor

Okay I found a way to replicate it outside of pytest. Adding these lines to any jpype file and it will trigger the error whenever the jvm is start or a thread is interrupted.

        import faulthandler
        import sys
        faulthandler.enable(file=sys.stdout)

I believe that disabling and enabling the faulthandler will remove these harmless errors.

@khaeru
Copy link
Author

khaeru commented Jan 13, 2020

Sure enough, there was a formerly-separate pytest-faulthandler plugin that was integrated into core in pytest 5.0: https://docs.pytest.org/en/latest/usage.html#fault-handler

I will use the suggested flag to disable it and see if that resolves the issue.

@khaeru
Copy link
Author

khaeru commented Jan 13, 2020

I will use the suggested flag to disable it and see if that resolves the issue.

It seems to work! See e.g. https://ci.appveyor.com/project/iiasa/ixmp/builds/30070203 —the entire test suite runs without any occurrence of the messages. I have some additional experimentation to do on how to selectively disable this only on Windows, but beyond that I consider this issue resolved.

Please close it if you like—or leave open if there is any follow-up action in JPype docs/internals that you would like to make. Again, many thanks for the quick response.

P.S. looking at the faulthandler docs:

This module contains functions to dump Python tracebacks explicitly, on a fault, after a timeout, or on a user signal. Call faulthandler.enable() to install fault handlers for the SIGSEGV, SIGFPE, SIGABRT, SIGBUS, and SIGILL signals.

Per your comment “a fatal error has no effect,” I interpret the docs as meaning that a signal is sent, causing the Python faulthandler to print a message; but then (per your comment) Java receives the signal, decides that it is neither an error, nor fatal (because intentional)—so it happily proceeds.

ap-- added a commit to Bayer-Group/paquo that referenced this issue Jul 30, 2020
The pytest>=5 output is spammed with errors when testing code
using jpype. These errors can be ignored.

todo: disable faulthandler only on windows?

See: jpype-project/jpype#561
MrFizzyBubbs added a commit to MrFizzyBubbs/pymafia that referenced this issue Apr 7, 2023
commit fb4924486ff68ff8c424fe01364ebe39597e3a27
Author: MrFizzyBubbs <82386733+MrFizzyBubbs@users.noreply.github.com>
Date:   Thu Apr 6 13:35:40 2023 -0500

    Fix most failing property tests

commit a9b48c19118682cf5f00b7c3cd54523f2f129900
Author: MrFizzyBubbs <82386733+MrFizzyBubbs@users.noreply.github.com>
Date:   Thu Apr 6 12:46:24 2023 -0500

    Update slot.py

commit 159d4df9fb70462650f95cb16328f6e3e02b8a23
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Thu Apr 6 07:54:19 2023 -0500

    Test properties

commit c3a3819db7dff039591ef215b0ddd749945106bf
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Thu Apr 6 07:34:11 2023 -0500

    Test init default value

commit 40bcae1bb283ff99e8b7588e72e2c411104f38ba
Author: MrFizzyBubbs <82386733+MrFizzyBubbs@users.noreply.github.com>
Date:   Wed Apr 5 07:23:26 2023 -0500

    Reduce repeated data

commit 5670454800ba493e4ddda7889f9dd85508d3be52
Author: MrFizzyBubbs <82386733+MrFizzyBubbs@users.noreply.github.com>
Date:   Tue Apr 4 17:08:27 2023 -0500

    Remove Location xfail

commit 1277170c2cf83882a220c2759f8c71dcfe2c174d
Author: MrFizzyBubbs <82386733+MrFizzyBubbs@users.noreply.github.com>
Date:   Tue Apr 4 17:02:39 2023 -0500

    Use "J" prefix for java classes

commit e8aab153e3cbd40b60b0047c686a58b687523f34
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Tue Apr 4 10:51:02 2023 -0500

    Fix slot

commit d1b91145c3f23e37f8033ecca3d0c8d13cf53a0f
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Sun Apr 2 11:48:49 2023 -0500

    One assert per test

commit 1506055f28d0971ba5c9f3874090bc802a647cbc
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Sat Apr 1 19:06:50 2023 -0500

    Fix thrall

commit e3cafc5908eef0721c501cc75c9f06560fedccd3
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Sat Apr 1 09:35:09 2023 -0500

    Create test_datatypes.py

commit e81871ac9278dc2e9bf9d517b081765744913594
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Sat Apr 1 07:50:18 2023 -0500

    Update vykea.py

commit 77280b02773fd78209e4ef768cff2630be9b7176
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Sat Apr 1 07:49:24 2023 -0500

    Update black

commit b3999e7feebbd3390fecee17e5dba5443f2c6d45
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Sat Apr 1 07:32:08 2023 -0500

    Use protocol_map to monkey-patch __hash__

commit f23429e2e0bd6ebe397e1eddd7833e071e239565
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Fri Mar 31 17:36:21 2023 -0500

    Clarify mafia datatypes name

commit 288a8c87c84454ef1aab5b0d204d640aaca82c8c
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Fri Mar 31 17:22:31 2023 -0500

    Test datatype properties

commit 75e3607700981d1c02bcd42a482d39c883de1111
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Fri Mar 31 11:17:17 2023 -0500

    Include name in datatype dunder methods

commit 0655bb98e851e8a196a500e3e6e6c93b117aa32f
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Fri Mar 31 10:56:59 2023 -0500

    Create test_monster.py

commit 7fe0c03983d52f2bb330304318ad88f85f37f033
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Fri Mar 31 10:50:36 2023 -0500

    Update test_location.py

commit a2a95e32c1b0d59c19d4b667b79b9c0852f82212
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Fri Mar 31 07:13:25 2023 -0500

    Fix merged force push

commit b5c2a013b596f2469dc1fb12159e682a4d66d126
Merge: 7843874 5b6c823
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Fri Mar 31 07:10:40 2023 -0500

    Merge branch 'main' into pytest

commit 5b6c823
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Wed Mar 29 21:59:24 2023 -0500

    Use proxy to raise mafia errors

commit 784387440d83904abae0dde5642376dc6b0f53c2
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Thu Mar 30 07:29:40 2023 -0500

    Create test_location.py

commit 1ca3258a1b99dca76f7768718e09f1a5ce450c70
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Thu Mar 30 07:25:26 2023 -0500

    Create test_familiar.py

commit 379b38c59ae88ff32f27e9f39a1fcb1197b1008d
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Thu Mar 30 07:23:01 2023 -0500

    Create test_element.py

commit 4ebd380570b412ac062a983f944406bd10ff598a
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Thu Mar 30 07:17:19 2023 -0500

    Create test_effect.py

commit 428aff50a74aa16562fc87e58f740199543ba17c
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Tue Mar 28 17:04:37 2023 -0500

    Create test_coinmaster.py

commit daee2feaa39d24131a857f3454a98c565eecbdd0
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Tue Mar 28 13:20:24 2023 -0500

    Resolve Windows access violation

    See jpype-project/jpype#561

commit 20d850c9795041f117feff78d44af6585520f058
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Mon Mar 27 14:25:00 2023 -0500

    Create test_class.py

commit 4bebf32a1e625054db74f9ac35c05707c7ca2a8c
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Mon Mar 27 14:12:31 2023 -0500

    Create test_bounty.py

commit a7fabed4af1d57a2125189329f197410d7c90503
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Fri Mar 24 15:42:19 2023 -0500

    Create test_item.py

commit 99fbaa7
Author: MrFizzyBubbs <MrFizzyBubbs@protonmail.com>
Date:   Wed Mar 29 21:59:24 2023 -0500

    Use proxy to raise mafia errors
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants