Hangs encountered in IDAKLUJax unit tests (test_jacrev_vmap and others) #3948
Comments
Temporary solution: wrap all classes and the methods with This resolves the |
To bring some rudimentary sense of the issue at hand, here's what I can see locally on an M-series macOS machine: Running the entire coverage suite with the JAX tests included:
And running it again, except |
I saw this on Python 3.9 as well |
@agriyakhetarpal To me it looks like the functions in there are parallel. So parallel tests with parallel code means a ton of extra threads. On my Mac those tests seem to use 4 threads each |
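The thread arithmetic in the comment above can be made concrete with a small sketch. The figure of 4 threads per test comes from that comment; the worker count and core count used below are assumptions for illustration only, and the helper names are hypothetical rather than part of PyBaMM, pytest, or JAX.

```python
import os


def total_threads(xdist_workers: int, threads_per_test: int) -> int:
    """Threads alive at once if every worker runs a multi-threaded test."""
    return xdist_workers * threads_per_test


def oversubscription(xdist_workers: int, threads_per_test: int, cpu_count=None) -> float:
    """Ratio of runnable threads to available cores (above 1.0 means contention)."""
    # Fall back to the local machine's core count when none is given.
    cpus = cpu_count or os.cpu_count() or 1
    return total_threads(xdist_workers, threads_per_test) / cpus


# Assumed example: 8 pytest-xdist workers on an 8-core machine,
# each running a test that uses 4 threads (the figure reported above).
print(total_threads(8, 4))                   # 32
print(oversubscription(8, 4, cpu_count=8))   # 4.0
```

Under these assumed numbers, each core would have roughly four threads competing for it, which is consistent with the tests slowing down in parallel mode, although it does not by itself explain why they are also slow in serial execution.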
I want to help with this issue. Please let me know if you need another hand. |
Thanks, @prady0t. So far, only the attempt in #3948 (comment) has helped, and only with one of the four tests, so we need to dig deeper: running the tests in parallel slows them down, but they are also really slow in serial execution. I think this can be tackled at a later time, since the tests pass, at least. I think you're working on #3940 with @lorenzofavaro, and that is a higher-priority issue we need to tackle at this moment :) |
Oops, I just realised that I mentioned and tagged the wrong person; I am sorry. By all means, please feel free to help out here, @cringeyburger! |
This was discussed in the GSoC meeting today for the
======================================================================= short test summary info ========================================================================
SKIPPED [1] tests/unit/test_expression_tree/test_operations/test_latexify.py:84: Only run for Linux
SKIPPED [1] tests/unit/test_solvers/test_idaklu_jax.py:91: Both IDAKLU and JAX are available
=================================================================== 1608 passed, 2 skipped in 56.63s ===================================================================
nox > Session unit was successful.
and the trio of |
I don't think I have seen it recently either |
Perfect, closing |
PyBaMM Version
develop
Python Version
3.11.8
Describe the bug
The test_jacrev_vmap test case in the TestIDAKLUJax class (present in tests/unit/test_solvers/test_idaklu_jax.py) hangs quite a lot during local development. It is one of the slowest tests to pass, to the point that coverage logging almost gets stuck indefinitely at 99%, and this test in particular takes several orders of magnitude longer to complete than the rest of the tests.
This is most likely coming from the recent migration to using pytest for running the unit tests (#3857), which also brought support for pytest-xdist for parallel execution of unit tests, where JAX-related unit tests take up a lot of time in CI in parallel mode.
Here's an SVG from a profiling sample with the pytest-profiling plugin from @prady0t earlier in the #infrastructure channel on Slack, which reveals that something is up with the JAX-related tests.
Steps to Reproduce
There isn't a better reproducer at this time, but to reproduce, one can run nox -s coverage or its pytest --cov equivalent in the root directory – it is a bit slower than nox -s unit, but both of them seem to have the same issue.
Relevant log output
No response
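The reproduction steps above run the whole suite. As a narrower check, the sketch below builds a pytest command that targets only the test file named in this issue, reports the slowest tests with pytest's built-in --durations option, and turns off parallel workers with pytest-xdist's -n 0. The helper name build_pytest_command and the default of 10 reported tests are assumptions for illustration, not something taken from the PyBaMM repository.

```python
import shlex

# Test file path as given in this issue.
TEST_FILE = "tests/unit/test_solvers/test_idaklu_jax.py"


def build_pytest_command(test_file=TEST_FILE, workers=0, slowest=10):
    """Return an argv list for timing one test file (hypothetical helper)."""
    return [
        "python", "-m", "pytest", test_file,
        f"--durations={slowest}",  # pytest option: list the N slowest tests
        "-n", str(workers),        # pytest-xdist option: 0 disables worker processes
    ]


# Print the command so it can be pasted into a shell at the repository root.
print(shlex.join(build_pytest_command()))
```

Running the printed command once with workers=0 and once with a non-zero worker count would give a rough comparison of serial and parallel timings for the JAX tests alone.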