Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In dependent project, ak.layout fails to convert to ak::Content. Update pybind11! #483

Closed
raymondEhlers opened this issue Oct 6, 2020 · 19 comments · Fixed by #485
Closed
Labels
bug (unverified) The problem described would be a bug, but needs to be triaged

Comments

@raymondEhlers
Copy link
Contributor

raymondEhlers commented Oct 6, 2020

I've been working on some fastjet bindings with an awkward1 interface based on pybind11 (I'm aware of pyjet and the SWIG bindings, but for various reasons, this was a better approach for me. I'm not linking them here because they're early and ugly and not really ready for anyone else, but the repo can be found on my profile if it's somehow helpful). I have an early version working on macOS, but when I moved to linux (ubuntu 18.04 for what it's worth, but I'm working in a virtualenv), it throws a TypeError whenever I try to pass an array layout to c++. For a trivial function which just returns what it receives, on linux I get

Python 3.6.9 (default, Jul 17 2020, 12:50:27)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import awkward1 as ak
>>> import pybind11_awkward as pba
>>> pba.awkward_test(ak.Array([1,2,3]).layout)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: awkward_test(): incompatible function arguments. The following argument types are supported:
    1. (arr: awkward::Content) -> awkward::Content

Invoked with: <NumpyArray format="l" shape="3" data="1 2 3" at="0x000001be97b0"/>

On macOS, I get:

Python 3.8.5 (default, Jul 21 2020, 10:48:26)
[Clang 11.0.3 (clang-1103.0.32.62)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import awkward1 as ak
>>> import pybind11_awkward as pba
>>> pba.awkward_test(ak.Array([1,2,3]).layout)
In function
<NumpyArray format="l" shape="3" data="1 2 3" at="0x7ff17b283a00"/>

Other functions seem to work fine - it's apparently something about passing awkward arrays. I'm using awkward 0.3.1.

I'm a bit stuck on how to proceed. I recognize there's a huge phase space where I could have gone wrong, so I made a quick repository that reproduces the issue, and I tried to keep as much consistent as possible. It's available here. You can install it with poetry install, and then run test_simple.py. (In principle, it should work with pip using PEP 517, but my build script isn't quite right, sorry). I'm not 100% sure that this is actually an awkward1 issue (there are still some differences in the setups), but it seems like a reasonable place to start.

As always, help or suggestions are greatly appreciated! Thanks!

@raymondEhlers raymondEhlers added the bug (unverified) The problem described would be a bug, but needs to be triaged label Oct 6, 2020
@jpivarski
Copy link
Member

@veprbl Is this similar or different from what you were working on? (Just asking, because I remember you were linking into Awkward with MacOS.)

@veprbl
Copy link
Contributor

veprbl commented Oct 7, 2020

@raymondEhlers Does it work if your simple function accepts an ak::NumpyArray?

@raymondEhlers
Copy link
Contributor Author

Hi @veprbl , I tried with (I dropped the original awkwardTest function here for clarity):

std::shared_ptr<ak::NumpyArray> awkwardTestNumpyArray(const std::shared_ptr<ak::NumpyArray> & arr)
{
    std::cout << "In function for NumpyArray\n";
    return arr;
}

// Bindings
PYBIND11_MODULE(_src, m) {
  // Awkward array
  // Ensure dependencies are loaded.
  py::module::import("awkward1");

  m.def("awkward_test_numpy_array", &awkwardTestNumpyArray, "arr"_a, "...");
}

and called it with:

pba.awkward_test_numpy_array(ak.Array([1,2,3]).layout)

it works on macOS, but unfortunately it still fails on linux

@veprbl
Copy link
Contributor

veprbl commented Oct 7, 2020

Seems like this could be a variant pybind/pybind11#912, which should not be an issue on gcc/stdlibc++. Do you have an ability to recompile with a custom pybind? I would try removing this clause https://github.com/pybind/pybind11/blob/00edc3001bb10d6e6d4669e385ee43996e674b08/include/pybind11/detail/internals.h#L56-L60 and keeping the #else one instead.

@raymondEhlers
Copy link
Contributor Author

One more follow up: I was concerned that my reproducer repository was somehow the source of the issue, so I tried running the dependent-project from this repository. (I built my package based on it as an example, so they're not entirely independent). It took a little hacking to get running (kernels moved, some json issues), but in any case, I get the same issue

@raymondEhlers
Copy link
Contributor Author

Thanks for the suggestion! Yeah, I can recompile with a custom pybind11. To be clear, I need the same version of pybind11 for awkward1 and my package, right? So I need to modify it for both and recompile them? Or just for my package?

@raymondEhlers
Copy link
Contributor Author

After modifying pybind11 for both awkward1 and my package, and then recompiling everything, it works! (I first tried the lazy approach of just modifying pybind11 for my package, but it didn't work - in retrospect, not super surprising). Thanks for your help! How do we proceed from here?

@veprbl
Copy link
Contributor

veprbl commented Oct 7, 2020

@raymondEhlers I would report to the pybind11 upstream. I suppose this involves preparing an MRE that doesn't involve awkward1.

@raymondEhlers
Copy link
Contributor Author

Okay, I see. I'm not sure I quite understand it enough to reproduce without awkward yet, but I'll read the code more closely and try the linked pybind11 issue and see what I can come up with. Thanks!

@raymondEhlers
Copy link
Contributor Author

I looked into this a bit, but following the example in that pybind11 issue, I wasn't able to reproduce it without awkward1. Getting into controlling the exporting of symbols, etc, is a bit out of my depths, so I'm sure I'm overlooking something. However, as a simpler solution, I tried updating to pybind 2.5.0, and it seems to work correctly.

I'd like to run a few more tests to be certain that I've 100% cleaned up the old artifacts and that it's truly working, but so far so good. @jpivarski would updating pybind11 to 2.5.0 be possible?

@jpivarski
Copy link
Member

@jpivarski would updating pybind11 to 2.5.0 be possible?

Absolutely. Is that the latest version? I can update it to whatever is necessary, as long as I don't run into errors in CI (which would have to be addressed anyway—newer versions were supposed to be better, so if Awkward can't be compiled against it, that's something wrong with Awkward...)

Does updating pybind11 completely solve the problem? Everything's fine after that?

The issue you faced, is that something that could be added to the dependent project test? The dependent project runs in CI, so if this is an issue that wasn't fully treated before, I'd like to add it.

Thanks for looking into this, @veprbl!

@jpivarski jpivarski changed the title In dependent project, ak.layout fails to convert to ak::Content In dependent project, ak.layout fails to convert to ak::Content. Update pybind11! Oct 7, 2020
@raymondEhlers
Copy link
Contributor Author

raymondEhlers commented Oct 7, 2020

Excellent! I just cloned everything cleanly, with the only changes being updating to pybind11 2.5.0, and it seem to work with my minimal repo linked in my initial report. So I'm fairly convinced that this resolves the issue. 2.5.0 is the latest pybind11 stable release (there's a 2.6.0 beta 1).

I'm not sure what to make of the dependent project passing in the CI, but failing for me. With pybind11 2.5, it seems to build for me now. Most of the initial build problems were due to rapidJSON (other than the main issue that I report here), so those were perhaps due to a mistake with cloning rapidJSON. I just built it now in my clean directory and virtualenv with pybind11 2.5, and the only issue I see now is that when using the CMakeLists.txt, I needed to add an init.py to the build directory to get it to be recognized as a package. So in terms of adding a test to reproduce my issue, unfortunately I'm not sure what to suggest. Passing the ak.Array back and forth as is done in the dependent project was enough to see this issue. I imagine it's some combination of compilers, libraries, and options, but I'm not sure how to isolate them without a ton of work, sorry. I'm of course open to suggestions for how to isolate it though

raymondEhlers added a commit to raymondEhlers/awkward-1.0 that referenced this issue Oct 7, 2020
@raymondEhlers
Copy link
Contributor Author

It's a trivial PR, but I opened #485 to update to pybind11 2.5.0

jpivarski pushed a commit that referenced this issue Oct 7, 2020
@jpivarski
Copy link
Member

I'm not sure what to make of the dependent project passing in the CI, but failing for me.

I don't know if you were doing deeper tests. The dependent project only compiles against Awkward and passes some data to and from its own functions. I might not have passed an interesting enough data type for it to have triggered the bug you saw.

If dependent.cpp is doing something that's obviously less than what you had in your project, we could add what you have in your project to make it a better test, for the future.

The next release of Awkward will need a new minor version number: 0.3.1 → 0.4.0 because of the dependency version change. I'll need to start keeping an Awkward version → pybind11 version map somewhere, so that dependent projects can use compatible ones.

@raymondEhlers
Copy link
Contributor Author

Sorry, I think I wasn't super clear on this point: this issue was triggered just by passing arrays from python -> cpp. It also occurred trying to compile and run dependent.cpp on our cluster. So I don't see any obvious additional functionality to add to dependent.cpp that will trigger this issue. Which makes this especially puzzling to me. In any case, if I think of a test as I keep developing my code, I'll definitely open a PR.

Sounds good - I look forward to the next release!

Thanks again to you and @veprbl for all of the help!

@jpivarski
Copy link
Member

Oh, okay—a platform-dependent error (worked in CI, failed on your system). I'm glad the new pybind11 fixes that because it would be tough to debug.

The next version will likely come out when Python 3.9 can be tested, which will be as soon as it's added to Azure's pipelines. (c.f. microsoft/azure-pipelines-tool-lib#77).

@veprbl
Copy link
Contributor

veprbl commented Oct 9, 2020

It's a bit strange that updating to 2.5.0 resolved the issue. At first glance I don't see any relevant change: pybind/pybind11@80d4524...3b1dbeb

@jpivarski
Copy link
Member

It has been a long time since it was updated, I assumed there might have been something. @henryiii is recommending another upgrade to 2.6.0, for better Python 3.9 handling, so a version of Awkward based on 2.5.0 might never get released.

(If we have to track Awkward ←→ pybind11 version members, that would be easier if it's not changed often.

@raymondEhlers
Copy link
Contributor Author

raymondEhlers commented Mar 2, 2021

Sorry to revive an old issue, but unfortunately, it seems that this issue has resurfaced after updating to the new versions of awkward (1.2.0rc2) and pybind11 (2.6.2). Just trying to pass an awkward array into a c++ function on ubuntu 18.04, I get:

========================================================================================== FAILURES ===========================================================================================
______________________________________________________________________________________ test_numpy_array _______________________________________________________________________________________

    def test_numpy_array() -> None:
>       pba.awkward_test_numpy_array(ak.Array([1,2,3]).layout)
E       TypeError: awkward_test_numpy_array(): incompatible function arguments. The following argument types are supported:
E           1. (arr: awkward::NumpyArray) -> awkward::NumpyArray
E
E       Invoked with: <NumpyArray format="l" shape="3" data="1 2 3" at="0x562381da0240"/>


tests/test_simple.py:6: TypeError
_____________________________________________________________________________________ test_awkward_array ______________________________________________________________________________________

    def test_awkward_array() -> None:
>       pba.awkward_test(ak.Array([1,2,3]).layout)
E       TypeError: awkward_test(): incompatible function arguments. The following argument types are supported:
E           1. (arr: awkward::Content) -> awkward::Content
E
E       Invoked with: <NumpyArray format="l" shape="3" data="1 2 3" at="0x562381dada40"/>


tests/test_simple.py:9: TypeError
=================================================================================== short test summary info ===================================================================================
FAILED tests/test_simple.py::test_numpy_array - TypeError: awkward_test_numpy_array(): incompatible function arguments. The following argument types are supported:
FAILED tests/test_simple.py::test_awkward_array - TypeError: awkward_test(): incompatible function arguments. The following argument types are supported:
====================================================================================== 2 failed in 0.30s ======================================================================================

The full reproducer is here: https://github.com/raymondEhlers/pybind11-awkward-array-test . It was a few c++ functions in pybind11_awkward/src/binding.cpp, and then the tests leading to the failures above are in tests. Unfortunately, it's built with poetry, but you should be able to build it with pep 517:

# in some virtualenv
pip install .
pytest tests/test_simple.py

I must be doing something trivially wrong, but I can't seem to find it. Any advice is appreciated! (I can also start a new issue)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug (unverified) The problem described would be a bug, but needs to be triaged
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants