add pyo3::experimental namespace with new owned PyAny<'py> #3205
I was likely not part of the previous discussions which happened on this topic, so please just tell me if something was discussed and decided before. One thing that feels off in the proposal is the need to have separate types |
That's a great question, and I've got reasons why that didn't work with the current state of things. I'll try to find time to write that out here tomorrow or on Friday. |
As promised... TLDR; I see three main API options, all with drawbacks, the proposal here seemed like the best compromise for now. There might be other designs I have not considered and I would be extremely excited for any proposal which is better than these three.
I played around with this kind of The issue I keep running into is related to
let py: Python<'py> = /* ... */;
// possible return type of `PyDict::new` in a world without the pool could be `GIL<'py, PyDict>`
let dict: GIL<'py, PyDict> = PyDict::new(py);
dict.set_item("foo", "bar")?;
// return type of `PyDict::get_item` would similarly be `GIL<'py, PyAny>`.
let bar: GIL<'py, PyAny> = dict.get_item("foo").expect("foo is in dict");
I think we want this snippet to compile. However, the naive
impl<T> Deref for GIL<'_, T> {
type Target = T;
fn deref(&self) -> &T {
self.0
}
}
and the problem with that is that the Replacing I also explored something like this:
impl<'py, T> Deref for GIL<'py, T> {
type Target = &'py T;
fn deref(&self) -> &&'py T {
// where self.0 is `*mut ffi::PyObject`
unsafe { std::mem::transmute(&(self.0)) }
}
}
That may seem better, because it has the
Python::with_gil(|py| {
let bare_mapping = PyDict::new(py).as_mapping();
assert_eq!(bare_mapping.get_refcnt(), 1); // Blows up with get_refcnt() as 0
});
This is because the The path forward which could work for a
pub fn as_mapping<'py>(self: &GIL<'py, Self>) -> &GIL<'py, PyMapping> {
/* tbc */
}
This is unfortunately not in stable Rust, though if we really believed this was the best option (it might be) we could consider lobbying upstream to see what work would be needed to stabilise. Even once stabilised that will be a future PyO3 design, rather than something we can use now. We can almost get the same effect as arbitrary self types if we implement all methods on
impl<'py> GIL<'py, PyDict> {
pub fn as_mapping(&self) -> &GIL<'py, PyMapping> {
/* tbc */
}
}
which compiles today and has the correct lifetime characteristics. The big downsides I see of this are:
I'm not completely against going this route, but it simply doesn't feel that nice. Maybe if we got a signal from upstream that arbitrary self types (or at least The conclusion I drew was that whichever API we go for, GIL-bound references no longer make sense once the pool is removed. I think I have three options which create workable APIs:
One solution I'm open to is that for now we implement option 1 as This obviously has increased maintenance burden, however personally I'd be willing to take that pain if it means PyO3 is more performant. I do not expect other maintainers may agree with so much additional maintenance overhead. Maybe we eventually find another API design outside these three which we like even better, and rework |
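As an illustration of the stable-Rust workaround described in the comment above (implementing methods directly on the smart pointer type rather than on the pointee), here is a minimal sketch using only the standard library. `Gil`, `Dict` and `Mapping` are toy stand-ins invented for this example; they are not PyO3 types, and the object is modelled as a plain `usize` id.

```rust
use std::marker::PhantomData;

/// Marker types standing in for `PyDict` and `PyMapping` (hypothetical).
pub struct Dict;
pub struct Mapping;

/// Toy smart pointer: an object id plus a lifetime and a type tag.
/// `repr(transparent)` gives every `Gil<'py, T>` the layout of `usize`.
#[repr(transparent)]
pub struct Gil<'py, T> {
    id: usize,
    _marker: PhantomData<(&'py (), T)>,
}

impl<'py, T> Gil<'py, T> {
    /// Available on every instantiation of the wrapper.
    pub fn id(&self) -> usize {
        self.id
    }
}

/// Methods specific to dicts live on the wrapper itself, so no
/// arbitrary self types are needed and `'py` stays visible.
impl<'py> Gil<'py, Dict> {
    pub fn new(id: usize) -> Self {
        Gil { id, _marker: PhantomData }
    }

    pub fn as_mapping(&self) -> &Gil<'py, Mapping> {
        // SAFETY: both types are `repr(transparent)` wrappers around the
        // same `usize` field and differ only in a zero-sized marker.
        unsafe { &*(self as *const Self as *const Gil<'py, Mapping>) }
    }
}

/// Returns the id seen through the dict view and through the mapping view.
pub fn demo() -> (usize, usize) {
    let dict = Gil::<Dict>::new(7);
    let mapping: &Gil<'_, Mapping> = dict.as_mapping();
    (dict.id(), mapping.id())
}
```

The cost hinted at in the comment shows up even in this toy: every method has to be written in an `impl` block for one concrete instantiation of the wrapper.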
As an example of the kind of performance differences I see on this branch, here's the benchmarks from the
This branch is noticeably better at handling the |
I agree that the
though. Is there any downside to just using the |
Excitingly, I see there is recent movement on rust-lang/rust#44874 - so maybe the arbitrary self types are not completely out of reach!
Good question. I suppose my objection is that there is no such thing as a |
One silly knot I still have in my head: The reference counting semantics of Couldn't e.g. Of course, this would probably imply a lot of repetitive code snippets like
let dict = PyDict::new(py);
let dict = dict.as_ref(py);
...
but maybe this could be handled similarly to std's
let dict = bind!(py, PyDict::new(py));
... |
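A rough sketch of how a `bind!`-style macro could collapse the two `let` statements into one, using only the standard library. Everything here is hypothetical: `Token`, `Detached`, `Bound` and `make_detached` are toy stand-ins, and the macro relies on temporary lifetime extension for `&expr` inside a struct literal in a `let` initializer, which is an assumption about how such a macro might be built rather than anything stated in the thread.

```rust
use std::marker::PhantomData;

/// Stand-in for the `Python<'py>` token (hypothetical).
#[derive(Clone, Copy)]
pub struct Token<'py>(PhantomData<&'py ()>);

/// Stand-in for an owned handle with no lifetime attached.
pub struct Detached {
    pub value: i32,
}

/// A borrowed view of a `Detached` handle together with the token.
pub struct Bound<'a, 'py> {
    pub inner: &'a Detached,
    pub py: Token<'py>,
}

impl<'a, 'py> Bound<'a, 'py> {
    pub fn value(&self) -> i32 {
        self.inner.value
    }
}

/// Not a constant expression, so the borrow below cannot be promoted
/// to a static; it has to rely on temporary lifetime extension.
pub fn make_detached(value: i32) -> Detached {
    Detached { value }
}

/// Creates the owned value as a temporary and borrows it in one step.
/// Because `&$value` sits in a struct literal that is the initializer of
/// a `let`, the temporary lives as long as the binding.
macro_rules! bind {
    ($py:expr, $value:expr) => {
        Bound { inner: &$value, py: $py }
    };
}

pub fn demo() -> i32 {
    let py = Token(PhantomData);
    // One statement instead of `let x = make(); let x = x.as_ref(py);`.
    let dict = bind!(py, make_detached(42));
    let v = dict.value();
    v
}
```

The macro only works when its result is bound directly by a `let`; used as a function argument or returned from a block, the temporary would be dropped at the end of the statement.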
Of course,
fn as_ref<'py>(&'py self, _py: Python<'py>) -> &'py T::AsRefTarget
this will usually be some lifetime |
Ok, one other probably dead-end idea: Keep the pool, but make the |
So to answer myself, this fails because we do not always thread the token through but rely on being able to produce it "out of thin air", especially via GIL-bound references, i.e. |
So, after thinking about this some, I think Taking all of this into consideration, I think I would prefer to avoid the duplication of the experimental namespaces. Personally, in light of the amount of thinking and experimenting already invested into this topic, I think that would be too cautious. Of course, there is also the additional amount of required work compared to the limited amount of available time. Hence, I would suggest that we do switch over PyO3 proper for 0.20 but add a more lengthy pre-release cycle, e.g. making 0.20-alpha.1 pre-releases and actively call for testing by the ecosystem. Hopefully, a lot of code will still compile as EDIT: I forgot "to avoid" in "prefer to avoid the duplication"... 🤦🏽 |
I suspect this would require GAT to work safely, but I also think it could be worked around using a HKTB and helper trait like e.g. here |
This is an interesting proposal, we could certainly allow plenty of time for downstream code to test prereleases. It would be a delight to get rid of the pool. I guess my main concern would be - should I suppose we could make the argument that with MSRV of 1.52 we're about 2 years behind stable Rust, so even if arbitrary self types landed in the next year it may not be for another three years until our MSRV would support them. So, if 0.20 was released without the pool, the next migration would be a few years away. |
Personally, I would actually say no. Or at least only if there is some tangible benefit for ergonomics that goes beyond getting rid of the hard to explain If arbitrary self types were available today, I think we would design our smart pointers around them. But path dependencies are a fact of life and I see no strong reason to hide that our design originates in today's Rust with all the rough edges it still has, and not the language that is probably two to three MSRV bumps away still. |
Agreed, we can revisit that (far) in the future and see what makes most sense for the API and users. I think after some reflection it's probably worth getting 0.20 out before merging these changes, and then make this the focus for 0.21 (with a long release cycle as proposed). I think if we're committing to a slow release we will probably also want to have a stable branch which will be easy to backport bugfixes onto, and I'm heavily demotivated from doing any 0.19.x patch releases what with the removal of bors and MSRV bump 😅 |
Agreed, the feature list for 0.20 seems sufficient to warrant a release and the maintenance effort for 0.19 would indeed be high. This does re-open the question of whether we want an unsafe/nightly-safe |
Given that 0.21 will probably be some effort for users to migrate to, I see no harm in adding new APIs which may only exist on 0.20 if it helps them in the period before they can invest resources to migrate. |
I was wondering how this affects pyo3/pyo3-macros-backend/src/pyclass.rs Lines 199 to 205 in 0a4806f
And updated implementations of Is there already a plan about how this would work in the future? |
This would continue to work as it does now, with the only difference of using |
A quick thought - should I write up and pin an issue (and maybe a discussion which links to the issue, or vice-versa) inviting users to give feedback about the intention of going through this API change in 0.21? I'd hope users will generally respond positively or neutrally (or mostly not at all I suppose), however if we get a lot of early pushback then that could save us a lot of pain in trying to work through this. |
Closing in favour of #3361 |
This is my leading proposal for how we might solve #1056 and remove the idea of the "pool" from the Python marker.
From my benchmarking here, this can net up to 30% wins in performance for PyO3 interactions with Python objects as well as avoid the memory growth related to the object pool. I rewrote the argument extraction logic using this new API to demonstrate the potential. Will post benchmarks below when I have a moment.
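To make the memory-growth point concrete, here is a toy model of a pool, written with the standard library only. It is a simplified assumption about the mechanism (objects are parked in a list and released only when the pool goes away), not PyO3's actual implementation; `Pool` and both functions are invented for illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Toy pool: keeps every registered object alive until the pool is dropped.
pub struct Pool {
    owned: RefCell<Vec<Rc<String>>>,
}

impl Pool {
    pub fn new() -> Self {
        Pool { owned: RefCell::new(Vec::new()) }
    }

    /// Parks the object in the pool, as a pool-backed reference would.
    pub fn register(&self, obj: Rc<String>) {
        self.owned.borrow_mut().push(obj);
    }
}

/// Creates `n` temporaries in a loop and counts how many are still alive
/// once the loop has finished but the pool has not yet been dropped.
pub fn live_after_loop_with_pool(n: usize) -> usize {
    let pool = Pool::new();
    let mut observers = Vec::new();
    for i in 0..n {
        let obj = Rc::new(i.to_string());
        observers.push(Rc::downgrade(&obj));
        pool.register(obj); // stays alive until `pool` is dropped
    }
    let live = observers.iter().filter(|w| w.upgrade().is_some()).count();
    live
}

/// Same loop with plain owned values: each one is freed per iteration.
pub fn live_after_loop_owned(n: usize) -> usize {
    let mut observers = Vec::new();
    for i in 0..n {
        let obj = Rc::new(i.to_string());
        observers.push(Rc::downgrade(&obj));
        // `obj` is dropped here, at the end of the iteration
    }
    let live = observers.iter().filter(|w| w.upgrade().is_some()).count();
    live
}
```

In this model a loop of `n` iterations leaves `n` objects alive with the pool and none with owned values, which is the shape of the growth the description refers to.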
The fundamental idea is that the gil-bound reference types like &'py PyAny which frequently reside in the pool are replaced by PyAny<'py> owned types with a lifetime attached. They provide an almost identical API with the difference being ownership semantics which can essentially be thought of as Rc.
The new type is added as pyo3::experimental::PyAny, along with accompanying implementations of PyDict and PyList. We also need to have replacements of the existing conversion traits e.g. pyo3::experimental::FromPyObject.
What I'd like to do if we like this direction is to make pyo3::experimental a public API which we state is a possible future direction of PyO3 which we are using for research, so may change fast. Ideally users can import pyo3::experimental::prelude::* and just use the "new PyO3" without much challenge. Eventually we could make this pyo3::v2::prelude::* and then go through a deprecation / removal cycle of the old stuff.
We can implement the bulk of PyO3 using the new API to learn about it and get the maximum performance speedup internally, though we have to be aware this does create quite a lot of maintenance overhead.
There are downsides of adopting this new API (this list is incomplete and we can fill this out in this PR and eventually in the docs):
PyAny<'py> lifetime being attached everywhere.
&PyAny -> Vec<&str> is unsupported because the &str needs to be owned by a Rust type. At the moment the intermediate &PyString references go in the pool.
Py<PyAny> and friends for storage without lifetimes. Py<PyAny<'static>> is nonsensical because 'static GIL lifetime is kinda meaningless.
PyDetachedAny, PyDetachedDict etc as the types to replace Py<PyAny>, Py<PyDict> etc. I don't love this but maybe it's the best choice. Also with PEP 630 as a possible thing to aim for we probably want to be making static storage less ergonomic anyway.
Also the branch is currently out-of-date and unfinished, this is posted here to invite discussion. Sorry if you are reading the code 😆
With apologies I have to dash now, will do my best to add and explain any thoughts here as questions come in.
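The Vec<&str> downside listed above comes down to ordinary borrowing: a `&str` needs an owner that outlives it. A small standard-library sketch of the two ways out, with `OwnedStr` as a hypothetical stand-in for an owned Python string handle:

```rust
/// Stand-in for an owned Python string handle (hypothetical).
pub struct OwnedStr(pub String);

impl OwnedStr {
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Option A: the caller keeps the owners alive, and the `&str`s borrow
/// from them. Without a pool, something has to play this role.
pub fn borrow_all<'a>(owners: &'a [OwnedStr]) -> Vec<&'a str> {
    owners.iter().map(OwnedStr::as_str).collect()
}

/// Option B: copy the data into Rust-owned `String`s, so nothing
/// borrows from the handles once the function returns.
pub fn copy_all(items: impl IntoIterator<Item = OwnedStr>) -> Vec<String> {
    items.into_iter().map(|s| s.as_str().to_owned()).collect()
}

pub fn demo() -> (Vec<String>, usize) {
    let owners = vec![OwnedStr("a".to_owned()), OwnedStr("b".to_owned())];
    // Borrowing works while `owners` is alive.
    let borrowed_len = borrow_all(&owners).len();
    // Copying consumes the handles and returns independent strings.
    (copy_all(owners), borrowed_len)
}
```

Option B trades an allocation per string for independence from the handles; which trade-off the real API would pick is not settled in this thread.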