"lazy" state of `PyErr` ... has challenges #4584

davidhewitt · 2024-09-27T15:50:20Z

PyErr contains some arcane trickery internally to avoid creating a Python exception object, by making some "lazy" state which boxes a PyErrArguments object.

While the original design predates me, I believe that it was done for performance to avoid creating a Python exception object inside of pure-Rust code.

I think there might be a few reasons to revisit this design.

I think the performance motivation is not clear-cut; it makes PyO3's error handling relatively expensive because every error pathway has to box up the state into a dyn Trait before then creating a Python exception.
I think we've learned that a better design is to allow APIs to be generic on the error type where relevant, i.e. in IntoPyObject (and I guess we could do the same in FromPyObject). The proc macros have for a long time allowed any error type which is convertible to PyErr, too.
I realize as I write this issue that PyErrState::normalize is not sound under the freethreaded build: it uses the GIL for synchronization.
The boxed PyErrArguments state is not traversible by the Python GC. In theory this means that the state can contain arbitrary Python objects, could therefore participate cycles and cause memory leaks. We would need to make breaking changes to the trait to fix this.

I propose that instead we remove the ability for PyErr to be lazy and make it a thin wrapper around Py<PyBaseException>. This is essentially the direction that Python went in 3.12 for the global SetRaisedException ffi calls.

It would have the effect of creating a PyErr to change performance profile; instead of creating a boxed lazy exception state it would immediately create the Python object. This might be faster, it might be slower. But I think it would simplify things which might overall improve things.

Due to point 3, i.e. that it's not threadsafe, I'm tempted to just try to land this ASAP on 0.23. cc @ngoldbaum

I could at least make a branch and see what the benchmarks say.

The text was updated successfully, but these errors were encountered:

alex · 2024-09-27T16:02:48Z

Hmm, can you say more about (3)?

…

On Fri, Sep 27, 2024 at 11:50 AM David Hewitt ***@***.***> wrote: PyErr contains some arcane trickery internally to avoid creating a Python exception object, by making some "lazy" state which boxes a PyErrArguments object. While the original design predates me, I believe that it was done for performance to avoid creating a Python exception object inside of pure-Rust code. I think there might be a few reasons to revisit this design. 1. I think the performance motivation is not clear-cut; it makes PyO3's error handling relatively expensive because every error pathway has to box up the state into a dyn Trait before then creating a Python exception. 2. I think we've learned that a better design is to allow APIs to be generic on the error type where relevant, i.e. in IntoPyObject (and I guess we could do the same in FromPyObject). The proc macros have for a long time allowed any error type which is convertible to PyErr, too. 3. I realize as I write this issue that PyErrState::normalize is not sound under the freethreaded build: it uses the GIL for synchronization. 4. The boxed PyErrArguments state is not traversible by the Python GC. In theory this means that the state can contain arbitrary Python objects, could therefore participate cycles and cause memory leaks. We would need to make breaking changes to the trait to fix this. I propose that instead we remove the ability for PyErr to be lazy and make it a thin wrapper around Py<PyBaseException>. This is essentially the direction that Python went in 3.12 for the global SetRaisedException ffi calls. It would have the effect of creating a PyErr to change performance profile; instead of creating a boxed lazy exception state it would immediately create the Python object. This might be faster, it might be slower. But I think it would simplify things which might overall improve things. Due to point 3, i.e. that it's not threadsafe, I'm tempted to just try to land this ASAP on 0.23. cc @ngoldbaum <https://github.com/ngoldbaum> I could at least make a branch and see what the benchmarks say. — Reply to this email directly, view it on GitHub <#4584>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAAAGBHKOTZPQ2TEABBWC3LZYV5FFAVCNFSM6AAAAABO7N27JKVHI2DSMVQWIX3LMV43ASLTON2WKOZSGU2TGMRVGMYDINI> . You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>

-- All that is necessary for evil to succeed is for good people to do nothing.

davidhewitt · 2024-09-27T17:19:07Z

The state is contained in an UnsafeCell, and we don't try to synchronize writes and reads because we require py: Python to e.g. PyErr::value

alex · 2024-09-27T17:25:03Z

Ah, yes, sorry the unsoundness is really in make_normalized, not normalize.

alex · 2024-09-27T17:25:33Z

This whole thing is really just a GILOnceCell, right?

davidhewitt · 2024-09-27T18:08:05Z

Not exactly, it consumes it's original state and rebuilds it in place when normalizing. You're right it's write once, though. I think the in place rebuild implies a lock if we wanted to keep the current design.

davidhewitt · 2024-09-28T05:49:09Z

I realize that one option is to leave the design as is and just replace the UnsafeCell with a mutex on the freethreaded build. But I think that might have unsatisfying perf consequences and maybe (slim) possibility of deadlocks.

That might be good enough for 0.23, however I still feel removing this lazy behaviour might be the correct long term option.

mejrs · 2024-09-28T15:59:18Z

I think we've learned that a better design is to allow APIs to be generic on the error type where relevant, i.e. in IntoPyObject (and I guess we could do the same in FromPyObject). The proc macros have for a long time allowed any error type which is convertible to PyErr, too.

however I still feel removing this lazy behaviour might be the correct long term option.

I'm considering whether we should just have two error types, one being the cheap one and the other holding the python exception object (Maybe we could represent this with a generic parameter). We can probably keep the breaking changes to a minimum by having the default parameter be the uninstantiated variant and having most of the conversions happen inside a proc macro.

davidhewitt · 2024-09-29T07:55:54Z

I'm open to this, can you maybe elaborate a bit further? I think you're right that ripping the lazy state out is maybe more breaking than I originally considered, so we should think more carefully.

I think based on @alex's observation that this is basically a GILOnceCell, I have a sketch of an idea to use a Once to block on the freethreaded build to keep things sound. I'll try to play with that soon, and maybe that's better to land to unblock 0.23 without changing existing semantics on the default builds.

davidhewitt · 2024-12-05T09:35:15Z

#4764 makes me more convinced that we need to take the "lazy" state out - while in the particular polars case there the exceptions are coming from Python and don't need to use the lazy mechanism, in principle Rust-originating exceptions which use the lazy mechanism would trigger a similar deadlock.

There's just too much going on behind the scenes in these types, and there's enough places in PyO3 now that we allow errors-convertible-to-PyErr that I'm very much in favour of removing lazy state for 0.24.

Another argument in favour as well beyond removing all the hidden complexity: this might improve performance and stack memory consumption. Because of the heavy weight of PyErr:

[src/main.rs:4:5] std::mem::size_of::<PyResult<PyObject>>() = 64
[src/main.rs:5:5] std::mem::size_of::<PyErr>() = 56
[src/main.rs:6:5] std::mem::size_of::<PyObject>() = 8

... if PyErr was just a pointer to an exception value (so also 8 bytes), I expect the size of PyResult<PyObject> would be just 16 bytes.

This was referenced Sep 27, 2024

[RFC] split PyErr::new() into multiple methods #4413

Draft

Tracking issue for no-gil/freethreaded work #4265

Open

This was referenced Oct 9, 2024

Contribution opportunity: improved conversion traits #4041

Closed

deprecate IntoPy in favor or IntoPyObject #4618

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

"lazy" state of `PyErr` ... has challenges #4584

"lazy" state of `PyErr` ... has challenges #4584

davidhewitt commented Sep 27, 2024

alex commented Sep 27, 2024 via email

davidhewitt commented Sep 27, 2024

alex commented Sep 27, 2024

alex commented Sep 27, 2024

davidhewitt commented Sep 27, 2024

davidhewitt commented Sep 28, 2024

mejrs commented Sep 28, 2024

davidhewitt commented Sep 29, 2024

davidhewitt commented Dec 5, 2024

"lazy" state of PyErr ... has challenges #4584

"lazy" state of PyErr ... has challenges #4584

Comments

davidhewitt commented Sep 27, 2024

alex commented Sep 27, 2024 via email

davidhewitt commented Sep 27, 2024

alex commented Sep 27, 2024

alex commented Sep 27, 2024

davidhewitt commented Sep 27, 2024

davidhewitt commented Sep 28, 2024

mejrs commented Sep 28, 2024

davidhewitt commented Sep 29, 2024

davidhewitt commented Dec 5, 2024

"lazy" state of `PyErr` ... has challenges #4584

"lazy" state of `PyErr` ... has challenges #4584