binding(Python): Use Python classes for codec messages #559

HenningHolmDE · 2024-07-15T09:01:41Z

Currently, the codec messages are typeless dictionaries, e.g. {"tag": "a", "body": "Noop"}.

Using classes (e.g. Command, Greeting) for each of these messages would allow making it more clear what type the methods accept and return, even if these classes itself are constructed from dictionaries, e.g. Command({"tag": "a", "body": "Noop"}).

The text was updated successfully, but these errors were encountered:

duesee · 2024-07-15T15:54:32Z

I think this is a reasonable next step towards a pythonic API.

AzideCupric · 2024-07-21T13:29:22Z

I feel like the current Python side of type hints (pr #568) could still use a little more depth, and the internal structure is still a bit mysterious.

I'm not super familiar with Rust and PyO3, so I read the PyO3 documentation, but I couldn't find a way to export fields from a struct with a lifecycle in Rust to Python via PyO3.

The PyO3 documentation states that #pyclass(get_all, set_all) can make Python access all fields of a Rust struct, but it unfortunately doesn't work on something like:

#[pyclass(name = "Greeting", eq, get_all, set_all)] //Err: `get` and `set` with tuple struct fields require `name`
pub(crate) struct PyGreeting(pub(crate) Greeting<'static>);

So I gave it a go on my own fork, trying to convert the types in imap-types to Pydantic's model on the Python side and rebuild it back to a structure similar to the Rust side from the exported dict. Here are some attempts and tests I made.

HenningHolmDE · 2024-07-22T11:41:45Z

@AzideCupric Thank you for your input!

I feel like the current Python side of type hints (pr #568) could still use a little more depth, and the internal structure is still a bit mysterious.

Could you elaborate a bit more on this? Then we can guide further development in the right direction with respect to developer experience. Currently, we aim at providing some basic way of using the capabilities of imap-codec for Python developers while not doing a reimplementation of the library in Python.

I'm not super familiar with Rust and PyO3, so I read the PyO3 documentation, but I couldn't find a way to export fields from a struct with a lifecycle in Rust to Python via PyO3.

The PyO3 documentation states that #pyclass(get_all, set_all) can make Python access all fields of a Rust struct, but it unfortunately doesn't work on something like:
#[pyclass(name = "Greeting", eq, get_all, set_all)] //Err: `get` and `set` with tuple struct fields require `name`
pub(crate) struct PyGreeting(pub(crate) Greeting<'static>);

Indeed this seems to be one of the largest issues for improving the Python experience for the types. Because imap-types aims on allowing zero-copy implementations, the corresponding types have lifetime parameters (e.g. Greeting<'a>) for allowing the compiler to track the lifetime of the borrowed external data.
One way around this is creating newtype-like wrappers for the pyclass implementation (e.g. struct PyGreeting(Greeting<'static>)), which bind the lifetime to the remainder of the programs execution ('static). However, in this case all functionality to be available from Python has to be re-implemented (or forwarded) manually on these wrappers, because - as you pointed out - the PyO3 macros do not access the implentation of the wrapped type.
This is the reason why I started this implementation on the highest layer for the codecs' interface types (fragments, messages) only.

Going forward, it might be possible to change automatically switch the lifetime parameters of the types in imap-types to 'static when used in conjunction with PyO3, but I did not dig into that any deeper yet.

So I gave it a go on my own fork, trying to convert the types in imap-types to Pydantic's model on the Python side and rebuild it back to a structure similar to the Rust side from the exported dict. Here are some attempts and tests I made.

I really like the idea of using Pydantic to create a verifyable model. However, looking at your implementation for greeting, wouldn't this basically lead to a complete reimplementation of imap-types in Pydantic?
The basic idea would then be to be able to convert between the Pydantic representation and the one using imap-types and reuse imap-codec for encoding/decoding?

AzideCupric · 2024-07-23T08:55:36Z

Thank you for your patience.

Could you elaborate a bit more on this? Then we can guide further development in the right direction with respect to developer experience.

I am trying to implement an Email adapter for the NoneBot project. To do this, I am looking for a Python library that can effectively decode and encode IMAP protocol commands and responses. When I found imap-codec, I noticed it was preparing to release Python bindings. I tried using the current binding code but found that the decode method's output is only a dict, lacking clear internal structure and type hints.

For instance, when the server returns a response like:

* 3 EXISTS
* 2 EXPUNGE
A002 OK IDLE terminated

I would like it to parse into Python classes:

# Now: 
{'Data': {'Exists': 3}}
{'Data': {'Expunge': 2}}
{'Status': {'Tagged': {'tag': 'A002', 'body': {'kind': 'Ok', 'code': None, 'text': 'IDLE terminated'}}}}
# Want: 
Data(data=Exist(3))
Data(data=Expunge(2))
Status(status=Tagged(
    tag='A002',
    body=Body(
    	kind="Ok",
        code=None,
        text="IDLE terminated"
    )
))

This approach would allow using match-case to handle different results and benefit from type-checking tools (like pyright) for type inference.

match ResponseCodec.decode(...):
    case Status() as s:
        print(s.body.kind, s.body.text)
    case Data(data=d):
        match d:
            case Exists() as e:
                emit_event("new_email", e.exists)
            case Expunge() as e:
                emit_event("del_email", e.expunge)
            case _:
                ...
    case _:
        ...

While pure dicts can also use match-case:

match ResponseCodec.decode(...):
    case {"Status": s}:
        ...
    case {"Data": d}:
        match d:
            case {"Exists": e}:
                ...
            case {"Expunge": e}:
                ...
            case _:
                ...
    case _:
        ...

This requires manually entering key values, which has the risk of typos/structural errors, and type-checking tools can't infer the type of s in {"Status": s} and other more.

Additionally, using classes makes it easier to construct a command to pass to the encode method, which is more convenient than manually typing the whole dict and allows type-checking tools to provide assistance and checks.

cmd = {'tag': 'a002', 'body': {'Select': {'mailbox': 'Inbox'}}}
# better
cmd = Command(tag=Tag("a002"), body=Select(mailbox=Mailbox.Inbox))
encoded_cmd = CommandCodec.encode(cmd.model_bump())

However, looking at your implementation for greeting, wouldn't this basically lead to a complete reimplementation of imap-types in Pydantic?

Even if PyO3 could convert Rust structs with lifetimes into the correct Python classes, it wouldn't automatically generate the class stubs on the Python side (Python Interface (.pyi) generation and runtime inspection). So, even if #[pyclass] worked, the .pyi files would still need to be manually written. What I am doing now is akin to manually constructing .pyi files, but also supporting serialization and deserialization of the dict output from the decode method. I don't see this as a reimplementation of imap-types in Pydantic because the dict output from the decode method is already the result of imap-codec parsing, with fixed keys and structure.

The basic idea would then be to be able to convert between the Pydantic representation and the one using imap-types and reuse imap-codec for encoding/decoding?

Yes, that's correct. Pydantic only needs to reconstruct the model based on the dict output, without performing complex validation (which is already handled by imap-codec). The model is then exported and passed back to the encode method, ensuring correctness by imap-codec.

I hope this clarifies my suggestions and the reasoning behind them. I would be happy to provide more details or assistance if needed.

github-project-automation bot added this to imap-{types,codec} Jul 15, 2024

HenningHolmDE mentioned this issue Jul 15, 2024

Implement initial iteration of Python bindings for imap-codec #517

Merged

HenningHolmDE mentioned this issue Jul 21, 2024

feat: Use separate Python classes for messages #568

Merged

duesee closed this as completed in #568 Jul 23, 2024

github-project-automation bot moved this to Done in imap-{types,codec} Jul 23, 2024

duesee added this to the 12. Python bindings milestone Jul 29, 2024

AzideCupric mentioned this issue Aug 5, 2024

bindings(python): better message type #586

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

binding(Python): Use Python classes for codec messages #559

binding(Python): Use Python classes for codec messages #559

HenningHolmDE commented Jul 15, 2024

duesee commented Jul 15, 2024

AzideCupric commented Jul 21, 2024

HenningHolmDE commented Jul 22, 2024

AzideCupric commented Jul 23, 2024

binding(Python): Use Python classes for codec messages #559

binding(Python): Use Python classes for codec messages #559

Comments

HenningHolmDE commented Jul 15, 2024

duesee commented Jul 15, 2024

AzideCupric commented Jul 21, 2024

HenningHolmDE commented Jul 22, 2024

AzideCupric commented Jul 23, 2024