Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PydanticV2 breaks EdgeDB's Python CodeGen #466

Closed
jsimonlane opened this issue Oct 15, 2023 · 4 comments · Fixed by #468
Closed

PydanticV2 breaks EdgeDB's Python CodeGen #466

jsimonlane opened this issue Oct 15, 2023 · 4 comments · Fixed by #468
Assignees
Labels
bug Something isn't working

Comments

@jsimonlane
Copy link

jsimonlane commented Oct 15, 2023

Summary

EdgeDB's codegen is broken, because the codegen relies on Pydantic, but hasn't been updated to work with PydanticV2.

This matters because it's one of the first things a new user using Python will encounter (as the bug rears its head when you try to set up the EdgeDB-FastAPI tutorial if you don't use an old version of PyDantic)

Detailed problems

A get_user query will generate codegen that looks something like this:

class NoPydanticValidation:
    @classmethod
    def __get_validators__(cls):
        from pydantic.dataclasses import dataclass as pydantic_dataclass
        pydantic_dataclass(cls)
        cls.__pydantic_model__.__get_validators__ = lambda: []
        return []


@dataclasses.dataclass
class GetUserByNameResult(NoPydanticValidation):
    id: uuid.UUID
    username: str


async def get_user_by_name(
    executor: edgedb.AsyncIOExecutor,
    *,
    name: str,
) -> list[GetUserByNameResult]:
    return await executor.query(
        """\
        select User {
          username
        } filter User.username = <str>$name\
        """,
        name=name,
    )

There are two issues, at the surface manifestation and an underlying issue:

Surface
cls.__pydantic_model__.__get_validators__ should be substituted with cls.__get_validators__ because the __pydantic_model__ is no longer loaded as a method during pydantic_dataclass(cls)

Unfortunately, fixing that by deleting __pydantic_model causes a different error, namely pydantic_core._pydantic_core.SchemaError: Error building "chain" validator: (guessing this is because of the empty list in the validator).

Underlying
The query function returns an EdgeDB Object... but it's validation is skipped, and it's being duck-typed as a dataclass.

This isn't easy to follow (especially the NoValidation stuff), and it causes some strange errors when Pydantic eventually tries to validate it in the FastAPI response layer.

Potential fix

In my opinion, the cleanest way to solve this would simply be to initialize a dataclass object manually, and then use the codegen to fill in the fields. I illustrate that below. However, that would yield a key design choice on the type you use for the result objects.

# Option 1.
# Unfortunately, https://github.com/pydantic/pydantic/issues/7111
@dataclasses.dataclass
class GetUserByNameResult():
    id: uuid.UUID
    username: str

# Option 2.
# Unfortunately, this isn't plug and play.
@pydantic.dataclasses.dataclass
class GetUserByNameResult():
    id: uuid.UUID
    username: str

# Option 3.
class GetUserByNameResult(pydantic.BaseModel):
    id: uuid.UUID
    username: str

Option 1 is the best abstractly, since you don't need a pydantic dependency. However it suffers from this pydantic bug, which occurs if you use it with FastAPI.

Option 2 is nice, because it circumvents the above bug (see the bug notes on this), while keeping most of the dataclass interface.

Option 3 is nice too -- and most importantly, it lets you use the model_construct method to skip validation (if you want that).

My vote would be for option 2, because it works well:

  • The Pydantic dependency isn't a big deal -- you already have it in the current version.
  • It avoids the aforementioned bug, and is linked
  • It's performant to create
  • If you want to change it to a regular dataclass at some point, it's a drop in replace.

Notably, I think with all of these options, you can get rid of the NoPydanticValidation hack... since the object would have been validated during these steps (which presumably shouldn't fail, if your codegen is correct).

Also note: PydanticV2 claims to make this verification quite cheap, but you can use BaseModel.model_construct() if you want to avoid that.


Detailed journey:

While I created the webapp described in the python tutorial, I found the CLI is erroring out. The tutorial claimed that this was due to compatibility issues with Pydantic V2 (which is partially true -- see pydantic/pydantic#7111), but there is ALSO a problem with the edge-db codegen itself.

If you look at the edgeql query:

select User {
  username
} filter User.username = <str>$name

It produces python like:

# AUTOGENERATED FROM 'app/queries/get_user_by_name.edgeql' WITH:
#     $ edgedb-py


from __future__ import annotations
import dataclasses
import edgedb
import uuid


class NoPydanticValidation:
    @classmethod
    def __get_validators__(cls):
        from pydantic.dataclasses import dataclass as pydantic_dataclass
        pydantic_dataclass(cls)
        cls.__pydantic_model__.__get_validators__ = lambda: []
        return []


@dataclasses.dataclass
class GetUserByNameResult(NoPydanticValidation):
    id: uuid.UUID
    username: str


async def get_user_by_name(
    executor: edgedb.AsyncIOExecutor,
    *,
    name: str,
) -> list[GetUserByNameResult]:
    return await executor.query(
        """\
        select User {
          username
        } filter User.username = <str>$name\
        """,
        name=name,
    )

A key problem is that cls.__pydantic_model__ doesn't exist in PydanticV2... it should be replaced with cls.__get_validators__.

However, fixing that still causes errors if you ever try to check with Pydantic (when you try and run it through FastAPI), namely:

pydantic_core._pydantic_core.SchemaError: Error building "chain" validator:
  SchemaError: One or more steps are required for a chain validator

In my opinion, the cleanest solution would look like this. Of course, you'd tweak it to work well and abstractly with codegen (or maybe use some Dataclass(**edgedb_object) shorthan, etc).

The key idea is that you'd be using the code-gen to map in the individual fields of a NEW dataclass object.

#  AUTOGENERATED FROM 'queries/get_user_by_name.edgeql' WITH:
#    $ edgedb-py --dir queries

from __future__ import annotations
import dataclasses
import edgedb
import uuid
import pydantic

class GetUserByNameResult(pydantic.BaseModel):
    id: uuid.UUID
    username: str

async def get_user_by_name(
    executor: edgedb.AsyncIOExecutor,
    *,
    name: str,
) -> list[GetUserByNameResult]:
    r = await executor.query(
        """\
        select User {
          username
        } filter User.username = <str>$name\
        """,
        name=name,
    )
    users = [
        GetUserByNameResult(
            id=user_raw.id,
            username=user_raw.username
        )
        for user_raw in r
    ]
    return users

import asyncio
async def main() -> None:
    client = edgedb.create_async_client()
    r = await get_user_by_name(client, name="jsimonlane")
    print('ran')
    print(r)

if __name__ == "__main__":
    asyncio.run(main())

That should fix everything.

Versions (please complete the following information):
Pydantic==2.4.2
Edgedb==1.7.0

  • OS: MacOS 13.3.1 (Ventura)
  • edgedb-python version: 1.7.0
  • EdgeDB CLI: EdgeDB CLI 3.5.0+e1ad387
  • EdgeDB: EdgeDB 3.3+d5734dd (repl 3.5.0+e1ad387)
  • Python version:3.11

Additional context
Add any other context about the problem here.

@jsimonlane jsimonlane changed the title Bindings don't work with latest version of pipenv Python CodeGen doesn't work with PydanticV2 Oct 15, 2023
@jsimonlane jsimonlane changed the title Python CodeGen doesn't work with PydanticV2 EdgeDB Python CodeGen doesn't work with PydanticV2 Oct 15, 2023
@jsimonlane jsimonlane changed the title EdgeDB Python CodeGen doesn't work with PydanticV2 PydanticV2 breaks EdgeDB's Python CodeGen Oct 15, 2023
@yallxe
Copy link

yallxe commented Oct 30, 2023

Any updates?

@raddevon raddevon added the bug Something isn't working label Nov 14, 2023
@jsimonlane
Copy link
Author

In this conversation in the EdgeDB discord, @1st1 said:

@fantix and I will see if we can fix this on our end (or maybe we can help fix pydantic) after EdgeDB Day on November 1st

So I imagine this is in the works.

@fantix
Copy link
Member

fantix commented Nov 30, 2023

Thanks @jsimonlane for your proposal! I believe it would be better if we also:

  1. Avoid hard dependencies. This means even if the --skip-pydantic-validation switch is off, we still generate code that can be used both with and without Pydantic, so that the same generated code can be shared.
  2. Reduce overhead when possible, like we already know that, the type of query result is always the same as expected, so running Pydantic validation is just unnecessary. It would be the best if we can construct the query values only once, and pass them to the Pydantic v2 Rust serialization directly.

Thus, the hack is unfortunately to stay. I pushed a fix in #468

@jsimonlane
Copy link
Author

Thanks for the fix, @fantix .

A note on the fix in #468 -- the fix got most of the way there, but it turns out it broke FastAPIs OpenAPI schema generation, because the fix effectively type-erases the Pydantic type (instead of just skipping validation). FastAPI uses thta type to generate its OpenAPI schema, but now that it is 'any', it is extremely unhelpful.

This likely won't matter for the average user, but if you're trying to make everything type safe via the OpenAPI, it will affect things.

For instance if you look at FastAPI's EdgeDB tutorial, then that will break OpenAPI.

I got around this by essentially re-declaring the output object, thus separating out my EdgeDB layer from my FastAPI layer -- which is probably better design, anyway.

Something to note.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants