Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

create_pydantic_model returns models with wrong fields marked as null #1132

Open
HakierGrzonzo opened this issue Dec 13, 2024 · 5 comments · May be fixed by #1141
Open

create_pydantic_model returns models with wrong fields marked as null #1132

HakierGrzonzo opened this issue Dec 13, 2024 · 5 comments · May be fixed by #1141

Comments

@HakierGrzonzo
Copy link

When creating pydantic models with create_pydantic_model function, columns marked as null are not marked as null in the data. Instead the function uses the required property.

How to recreate

Consider the following application:

from piccolo.table import Table
from piccolo.utils.pydantic import create_pydantic_model
from piccolo import columns

# We need something that consumes the Pydantic models
from fastapi import FastAPI

import json

class SomeThing(Table):
    name = columns.Text(null=False)
    pet_name = columns.Text(null=True)
    age = columns.Integer(required=True, null=True)

some_model = create_pydantic_model(SomeThing)

app = FastAPI()

@app.get('/')
def foo() -> some_model:
    pass

# We print the OpenAPI schema generated for table `SomeThing`
print(json.dumps(app.openapi()['components']['schemas']['SomeThing']))

The output returned is:

{
  "properties": {
    "name": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "title": "Name",
      "extra": {
        "nullable": false,
        "secret": false,
        "unique": false,
        "widget": "text-area"
      }
    },
    "pet_name": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "title": "Pet Name",
      "extra": {
        "nullable": true,
        "secret": false,
        "unique": false,
        "widget": "text-area"
      }
    },
    "age": {
      "type": "integer",
      "title": "Age",
      "extra": {
        "nullable": true,
        "secret": false,
        "unique": false
      }
    }
  },
  "type": "object",
  "required": [
    "age"
  ],
  "title": "SomeThing",
  "extra": {}
}

As we can see, the fields name and pet_name are marked as str | None, despite having different null options set.

In contrast, the age field is marked as an int, which may never be None, despite it being possibly null in the database.

What causes this behavior:

is_optional = True if all_optional else not column._meta.required

In the above line, the type for the field is set as T | None based on the required property, instead of the null property.

This generates plausible models for writing to the database, but not for reading data from the database (we get rogue nullable properties).

Why does this behavior need to be changed:

Let's consider a simple change to the example above:

from uuid import uuid4
from piccolo.table import Table
from piccolo.utils.pydantic import create_pydantic_model
from piccolo import columns

from fastapi import FastAPI

import json

class SomeThing(Table):
    id = columns.UUID(default=uuid4(), null=False)

some_model = create_pydantic_model(SomeThing)

app = FastAPI()

@app.get('/')
def foo() -> some_model:
    pass

print(json.dumps(app.openapi()['components']['schemas']['SomeThing']))

We have an autogenerated uuid column, and we would like to create a model that describes a row in that table. We cannot mark id as required as it might be generated by the database.

The generated OpenAPI schema is:

{
  "properties": {
    "id": {
      "anyOf": [
        {
          "type": "string",
          "format": "uuid"
        },
        {
          "type": "null"
        }
      ],
      "title": "Id",
      "extra": {
        "nullable": false,
        "secret": false,
        "unique": false
      }
    }
  },
  "type": "object",
  "title": "SomeThing",
  "extra": {}
}

This causes various OpenAPI client generators to generate the following type:

interface SomeThing {
  id: string | null
}

This makes it seem that the id might be null, when it will never be null.

How to fix this bug:

For my purposes, I just run this patch:

64629ca

This causes piccolo to generate the correct models for read operations.

For write purposes, I just use all_optional.

@sinisaos
Copy link
Member

We have an autogenerated uuid column, and we would like to create a model that describes a row in that table. We cannot mark id as required as it might be generated by the database.

@HakierGrzonzo I just have one question? Why adding a required argument is problematic, can you please give some example? An required argument is for writing to the database, and that will ensure that we do not have null values ​​in the database when selecting data. Unless you have an existing database, which rows already have null values. In that case, we can add null=True in the Piccolo table definition. I'm sorry if I don't understand well and miss your point.

@HakierGrzonzo
Copy link
Author

I just have one question? Why adding a required argument is problematic, can you please give some example? An required argument is for writing to the database, and that will ensure that we do not have null values ​​in the database when selecting data.

@sinisaos Using required for specifying possibly null values in the schema fails in some ways:

  1. When a developer specifies null=False, required=False we get a lot of T | null values downstream, despite the database guaranteeing that the data will never be null (and any INSERT containing null will fail). This is a skill issue that happens very often (sample size=8, skill issue rate of 100%).
  2. When using PostgreSQL insert/update triggers, it is possible to add additional values to the INSERT query. This makes it so the field is NOT NULL in the database AND it is required=False in the piccolo schema.

Reason no.1 is common, annoying, but it is work-around-able.

Reason no.2 stems from piccolo not really working with database-driven defaults (see #919).

In conclusion, when generating the pydantic models, piccolo should treat the database as a source of truth for the can this field be null. The current required behavior when e.g. INSERTing data is ok, but when generating the models they should follow the database schema.

@sinisaos
Copy link
Member

sinisaos commented Dec 15, 2024

@HakierGrzonzo Thank you for your time and detailed explanation. You are right that if we use required=False we get T | null, but I assumed if we want a required field that is NOT NULLwe should use required=True and null=False like this.

def generate_uuid():
  return str(uuid.uuid4())

class SomeThing(Table, db=DB):
  uuid = columns.UUID(required=True, default=generate_uuid, null=False)
  # uuid = columns.UUID(required=True, default=generate_uuid) # null is False by default

I'm not sure if it's a good idea, but maybe we can force required to be True if null is False, to prevent user errors by changing this line

required=required,

to this

required=True if not null else required,

Thanks again.

@HakierGrzonzo
Copy link
Author

I'm not sure if it's a good idea, but maybe we can force required to be True if null is False, to prevent user errors by changing this line

This would fix the mentioned issue, thanks for responding.

@dantownsend
Copy link
Member

dantownsend commented Jan 14, 2025

Sorry for the slow reply on this.

I agree that the current behaviour is unintuitive. I remember people mentioning it in the past.

I've opened a PR, where null determines if it's optional or not:

class Band(Table):
    name = Varchar(null=True)

>>> create_pydantic_model(Band).model_fields['name'].annotation
Optional[str]

class Band(Table):
    name = Varchar(null=False)

>>> create_pydantic_model(Band).model_fields['name'].annotation
str

But required still works - it overrides null. The database column might be nullable, but if you still want the user to provide a value you can do this:

class Band(Table):
    name = Varchar(null=True, required=True)

>>> create_pydantic_model(Band).model_fields['name'].annotation
str

If you use all_optional it overrides everything, and the field is always Optional.

What do you both think?

It's potentially a breaking change, but because the behaviour of required is staying the same (i.e. it still takes precedent over null), hopefully it should be OK for most users.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants