Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement field length introspection and make available in schema #3552

Merged
merged 24 commits into from
May 23, 2023

Conversation

Ten0
Copy link
Member

@Ten0 Ten0 commented Mar 16, 2023

Resolves #3551

@weiznich
Copy link
Member

Thanks for working on this 👍
I'm general in favor of including these information in the generated schema.rs file. Although I'm not user whether the current approach is the best variant, especially in terms of abstractions that can be build later on top of this. I see the following alternatives:

Use a trait that is implemented for the column types for the size limit.

Something like that:

// the name of that trait is obviously up for discussion
trait SizeRestrictedColumn: Column {
    // maybe that should be even `Option<usize>` 
    // so that we can implement it for all columns?
    pub const MAX_LENGTH: usize;
}

// generated via table!
impl SizeRestrictedColumn for users::name {
    pub const MAX_LENGTH: usize = 42;
}

This allows to write generic code that abstracts over columns with certain properties. Additionally that would give a good place to document what's there.

Use const generics

A other probably more crazy idea is to use const generics on the sql type for this. So instead of having

#[derive(SqlType)]
struct VarChar;

we would have something like

#[derive(SqlType)]
struct VarChar<const SIZE: u32 = 0>;
// ideally we would use `Option<NonZeroU32>` as type there,
// but that's not supported yet

We could then include these information directly into the generated type:

table! {
    users {
         id -> Integer,
         // unrelated note: We might want to prefer
         // that syntax anyway?
         name -> VarChar<255>,
    }
}

that feels like a natural extension of the existing system. On the other hand I'm not sure yet whether that would be result in type coercing issues around the "different" text types (as each of VarChar<1>, VarChar<2> … is a different type then) and whether that would require an major breaking change.

@Ten0
Copy link
Member Author

Ten0 commented Mar 17, 2023

trait SizeRestrictedColumn -> I like that name :)

I see the following alternatives

I agree this current implementation doesn't enable building on top of it generically. I had considered these alternatives and eventually chose the current implementation, because

Const generics

There wouldn't be a 1-to-1 mapping between actual SQL types and our SQL types, which I suspect would make a bunch of stuff much more complex that it would otherwise need to be, while I'm not sure what concrete abilities we gain from doing that vs the trait variant.

The downsides of that would for instance be the point you underlined about coercion support, plus even if we make everything generic we would probably end up accidentally monomorphizing a bunch of times a bunch of different functions that don't need to be monomorphized, e.g. an implementation of FromSql<String, Varchar> can be reused regardless of the length of the particular varchar.

The trait variant

It's easier to reference without a trait as that way you don't need to have the trait in scope, and that's enough for the original use-case. Not creating a such trait right now doesn't prevent us from doing it later when we actually need it (and design it with Option or not Option depending on the actual use-case).

That being said one thing I hadn't considered was that we could simply put in the Diesel prelude, so arguably that would be reasonably easy to use then, and that would enable some generic possibilities on the user side. Also I hadn't considered where to put the documentation.

I'm going to rewrite it this way :)

The option variant

The original use-case is basically to be able to write:

let abc_def = input.abc_def.as_str().validated_len_lte(schema::abc::def::MAX_LENGTH)?;

when validating endpoint inputs.

I think it's nice in that context to be able to check at compile time that all input fields that end up being directly registered in the database are length-constrained. That also means users will typically not need a dedicated interface that takes an Option to perform such validation, and we avoid things such as xx::MAX_LENGTH.unwrap().

We can always add that as another trait later if that turns out to be useful.
That being said I'm also open to providing both right away (add const OPT_MAX_SIZE: Option<usize> on Column?)

We might want to prefer Varchar<123> as syntax

That makes parsing in the macro much harder because the parser will eat that part of the input as it attempts to parse a syn::Type. In addition if it actually isn't implemented in terms of generics on the SqlType, I fear that may be confusing to the users. (We can't use parentheses either because type parsing will try to consume them as well due to the Fn(A,B) -> C syntax, then error.)

@weiznich
Copy link
Member

Thanks for the explanation. Sounds reasonable for me 👍

If we ever decide that using const generics is the better solution that remains possible, so it's fine to go with the trait solution. Also the reasoning about Option vs non-Option is fine.

The only minor thing I would adjust is probably using NonZeroU32 as column size type, as postgres documentation states: "If specified, the length must be greater than zero and cannot exceed 10485760". For mysql the upper limit seems to be 255

We might want to prefer Varchar<123> as syntax

That makes parsing in the macro much harder because the parser will eat that part of the input as it attempts to parse a syn::Type. In addition if it actually isn't implemented in terms of generics on the SqlType, I fear that may be confusing to the users. (We can't use parentheses either because type parsing will try to consume them as well due to the Fn(A,B) -> C syntax, then error.)

I must admit that using {} looks strange in that context. If parsing Varchar<123> is to hard, what's about just using attributes there? So something like:

table! {
    whatever {
        #[size = 123]
        name -> Varchar,
    }
}

I feel that's closer to rusts syntax than the other solution and it should be easier to parse.

@Ten0
Copy link
Member Author

Ten0 commented May 15, 2023

what's about just using attributes there?

Didn't think of that option! That's indeed a much more readable syntax :)

using NonZeroU32

NonZeroU32 seems more powerful indeed.
Which do you think is the better option among:

  1. We update the trait so that it's a function and not a const and we can unwrap the NonZero construction in there
  2. Generating unsafe code as part of the table! macro to call new_unchecked (checking non-zero in the macro of course)
  3. Having the const be an u32

Copy link
Member

@weiznich weiznich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good now (beside the two minor things)

We should likely add a changelog entry for this as well.

diesel_cli/src/infer_schema_internals/data_structures.rs Outdated Show resolved Hide resolved
diesel_table_macro_syntax/tests/basic.rs.in Outdated Show resolved Hide resolved
@Ten0
Copy link
Member Author

Ten0 commented May 15, 2023

Looks good now

I think we haven't resolved your comment about u32 yet: #3552 (comment)

Co-authored-by: Georg Semmler <github@weiznich.de>
@weiznich
Copy link
Member

I totally missed that. I think I would go with variant 2, even if that involves unsafe code.

@Ten0
Copy link
Member Author

Ten0 commented May 16, 2023

I totally missed that. I think I would go with variant 2, even if that involves unsafe code.

Mysql goes from 0 to u16::MAX
That new information leads me to think that usize may actually be the better compromise (given that this is likely only gonna be used to compare with &str len or &[u8] len)

@weiznich
Copy link
Member

In that case usize seems reasonable, so go for that 👍. In the end the actual type probably does not matter that much as you can always cast it to whatever type you need in your application.

@Ten0 Ten0 marked this pull request as ready for review May 17, 2023 21:50
Co-authored-by: Georg Semmler <github@weiznich.de>
@weiznich weiznich mentioned this pull request May 22, 2023
@Ten0
Copy link
Member Author

Ten0 commented May 22, 2023

Should we consider the fact that we update the schema file that we generate a breaking change?
This would be somewhat breaking if people patch their generated schema.
If yes, we should probably make it a new option not enabled by default.
I have no strong opinion on the topic.
On the one hand, there's this potential instant breakage at 2.1, and on the other hand there's immediately making people aware of the new feature.

@weiznich
Copy link
Member

I wouldn't consider that as breaking change, because otherwise we cannot reasonable change anything in the generated schema.rs file without putting it behind an option.

…when field length is not specified at creation
@Ten0
Copy link
Member Author

Ten0 commented May 22, 2023

I will most likely squash-and-merge this once I'm done testing it on our codebase, most likely tomorrow.

@weiznich
Copy link
Member

👍

@Ten0 Ten0 merged commit 201e4b8 into diesel-rs:master May 23, 2023
@Ten0 Ten0 deleted the field_length_introspection branch May 23, 2023 13:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants