-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for Postgres enums #580
Conversation
…ostgres enums. No actual encoding/decoding is done yet, but codegen compiles and looks reasonably okay. Testing is lacking, so the code is far from guaranteed from being correct at this point.
Encoding/decoding of actual enum values is not yet supported. (impls are suggested but just call unimplemented!). Enum-valued table columns is also yet not supported.
New impls of HasSqlType cannot be defined for types inside the diesel crate, so we can't do something like: impl HasSqlType<NewType> for diesel::pg::Pg { ... } This commit adds the trait ProvidesSqlType, which can be implemented for new types, and a blanket impl so that any type with a suitable impl of ProvidesSqlType gets a corresponding impl of HasSqlType. So new types can be added with an impl declaration like: impl ProvidesSqlType<diesel::pg::Pg> for NewType { ... }
diesel_codegen/Cargo.toml: add dependency on syntex_syntax in an attempt at getting access to the "self" keyword for qualifying symbol lookups. May not be the right way to go. diesel_codegen/src/lib.rs: raise macro recursion limit to enable builds with inferred enum types. Pull in syntex_syntax. diesel_codegen/src/schema_inference.rs: propagate information about types being builtin or derived from the DB schema. Some fuzziness here about how exactly to do local name lookups (WRT "self" and not the crate root). diesel_codegen/src/type_inference.rs: move some utility functions to diesel_codegen_shared/src/schema_inference/pg.rs. Update enum code generation to try to implement with all necessary interfaces. diesel_codegen_shared/src/lib.rs: add conditional dependency on itertools. diesel_codegen_shared/src/schema_inference/data_structures.rs: track whether a column type is a builtin or custom. diesel_codegen_shared/src/schema_inference/mod.rs: conditional dependency on itertools, export utility functions. diesel_codegen_shared/src/schema_inference/pg.rs: move utility functions here. Add hacky heuristic for detecting if a type is built-in. diesel_codegen_shared/src/schema_inference/sqlite.rs: mark all SQLite types as builtin. diesel_tests/tests/associations.rs: add values for new enum type columns. diesel_tests/tests/boxed_queries.rs: add values for new enum type columns. diesel_tests/tests/errors.rs: add values for new enum type columns. diesel_tests/tests/expressions/mod.rs: add values for new enum type columns. diesel_tests/tests/filter.rs: add values for new enum type columns. diesel_tests/tests/filter_operators.rs: add values for new enum type columns. diesel_tests/tests/find.rs: check values of new enum type columns. diesel_tests/tests/insert.rs: insert values for new enum type columns. diesel_tests/tests/joins.rs: check values of new enum type columns. diesel_tests/tests/macros.rs: add values for new enum type columns. diesel_tests/tests/order.rs: add values for new enum type columns. diesel_tests/tests/schema.rs: add a call to infer_enums! so that we get our new enum types. Unfortunately, it's still necessary to add an additional "pub use self::blah" because I have not yet figured out how to import the new enum types from their captor module automagically. diesel_tests/tests/select.rs: select values for new enum type columns. diesel_tests/tests/update.rs: update values for new enum type columns. It might be a good idea to try to add a test updating an enum typed column. migrations/postgresql/20161224024421_add_enum_columns/{up,down}.sql: add a column for a new enum type. Still don't yet have things working in custom Postgres schemas yet.
This feels like a bit of a hack, since it side-steps the type resolution issue and continues to rely on a hard-coded heuristic for figuring out which types are declared where, but it seems reasonable. Users of the library can declare or "pub use" anything they need in the namespace they're working with. This leaves types in schemas out in the cold, unless we're willing to assume some sort of convention of parallel structure between Rust namespaces in the extra types module and DB schemas. diesel_codegen/src/lib.rs: bring up to date with changes from master. diesel_codegen/src/schema_inference.rs: Thread extra types module through. diesel_codegen/src/type_inference.rs: bring up to date with changes from master. diesel_codegen_shared/src/schema_inference/mod.rs: Thread extra types module through. diesel_codegen_shared/src/schema_inference/pg.rs: Thread extra types module through and resolve non-builtin types in that namespace. diesel_tests/tests/lib.rs: new module for automagic enum decls. diesel_tests/tests/schema.rs: pass new macro parameter for the extra types module
…tom_types_in_database_columns
Nice! I saw your branch a while ago, and am glad to see this PR. This is quite a lot of code! I'll try to read through it this evening, or tomorrow afternoon. Feel free to bug me if I forget about it though :) |
Some record of the original variant name is necessary so that SQL generation uses the right literals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whew, this is a lot of new code. That's amazing work!
I read through it, but probably not as in-depth as I should have (esp. the type_inference
module). You seem to be working on adding even more things anyway :)
Before this reaches the magic 1000 lines mark: Do you think it'd make sense to land this in several batches? E.g., adding support for custom schemas and enums in PG array can probably be added in future PRs.
|
||
fn infer_enums_for_schema_name(database_url: &str, schema_name: Option<&str>, | ||
type_list: Option<&str>, camel_case_types: bool, | ||
camel_case_variants: bool) -> quote::Tokens { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this is rustfmt? Can you try to match the fn declaration style used in e.g. column_def_tokens
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Formatting is mostly a combination of what rust-mode does in Emacs and my attempts at respecting cosmetic issues like line length as per a hazy recollection of the recommendations aturon compiled. If you prefer, I'll run rustfmt over things before declaring this ready for a merge.
infer_table_from_schema!(#database_url, #table_name); | ||
} | ||
pub use self::#mod_ident::*; | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you refactor this to let macro_call = if let Some(type_module) = extra_types_module { quote! { infer_table_from_schema!(extra_types_module=#ype_module, #database_url, #table_name); } } else { quote! { infer_table_from_schema!(#database_url, #table_name); } }
so the mod
and pub use
doesn't need to be repeated?
@@ -1,4 +1,5 @@ | |||
use syn; | |||
// use syntex_syntax::symbol::keywords; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only place where you use syntex right? Please please please remove this dependency if it's not going to be used, it takes ages to compile. (And if we need something that syn doesn't have in this regard, I'd gladly make a PR to syn!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I was trying to use syntex to work around some issues but don't need it right now. I'll kill the dependency if it's not needed (which it probably won't be) before any merge.
@@ -74,19 +88,51 @@ mod information_schema { | |||
} | |||
} | |||
|
|||
pub fn determine_column_type(attr: &ColumnInformation) -> Result<ColumnType, Box<Error>> { | |||
pub fn camel_cased(snake_case: &str) -> String { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! Finally! PascalCase! Errrr, I mean, "upper camel case". (On a more serious note, I think this fn
should be called snake_to_type_case
.)
Please note that all other case conversions we do are trivial conversions by choice, to not introduce surprisingly complex behavior.
The snake case version we get here are valid Postgres identifiers, right? What properties do they have in comparison to Rusts type identifiers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK on changing the method name.
Right now, this PR converts a Postgres type name or variant foo_bar
into FooBar
on the Rust side. There are bool arguments to the macro expansion methods to control whether this happens, but they aren't currently used because there is no macro-level interface for toggling them.
I think it makes sense to map to Rust-style naming, but I am willing to be convinced otherwise. If any name manging could happen, clearly documenting default behavior and providing an option to turn it off are also necessary.
#[test] | ||
fn camel_cased_i18n() { | ||
assert_eq!("Außerdem", camel_cased("außerdem")); | ||
assert_eq!("AuSSerdem", camel_cased("au_ßerdem")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
N̵͡i̛c͘e ̀e͜d̨͘͢g͜͡e͠͞ ̴̛c̶̨͞a͢͜s͞ȩ̵s.͟͠
Can enum variant names contain umlauts, though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Rust, it's an open issue.
In Postgres, it appears perfectly valid to have both types and variants contain umlauts. I'll have to dig deeper into the documentation to see if there are any surprises with regards to other unusual characters.
It seems like eventual convergence (arbitrary whitespace-free Unicode allowed) is an eventual goal, but it would probably be better to check explicitly that we're generating a legal type name and die with a reasonable explanation rather than letting rustc barf.
/// schema name. | ||
/// | ||
/// This macro can only be used in combination with the `diesel_codegen` or | ||
/// `diesel_codegen_syntex` crates. It will not work on its own. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a heads up: diesel_codegen_syntex will be removed in 0.10
@@ -1,4 +1,5 @@ | |||
#![deny(warnings)] | |||
#![recursion_limit = "512"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this needed for? (It's fine, just maybe add a comment what requires this, so we can remove it when it's no longer needed.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The quote!
macro makes use of (potentially deep) recursion, and my large invocations of it are running into the default limit. This number should be tuned to fit whatever this PR's eventual needs are; it may be that we wind up needing something smaller at first.
@@ -249,10 +249,20 @@ pub type VarChar = Text; | |||
#[cfg(feature = "postgres")] | |||
pub use pg::types::sql_types::*; | |||
|
|||
pub trait ProvidesSqlTypeFor<DB> where DB: TypeMetadata { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would love to see a doc comment here telling me what this type is all about. I assume it abstracts over structs and enums providing SQL types? Maybe also mention HasSqlType
is implemented for everything that implements this trait.
quote = "0.3.10" | ||
diesel = { version = "0.9.0", default-features = false } | ||
diesel_codegen_shared = { version = "0.9.0", default-features = false } | ||
|
||
[dependencies.syn] | ||
version = "0.10.3" | ||
features = ["full"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use the same inline table syntax as above?
syn = { version = "0.10.3", features = ["full"] }
@@ -10,11 +10,15 @@ repository = "https://github.com/diesel-rs/diesel/tree/master/diesel_codegen" | |||
keywords = ["orm", "database", "postgres", "sql", "codegen"] | |||
|
|||
[dependencies] | |||
syn = "0.10.3" | |||
syntex_syntax = "0.52.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope we can not introduce this dependency. Syntex takes ages to compile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I don't think this will be needed.
This can be broken up fairly easily; it is quite feasible to implement support simply for generating enum types to match what's in a database. I'll make changes to the PR that address the comments so far when I have more time this week. |
@@ -12,6 +12,22 @@ | |||
/// This macro can only be used in combination with the `diesel_codegen` or | |||
/// `diesel_codegen_syntex` crates. It will not work on its own. | |||
macro_rules! infer_schema { | |||
(extra_types_module = $extra_types_module: expr, $database_url: expr) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't like the solution of the user manually specifying extra_types_module
.
From a usability standpoint, it would be nicer to be able to infer_everything!
and just get tables, types, and (eventually) views from the database.
From a design standpoint, this is just kicking the problem down the field a short distance. As-is, all peers to SQL types live in diesel::types
, and library internals freely assume this. Looking up types that are not on a whitelist on a single user-specified alternate path works for now, but a few problems are readily evident:
- All DB-derived types are assumed to live in one namespace. In Postgres, you have the potential for multiple namespaces (which naturally arise from each distinct database connection string, or even from schemas nested inside of a database). This could lead to name clashes if, for example, identically named types are declared in different databases or schemas. (It seems like a good solution is to recommend strongly that different invocations of
infer_schema!
use differentextra_types_module
settings, but I'd prefer a cleaner design where remembering to do that sort of thing isn't necessary.) - The binding between a DB type and a Rust type is implicit (based on type name alone). This loose coupling could lead to bugs whenever there are name clashes.
- It is quite conceivable that a user will want to use types that come from disparate Rust namespaces, at which point they will have to do a lot of tedious and potentially confusing re-exporting with
pub use
(so that everything is in one namespace), or we will have to extend this mechanism to support having multiple extra type paths.
This could be solved by creating an intermediate representation of types in a database schema and mapping DB types to Rust types through that. This would require threading some sort of global table state through most of the code generation process, so the effects of adopting this design would be felt throughout a lot of Diesel. (Perhaps simply having a global shared structure would be adequate, since it will only be used at compilation time.)
Having a central, explicit set of data structures that describe types, tables, views, procedures, etc., could be quite useful. I suspect that some sort of compile-time metadata model would be very helpful for validating what is read from the database ("table has a primary key structure we don't support"), generating good error messages ("typename foobaz is not a valid Rust identifier"), debugging/testing (write tests against the intermediate representation code), streamlining code generation (all necessary state is in one obvious place), and error reporting (have richer local state from which to produce meaningful error messages).
This approach could go very wrong by being too top-heavy and prematurely putting straightjacketing Diesel. The project has gotten really far without building a top-down intermediate model of database components, and maybe it's best to keep it that way for now. It would be unpleasant to have to refactor a big metadata layer every time you want to support a new DB feature.
If it seems a good idea, I suggest working towards merging just the enum declaration part of this PR and opening a separate issue to discuss whether a metadata model is necessary or a good idea. That discussion may yield solutions to the problems I've described here which can be used to add DB type support to infer_schema!
without having to commit to a great deal more overhead.
Hi! |
I apologize for not having been on top of this. I may be able to put some more time into it this weekend. I still need to rebase on top of the latest Diesel release and restrict this PR's scope to consist only of generating types for Postgres enums. |
After looking at how far diesel has progressed since I wrote this PR, I have concluded that it must be rewritten. The basic approach (query Postgres catalog tables to get enum definitions) is still perfectly valid, but diesel's switch to using SQL schema standard means that the way this is done has to change. I am closing this PR and will move discussion back to #343. |
This PR provides the
infer_enums!
macro, which creates Rust enums for each enum discovered in a database, and extendsinfer_schema!
so that you can name columns .See #343 for discussion about how this has been developed up until now.
Items to be addressed before this is ready for merge:
infer_schema!
and related macros. (Argument-passing to indicate where to look up non-builtin types is rather unslightly.)