Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check Constraint for Discriminator columns #20880

Closed
henrikdahl8240 opened this issue May 7, 2020 · 14 comments
Closed

Check Constraint for Discriminator columns #20880

henrikdahl8240 opened this issue May 7, 2020 · 14 comments

Comments

@henrikdahl8240
Copy link

It's very good with the newish Check Constraints for enum mappings, i.e. https://docs.microsoft.com/en-us/ef/core/what-is-new/ef-core-5.0/whatsnew#generation-of-check-constraints-for-enum-mappings.

Isn't it fair to say, that the same kind of check constraint should be made for discriminator columns?

@smitpatel
Copy link
Member

This is only relevant if discriminator mapping is marked as complete.

@henrikdahl8240
Copy link
Author

Where can I read about the issue of completion because I cannot find it here: https://docs.microsoft.com/en-us/ef/core/modeling/inheritance.

Does it mean, that the list of discriminator values is finite and this is achieved by invoking IsComplete() on DiscriminatorBuilder()?

@henrikdahl8240
Copy link
Author

@smitpatel Thank you very much.

Concerning

"EF Core 5.0 will now generate the following when a complete discriminator mapping is configured:"

This you achieve simply by invoking IsComplete(), which I don't think that it reads there?

@ajcvickers
Copy link
Member

We discussed this as a team and came to the conclusion that this type of constraint in the database is not very useful if you only use EF to access the database since EF will not generate SQL that violates the constraint. This is also true for enum check constraints. Therefore, we are going to move to a model where these things are plugins that can easily be added if if needed, but are not the default behavior when generating a database with EF. See #20897 and efcore/EFCore.CheckConstraints#1 for enum constraints. Check constraints for discriminator columns should follow the same approach.

@ajcvickers ajcvickers added this to the Backlog milestone May 9, 2020
@henrikdahl8240
Copy link
Author

It's also perfectly reasonable. I think we can say, that generally there are two schools, either the loose attitude approach, which you are an advocate for, or the strict attitude, where you carefully check everything at each abstraction level and obviously do it in the most performant way.

As I suggested #20868 it was exactly because I didn't want to face a performance penalty at each insert and update due to an inefficient implementation of the approach of strict attitude.

I think you could also say, that you obviously want support for SPARSE columns because the TPH (Table per Hierarchy) inheritance obviously favors that columns will be sparse by nature, because columns used for more derived types will be sparse at more superior as well as sibling levels. I think it could be quite reasonable if you could choose that EF Core would automatically mark such columns sparse because otherwise you will just waste resources maintaining a lot of explicit NULLs.

Likewise you can also say, that you obviously often want an index for the discriminator column because otherwise it will be expensive to find instances of a given type, for instance using OfType. I think this is quite analogous to the fact, that you often want an index for a foreign key column.

@roji
Copy link
Member

roji commented May 9, 2020

@henrikdahl8240 thanks for your thoughts, that makes a lot of sense.

The suggestion for SPARSE columns is definitely interesting, and it's true that this is a particularly good fit for TPH. I still wouldn't think EF Core should necessarily make any decisions here implicitly, since the actual sparseness is ultimately data-bound (how many actual rows exist without that column). However, it does seem valuable to provide users a way to configure this easily. FWIW adding this should be really quite simple (#8023).

@roji
Copy link
Member

roji commented May 9, 2020

BTW @henrikdahl8240 do you mind reopening this issue on https://github.com/efcore/EFCore.CheckConstraints?

@henrikdahl8240
Copy link
Author

henrikdahl8240 commented May 9, 2020

@roji Therefore I wrote

you

in

I think it could be quite reasonable if you could choose that EF Core would automatically mark such columns sparse

Because only "you" can know the distribution as the decision should be done statically. My point was, that you should be able to explicitly opt-in for this automatic production of SPARSE annotations. If you do not opt-in, no SPARSE annotation should be produced.

Basically the taller and wider your TPH hierarchy is, the more the need for SPARSE will generally be. The same goes for the need, that indexes on these values will be filtered because otherwise they will just be flooded by NULLs. The conceptual issue obviously is, that you want the property of this specific type to be indexed and not have it to be flooded by NULL "garbage" orginating from the other types in the hierarchy.

You should also be able to opt-in for producing DATA_COMPRESSION=PAGE for a table because if you know you'll mainly be supporting western languages you will be able to save huge amounts of space (half typically). As far as I remember, it's not currently offered by EF Core.

In my honest opinion EF Core has several short comings in these aspects of product maturation, which I find a great pity. If you are skilled in computer science and have a broader field of view, I would say all these matters are rather easy to observe the need for.

Yes, I know, we are getting a bit outside the scope of the subject header now.

@henrikdahl8240
Copy link
Author

@roji I am sorry, but what do you mean by

BTW @henrikdahl8240 do you mind reopening this issue on https://github.com/efcore/EFCore.CheckConstraints?

@roji
Copy link
Member

roji commented May 9, 2020

@henrikdahl8240 your points are well taken, but as a general rule when something is data-bound (how many actual rows exist of each type in the hierarchy), we have to be careful. Are you proposing that after an opt-in, all TPH columns (for all inherited types) be flagged as SPARSE? If your data happens to be dominated by a subtype in the hierarchy, that may actually cause a degradation. This is why we tend to avoid making automatic decisions whose effects depend on actual data. The same holds for filtered indexes - at least users can already specify that themselves. But this is all definitely worth considering.

You should also be able to opt-in for producing DATA_COMPRESSION=PAGE for a table because if you know you'll mainly be supporting western languages you will be able to save huge amounts of space (half typically). As far as I remember, it's not currently offered by EF Core.

Can you please open an issue for that (in this repo)?

I am sorry, but what do you mean by [...]

As @ajcvickers wrote above, although check constraints are a good idea, we'd rather they existed in an external plugin, as an opt-in. I've already setup https://github.com/efcore/EFCore.CheckConstraints with the enum check constraints, and I think that would be the right home for this feature as well. So I'm proposing to move this request to that repo and to close it here.

@henrikdahl8240
Copy link
Author

henrikdahl8240 commented May 9, 2020

@roji I generally assume, that I as a computer scientist know, what I am doing, what I want to do and how it should be done and it should just be efficient to do, so I can be as productive as possible and therefore earn as much as possible.

@roji No, I would rather say, that I should be able to do something like:
builder.Property(t => t.PropertyInQuestion).AsSparse()

It means an AsSparse() method on PropertyBuilder.
`
In this way I can very easily choose which columns I want to be SPARSE. In real life there can easily be an 80/20 rule, i.e. that 80% of the instances are of a given type. For such a type I obviously generally don't want it's columns to be marked with SPARSE. So it's generally a consideration which should be done per column.

How do you actually filter an index by a type in the hierarchy? Let's assume we have class Mammal with two sub classes Dog and Cat and Dog has property P how do you then make an index on P to be filtered so only Dog rows' P column make it to the index?

@roji
Copy link
Member

roji commented May 9, 2020

Re sparse columns, it seems like we're saying exactly the same thing. Allowing users to specify sparseness on a property is tracked by #8023 - it's quite an easy issue if you want to take a stab at it.

Re the filtered index, for TPH we're only dealing with a single, simple table and the index isn't aware of the hierarchy. So you could simply set up an index on P, with a filter only for rows whose discriminator is Dog's. This is quite straightforward and already fully supported (see the docs).

@roji
Copy link
Member

roji commented May 11, 2020

Closing in favor of efcore/EFCore.CheckConstraints#2

@roji roji closed this as completed May 11, 2020
@roji roji removed this from the Backlog milestone May 11, 2020
@ajcvickers ajcvickers reopened this Oct 16, 2022
@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Oct 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants