Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix case-sensitivity issue with views and column masks #24055

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ksobolew
Copy link
Contributor

@ksobolew ksobolew commented Nov 7, 2024

Description

The table column reference was registered incorrectly with the original case taken from the view definition. It then failed to match the column schema returned from SystemAccessControl and the mask was not applied.

Instead of sprinkling toLowerCase() here and there, we will associate the original Field with the column mask and use Field#canResove to do the matching. The problem with this is that there's no way to do efficient lookups by name in a case-insensitive way, so we have to iterate the list of Field-Expression pairs to find a match.

Fixes #24054.

Additional context and related issues

I wonder how many more issues like that will we find...

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## Section
* Fixed column masks not applying on views declaring non-lowercase column names. ({issue}`24054`)

@cla-bot cla-bot bot added the cla-signed label Nov 7, 2024
@ksobolew ksobolew requested review from martint and dain November 7, 2024 08:42
tableColumnReferences.put(field.getOriginTable().get(), field.getOriginColumnName().get());
tableColumnReferences.put(field.getOriginTable().get(), field.getOriginColumnName().get().toLowerCase(ENGLISH));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martint I sprinkled those toLowerCase in enough places to fix this particular issue, but I'm not entirely sure if these are the right places. I think the convention is to keep the original case in the AST and only convert it as needed in the analyzer, even though it would probably be easier to fix all issues like this by storing converted names in AST. WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Identifiers need to be resolved according to standard sql rules. It differs for quoted vs non-quoted identifiers.

Currently, there are some inconsistencies in how they are being handled in various places, but we have a long standing issue to fix them. See #17

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood @martint. The question is, though, how far this should extend to interfaces like access control. This is especially relevant to what I call "negative grants" - column masks, row filters and DENY policies. If an access control declares a column mask on column foo, it should apply to foo and Foo but not to "Foo", if I understand correctly. One could argue that this is unexpected and may cause information leaks. To counter that someone could say that if the user uses case-sensitive names then they know what they are doing. But to counter that, some access controls coerce all names to lower case, assuming that they will be matched case-insensitively. So I am not sure what to do next.

I believe this needs to be fixed somehow right now, because at this moment this enables accidental information leaks, like described above, but with regular identifiers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can fix this by associating masks to the corresponding Field in StatementAnalyzer and Analysis, instead of doing it by name. Then, in RelationPlanner, the lookup can be done directly by Field instead of relying on the name having the right case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martint I pushed a prototype of a solution which (I think) implements your suggestion. There's no way to do efficient lookups of Fields, though, so in some places we need to do a linear scan of the list of masks in order to find a match.

@ksobolew ksobolew requested a review from kokosing November 7, 2024 08:48
@ksobolew ksobolew changed the title Fix case-sensitivity issue view views and column masks Fix case-sensitivity issue with views and column masks Nov 7, 2024
@ksobolew ksobolew force-pushed the kudi/column-mask-case-sensitivity branch from 230baaa to 0ede278 Compare November 7, 2024 09:14
The table column reference was registered incorectly with the original
case taken from the view definition. It then failed to match the column
schema returned from `SystemAccessControl` and the mask was not applied.

Instead of sprinkling `toLowerCase()` here and there, we will associate
the original `Field` with the column mask and use `Field#canResove` to
do the matching. The problem with this is that there's no way to do
efficient lookups by name in a case-insensitive way, so we have to
iterate the list of `Field`-`Expression` pairs to find a match.

Fixes trinodb#24054.
@ksobolew ksobolew force-pushed the kudi/column-mask-case-sensitivity branch from 0ede278 to 32dd720 Compare December 18, 2024 10:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Column masks don't work when view declares columns with uppercase names
3 participants