Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix BigQuery tests for column names with special characters #16613

Merged

Conversation

ssheikin
Copy link
Contributor

@ssheikin ssheikin commented Mar 17, 2023

Some column names started to fail because of BigQuery internal change
~ between
18:00 16 of March 2023 GMT and
00:00 17 of March 2023 GMT,
first noticed when 90d6f07 merged to
master.

Description

Fixes #16603

Release notes

(x) This is not user-visible or docs only and no release notes are required.

@@ -13,6 +13,7 @@
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit lost. Are we testing bigquery here? @ebyhr can you please let me know what is the point of test based on incorrect column names?

As I understand we should only update filterColumnNameTestData

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing unsupported column names helps us find SQL injection vector.

@Override
protected Optional<String> filterColumnNameTestData(String columnName)
{
if (NOT_ALLOWED_ANYMORE_COLUMN_NAMES.contains(columnName)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating and inserting into tables with these columns works and selecting the table isn't problem, right? This change hides the fact and we can't understand what happened after the operations. It would be better to deny such columns in all operations, or verify the result with different way, e.g. use query based API instead of storage API.

@@ -13,6 +13,7 @@
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing unsupported column names helps us find SQL injection vector.

@ssheikin ssheikin force-pushed the ssheikin/29/oss/bigquery_column_names branch 4 times, most recently from 58c7ac8 to 5d17c67 Compare March 20, 2023 23:36
@ssheikin
Copy link
Contributor Author

@ebyhr hope I've addressed your comments.
I think this PR is good enough to fix failing tests. However the additional work required to be able to SELECT from such columns again.
Please review.


// these column names started to fail because of BigQuery internal change ~ between 18:00 16 of March 2023 GMT and 00:00 17 of March 2023 GMT
// it's still possible to CREATE TABLE with such column names, and even INSERT some data, but it is not possible to SELECT from it.
// TODO is it possible to improve trino to be able to SELECT from such columns again?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good question. But maybe we should prevent creating columns with names containing the "prohibited" characters to avoid this gotcha?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's handle in follow-up #16664. We would like to fix CI failure first.

Some column names started to fail because of BigQuery internal change
~ between
18:00 16 of March 2023 GMT and
00:00 17 of March 2023 GMT,
first noticed when 90d6f07 merged to
master.
@ebyhr ebyhr force-pushed the ssheikin/29/oss/bigquery_column_names branch from 5d17c67 to dd467dd Compare March 22, 2023 03:16
@ebyhr ebyhr merged commit 52e3884 into trinodb:master Mar 22, 2023
@github-actions github-actions bot added this to the 411 milestone Mar 22, 2023
@ssheikin
Copy link
Contributor Author

@ebyhr Thank you for improving and merging

@ssheikin ssheikin deleted the ssheikin/29/oss/bigquery_column_names branch April 12, 2024 09:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bigquery BigQuery connector cla-signed
4 participants