-
Notifications
You must be signed in to change notification settings - Fork 14.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(chart-data-api): assert referenced columns are present in datasource #10451
fix(chart-data-api): assert referenced columns are present in datasource #10451
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM once CI passes
@willbarrett I'll add a few more assertions and improve the tests, should be done in a few days. |
23b59b7
to
24fd909
Compare
909549c
to
57aaed8
Compare
column_names = [ | ||
"created_on", | ||
"changed_on", | ||
"id", | ||
"start_dttm", | ||
"end_dttm", | ||
"layer_id", | ||
"short_descr", | ||
"long_descr", | ||
"json_metadata", | ||
"created_by_fk", | ||
"changed_by_fk", | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These had to be added, as AnnotationDatasource
s don't have any defined columns
.
payload["queries"][0]["groupby"] = ["currentDatabase()"] | ||
query_context = ChartDataQueryContextSchema().load(payload) | ||
query_payload = query_context.get_payload() | ||
assert query_payload[0].get("error") is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is assert
preferable to self.assertEqual
or self.assertIsNotNone
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is the pytest
way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some doubts regarding how metrics are handled
table = self.get_table_by_name(table_name) | ||
payload = get_query_context(table.name, table.id, table.type) | ||
payload["queries"][0]["groupby"] = ["name"] | ||
payload["queries"][0]["metrics"] = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we able to deny an injected metric
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't been able to create one yet, would be interested to see if someone is able to do one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not easy that's for sure
table_name = "birth_names" | ||
table = self.get_table_by_name(table_name) | ||
payload = get_query_context(table.name, table.id, table.type) | ||
payload["queries"][0]["groupby"] = ["currentDatabase()"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If currentDatabase()
is a defined metric will it run ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't, as it isn't an aggregate expression, hence will be missing from the groupby causing an invalid query.
Codecov Report
@@ Coverage Diff @@
## master #10451 +/- ##
==========================================
+ Coverage 59.49% 59.88% +0.38%
==========================================
Files 767 413 -354
Lines 36282 13433 -22849
Branches 3430 3430
==========================================
- Hits 21587 8044 -13543
+ Misses 14502 5196 -9306
Partials 193 193
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report at Codecov.
|
…rce (#10451) * fix(chart-data-api): assert requested columns are present in datasource * add filter tests * add column_names to AnnotationDatasource * add assertion for simple metrics * lint
…rce (apache#10451) * fix(chart-data-api): assert requested columns are present in datasource * add filter tests * add column_names to AnnotationDatasource * add assertion for simple metrics * lint
…rce (apache#10451) * fix(chart-data-api): assert requested columns are present in datasource * add filter tests * add column_names to AnnotationDatasource * add assertion for simple metrics * lint
…rce (apache#10451) * fix(chart-data-api): assert requested columns are present in datasource * add filter tests * add column_names to AnnotationDatasource * add assertion for simple metrics * lint
SUMMARY
Currently it might be possible to perform limited SQL injection via both the legacy and new chart data API by manipulating the
columns
,groupby
,filters
ormetrics
properties ofQueryObject
. This plugs that hole both on the new and legacy chart data API.Further work is required to ensure other fields aren't vulnerable to SQL injection. However, to keep the PR size manageable I suggest limiting the scope of this PR to fields with direct column references and adding more assertions later to the
where
andhaving
fields +SQL
type ad-hoc metrics.TEST PLAN
ADDITIONAL INFORMATION