Add hidden $partition column to Hive connector #3582
@@ -384,9 +385,12 @@ public int getIndex()
        }
    }
    else {
        String partitionKeyValues = String.join("/", partitionKeys.stream()
                .map(partitionKey -> format("%s=%s", partitionKey.getName(), partitionKey.getValue()))
Do we need some escaping of / or = here?
Also, wouldn't MAP be a better type here instead of VARCHAR?
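To make the VARCHAR versus MAP trade-off concrete, here is a minimal sketch that assumes partition keys are held as an ordered name-to-value map. The class `PartitionRepresentation` and the method `asVarchar` are hypothetical names for illustration, not code from this PR. The joined string is ambiguous once a value contains / or =, while a map keeps each value intact without any escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionRepresentation
{
    // Hypothetical helper: joins name=value pairs with '/' and applies no escaping,
    // mirroring the shape of the code under review.
    static String asVarchar(Map<String, String> partitionKeys)
    {
        return partitionKeys.entrySet().stream()
                .map(entry -> entry.getKey() + "=" + entry.getValue())
                .collect(Collectors.joining("/"));
    }

    public static void main(String[] args)
    {
        // LinkedHashMap preserves the partition column order
        Map<String, String> partitionKeys = new LinkedHashMap<>();
        partitionKeys.put("col1", "a/b");
        partitionKeys.put("col2", "x=y");

        // The joined string cannot be split back into its parts unambiguously
        System.out.println(asVarchar(partitionKeys));

        // A map-typed column would expose each value as-is
        System.out.println(partitionKeys.get("col1"));
    }
}
```

A map sidesteps escaping entirely, at the cost of not matching the partition name string that Hive itself uses.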
I think escaping or encoding is needed. Let me update it.
Exactly, the MAP type is also a candidate for this column.
@martint Do you have any opinion?
There might also be :-escaping and canonicalization of the representation to deal with.
But since we know the partition name beforehand, why don't we just pass it here instead of reconstructing it?
Did you consider this approach?
cc @electrum
Replaced with the partitionName provided by the Hive split.
We may still need to add manual escaping in tests.
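For the manual escaping in tests, a helper along these lines could build the expected partition name. This is only a sketch: the class `PartitionNameEscaping`, its methods, and the small set of escaped characters are assumptions for illustration. It uses percent-encoding with uppercase hex, and the real set of characters Hive escapes is larger and is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionNameEscaping
{
    // Assumption: a deliberately small set of special characters for illustration
    private static final String SPECIAL_CHARACTERS = "/=:%";

    // Percent-encodes each special character as %XX (uppercase hex)
    static String escape(String value)
    {
        StringBuilder builder = new StringBuilder();
        for (char c : value.toCharArray()) {
            if (SPECIAL_CHARACTERS.indexOf(c) >= 0) {
                builder.append('%').append(String.format("%02X", (int) c));
            }
            else {
                builder.append(c);
            }
        }
        return builder.toString();
    }

    // Builds name1=value1/name2=value2 with every component escaped
    static String makePartitionName(List<String> names, List<String> values)
    {
        if (names.size() != values.size()) {
            throw new IllegalArgumentException("names and values must have the same size");
        }
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            parts.add(escape(names.get(i)) + "=" + escape(values.get(i)));
        }
        return String.join("/", parts);
    }
}
```

With this sketch, a value such as a/b becomes a%2Fb, so the / separators in the resulting name stay unambiguous.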
dcb8c53 to 57ce58c
LGTM; minor comments.
Please rebase as there are conflicts.
assertEquals(results.getRowCount(), 9);

assertUpdate("DROP TABLE test_partition_hidden_column");
assertFalse(getQueryRunner().tableExists(getSession(), "test_partition_hidden_column"));
nit: drop this assertion
assertEquals(getPartitions("test_partition_hidden_column").size(), 9);

MaterializedResult results = computeActual(format("SELECT *, \"%s\" FROM test_partition_hidden_column", PARTITION_COLUMN_NAME));
for (int i = 0; i < results.getRowCount(); i++) {
You can drop the index from the loop. IMO this looks a bit cleaner:
for (MaterializedRow row : results.getMaterializedRows()) {
    String actualPartition = (String) row.getField(3);
    String expectedPartition = format("col1=%s/col2=%s", row.getField(1), row.getField(2));
    assertEquals(actualPartition, expectedPartition);
}
ColumnMetadata columnMetadata = columnMetadatas.get(i);
assertEquals(columnMetadata.getName(), columnNames.get(i));
if (columnMetadata.getName().equals(PARTITION_COLUMN_NAME)) {
    // $partition should be hidden column
drop comment
57ce58c to 5b6eaa9
Initial support for #5