Add data_block_encoding parameter to Phoenix connector. #4617
if (value == null) {
    return Optional.empty();
}
return Optional.of(value);
Can we use Optional.ofNullable here?
I fixed all the occurrences here, since that seemed right and involves no functional change at all.
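For readers following the thread, the suggested change can be sketched as below. This is a minimal, self-contained illustration and not the connector's actual code: the class name `OptionalLookupSketch`, the method name `getTimeToLiveVerbose`, and the `"ttl"` key value are assumptions made for the example. It shows that the explicit null check and `Optional.ofNullable` return the same result for both a missing and a present value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

import static java.util.Objects.requireNonNull;

public class OptionalLookupSketch
{
    // Assumed property key, used only for this illustration
    static final String TTL = "ttl";

    // Original style from the diff: explicit null check
    static Optional<Integer> getTimeToLiveVerbose(Map<String, Object> tableProperties)
    {
        requireNonNull(tableProperties);
        Integer value = (Integer) tableProperties.get(TTL);
        if (value == null) {
            return Optional.empty();
        }
        return Optional.of(value);
    }

    // Suggested style: Optional.ofNullable maps null to Optional.empty()
    static Optional<Integer> getTimeToLive(Map<String, Object> tableProperties)
    {
        requireNonNull(tableProperties);
        return Optional.ofNullable((Integer) tableProperties.get(TTL));
    }

    public static void main(String[] args)
    {
        Map<String, Object> properties = new HashMap<>();
        // Property absent: both variants yield an empty Optional
        System.out.println(getTimeToLive(properties).equals(getTimeToLiveVerbose(properties)));

        properties.put(TTL, 86400);
        // Property present: both variants wrap the same value
        System.out.println(getTimeToLive(properties).equals(getTimeToLiveVerbose(properties)));
    }
}
```

Because the two variants behave identically, the swap is a pure readability change, which matches the "no functional change" remark above.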
@@ -109,6 +110,11 @@ public PhoenixTableProperties()
TTL,
"Number of seconds for cell TTL. HBase will automatically delete rows once the expiration time is reached.",
null,
false),
stringProperty(
Can we have an enum property for this, like the STORAGE_FORMAT_PROPERTY we define for HiveTableProperties?
In both cases I just followed the existing code. I think if we change that, we should change it in all the other places too, and then we're mixing the concerns of separate PRs.
Let me know if you want me to fix all the other instances as well in this PR or file a separate one.
It would be better if we could fix all the other instances as well. Maybe we can have a separate PR for them?
I fixed the Optional use below. The property list change should be a separate PR IMHO.
Also updated the documentation (including a fix for the Bloomfilter default, which is NONE instead of ROW).
Thanks for working on it
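As a rough illustration of what the enum suggestion in this thread could look like, here is a self-contained sketch. It does not use the engine's property-metadata API and is not the code under review: the `DataBlockEncoding` enum, the `parse` helper, and the error message are all hypothetical. The enum constants are only the encodings named in this review, so the list is not exhaustive.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Optional;

public class DataBlockEncodingSketch
{
    // Hypothetical enum; values are the encodings named in this review, not a complete list
    enum DataBlockEncoding
    {
        NONE, PREFIX, DIFF, FAST_DIFF, ROW_INDEX_V1
    }

    // Converts a user-supplied string into the enum, rejecting unknown values early
    static Optional<DataBlockEncoding> parse(String value)
    {
        if (value == null) {
            // Property not set: let the underlying system apply its own default
            return Optional.empty();
        }
        try {
            return Optional.of(DataBlockEncoding.valueOf(value.trim().toUpperCase(Locale.ENGLISH)));
        }
        catch (IllegalArgumentException e) {
            throw new IllegalArgumentException(
                    "Invalid data_block_encoding: " + value
                            + ". Allowed values: " + Arrays.toString(DataBlockEncoding.values()), e);
        }
    }

    public static void main(String[] args)
    {
        System.out.println(parse("fast_diff"));
        System.out.println(parse(null));
    }
}
```

The trade-off discussed above still applies: an enum catches typos at table-creation time, whereas a plain string property passes any value through and leaves validation to the storage layer.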
}

public static Optional<Integer> getTimeToLive(Map<String, Object> tableProperties)
{
    requireNonNull(tableProperties);

    Integer value = (Integer) tableProperties.get(TTL);
    if (value == null) {
Can we capture them in a separate commit?
Then let's do separate PRs. I'll undo this change, and just make sure the new code uses ofNullable.
What about the documentation fixes? Maybe it's best to separate the fix for the Bloomfilter doc as well.
OK... I changed it back to the minimum required to fix the issue at hand. Let's merge this one (assuming it's OK) and then I'll file follow-up PRs for the trivial fixes.
Thanks for working on this !! Looks good to me
false),
stringProperty(
DATA_BLOCK_ENCODING,
"The block encoding algorithm to use for Cells in HBase blocks. Options are: PREFIX, DIFF, FAST_DIFF, ROW_INDEX_V1, and others.",
Did we miss the None and PREFIX_TREE encoding techniques?
Can we add the same in the doc too?
PREFIX_TREE is not finished in HBase, nobody should use it.
And there really is no reason to use PREFIX or DIFF (since FAST_DIFF is better and faster than both).
Yep, None is missing. And yes, it should be consistent with the documentation.
Updated.
I'll fix the test failure. Phoenix currently happens to set the HBase DATA_BLOCK_ENCODING to FAST_DIFF, but HBase has no default (it's NONE).
Good to go, @Praveen2112?
Merged !! Thanks for raising this PR
Thanks @Praveen2112
DATA_BLOCK_ENCODING is a common parameter that Phoenix users might want to set (along with Compression), so we should add this.
(There are in fact a number of other HBase column-family-level options that Phoenix also supports, but most of them are uncommon and should probably be set directly within Phoenix.)