Add hooks for selecting the set of files for a table scan; also add an option for empty string -> null conversion #68
a6dd5a3 to 0adc99b
including a filesystem prefix.
def setHadoopFileSelector(hadoopFileSelector: Option[HadoopFileSelector]): Unit = {
  this.hadoopFileSelector = hadoopFileSelector
}
Add doc strings for these new public methods.
Added.
6f39327 to dcbe683
def setHadoopFileSelector(hadoopFileSelector: Option[HadoopFileSelector]): Unit = {
  this.hadoopFileSelector = hadoopFileSelector
}
Consider setHadoopFileSelector(hadoopFileSelector: HadoopFileSelector)
and unsetHadoopFileSelector(): Unit = { hadoopFileSelector = None }
Why? One method means less code to write and maintain.
It just looks a little odd to me to set using an Option (i.e. to setHadoopFileSelector(maybeAHadoopFileSelector)) instead of to set with an actual instance and to explicitly clear instead of to set to None. I guess what I am saying is that it makes sense for the underlying this.hadoopFileSelector to be an Option (maybe there, maybe not), but that when setting or removing the hadoopFileSelector the caller of the method(s) would naturally have a concrete idea of what should be done, and wrapping that concreteness in a maybe doesn't make obvious sense or improve the readability at the callsite of the set/unset.
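To make the suggestion concrete, here is a minimal sketch of the set/unset shape. The HadoopFileSelector trait below is a hypothetical stand-in (its selectFiles method is an assumption, not the PR's actual interface), and SelectorHolder is an illustrative name for whichever class owns the field.

```scala
// Hypothetical stand-in: the real HadoopFileSelector in this PR may look different.
trait HadoopFileSelector {
  def selectFiles(tableName: String): Option[Seq[String]]
}

// Illustrative owner of the field; the name is an assumption.
class SelectorHolder {
  // Internal state stays an Option: a selector may or may not be configured.
  private var hadoopFileSelector: Option[HadoopFileSelector] = None

  // The caller passes a concrete instance; the wrapping happens here.
  def setHadoopFileSelector(selector: HadoopFileSelector): Unit = {
    hadoopFileSelector = Some(selector)
  }

  // Clearing is an explicit, separately named operation.
  def unsetHadoopFileSelector(): Unit = {
    hadoopFileSelector = None
  }

  // Read access for callers that need to know whether a selector is set.
  def getHadoopFileSelector: Option[HadoopFileSelector] = hadoopFileSelector
}
```

With this shape the callsite reads as holder.setHadoopFileSelector(mySelector) or holder.unsetHadoopFileSelector(), so the intent is visible without inspecting an Option value.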
Done.
} else {
  (value: Any, row: MutableRow, ordinal: Int) =>
    row.setString(ordinal, oi.getPrimitiveJavaObject(value).getValue)
}
You could also separate this into two cases, which may make the code maintenance with upstream changes a little easier.
case oi: HiveVarcharObjectInspector if emptyStringsAsNulls => ...
case oi: HiveVarcharObjectInspector =>
  (value: Any, row: MutableRow, ordinal: Int) =>
    row.setString(ordinal, oi.getPrimitiveJavaObject(value).getValue)
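As a self-contained illustration of the guarded-case pattern, the sketch below uses hypothetical stand-in types (Inspector, VarcharInspector, OtherInspector, Unwrappers) rather than the real Hive object inspectors or MutableRow. The body of the guarded case is also an assumption, since the comment above elides it: here an empty string is mapped to null.

```scala
// Hypothetical stand-ins; these are not the Hive or Spark types from the diff.
sealed trait Inspector
final class VarcharInspector extends Inspector {
  def getValue(raw: Any): String = raw.toString
}
final class OtherInspector extends Inspector

object Unwrappers {
  // Returns a function mapping a raw value to the string to store (null means "no value").
  def unwrapperFor(inspector: Inspector, emptyStringsAsNulls: Boolean): Any => String =
    inspector match {
      // Guarded case: taken only when the option is enabled.
      case oi: VarcharInspector if emptyStringsAsNulls =>
        (value: Any) => {
          val s = oi.getValue(value)
          if (s.isEmpty) null else s // assumed behaviour for the elided branch
        }
      // Unguarded case: stays identical to the code without the option.
      case oi: VarcharInspector =>
        (value: Any) => oi.getValue(value)
      case _ =>
        (value: Any) => String.valueOf(value)
    }
}
```

Because the unguarded case is untouched, a later change to it from upstream would only conflict with the added guarded case if the two overlap textually.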
Fixed.
Add hooks for selecting the set of files for a table scan; also add an option for empty string -> null conversion
@markhamstra