Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hooks for selecting the set of files for a table scan; also add an option for empty string -> null conversion #68

Merged
merged 12 commits into from
Jul 29, 2015

Conversation

mbautin
Copy link

@mbautin mbautin commented Jul 26, 2015

@mbautin mbautin force-pushed the csd-1.4_file_selection_hooks branch from a6dd5a3 to 0adc99b Compare July 26, 2015 23:45
def setHadoopFileSelector(hadoopFileSelector: Option[HadoopFileSelector]): Unit = {
this.hadoopFileSelector = hadoopFileSelector
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add doc strings for these new public methods.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

@mbautin mbautin force-pushed the csd-1.4_file_selection_hooks branch from 6f39327 to dcbe683 Compare July 27, 2015 17:26

def setHadoopFileSelector(hadoopFileSelector: Option[HadoopFileSelector]): Unit = {
this.hadoopFileSelector = hadoopFileSelector
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider setHadoopFileSelector(hadoopFileSelector: HadoopFileSelector) and unsetHadoopFileSelector(): Unit { hadoopFileSelector = None }

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? One method means less code to write and maintain.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just looks a little odd to me to set using an Option -- i.e. to setHadoopFileSelector(maybeAHadoopFileSelector) -- instead of to set with an actual instance and to explicitly clear instead of to set to None. I guess what I am saying is that it makes sense for the underlying this.hadoopFileSelector to be an Option (maybe there, maybe not), but that when setting or removing the hadoopFileSelector the caller of the method(s) would naturally have a concrete idea of what should be done and wrapping that concreteness in a maybe doesn't make obvious sense or improve the readability at the callsite of the set/unset.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@mbautin mbautin changed the title Add hooks for selecting the set of files for a table scan Add hooks for selecting the set of files for a table scan; also add an option for empty string -> null conversion Jul 29, 2015
} else {
(value: Any, row: MutableRow, ordinal: Int) =>
row.setString(ordinal, oi.getPrimitiveJavaObject(value).getValue)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also separate this into two cases, which may make the code maintenance with upstream changes a little easier.

case oi: HiveVarcharObjectInspector if emptyStringsAsNulls => ...
case oi: HiveVarcharObjectInspector =>
  (value: Any, row: MutableRow, ordinal: Int) =>
    row.setString(ordinal, oi.getPrimitiveJavaObject(value).getValue)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

markhamstra added a commit that referenced this pull request Jul 29, 2015
Add hooks for selecting the set of files for a table scan; also add an option for empty string -> null conversion
@markhamstra markhamstra merged commit 72be208 into alteryx:csd-1.4 Jul 29, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants