Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-3965 [Java] JDBC-To-Arrow Configuration #3133

Closed
wants to merge 15 commits into from
Closed

ARROW-3965 [Java] JDBC-To-Arrow Configuration #3133

wants to merge 15 commits into from

Conversation

mikepigott
Copy link
Contributor

https://issues.apache.org/jira/browse/ARROW-3965

This creates an object which configures the BaseAllocator and Calendar used during to configure the translation from a JDBC ResultSet to an Arrow vector.

@mikepigott mikepigott changed the title JDBC-To-Arrow Configuration ARROW-3965 [Java] JDBC-To-Arrow Configuration Dec 9, 2018
@wesm
Copy link
Member

wesm commented Dec 14, 2018

@siddharthteotia @atuldambalkar can you review this?

* @param calendar the calendar to set.
* @exception NullPointerExeption if <code>calendar</code> is <code>null</code>.
*/
public JdbcToArrowConfig setCalendar(Calendar calendar) {
Copy link
Contributor

@praveenbingo praveenbingo Jan 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would prefer a builder instead..so that this pojo is immutable..

*
* @return Whether this configuration is valid.
*/
public boolean isValid() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a builder would ensure this object is always valid after construction..reducing the need to validate it everywhere..

* @throws SQLException on error
*/
public static void jdbcToArrowVectors(ResultSet rs, VectorSchemaRoot root, Calendar calendar)
throws SQLException, IOException {

Preconditions.checkNotNull(rs, "JDBC ResultSet object can't be null");
Preconditions.checkNotNull(root, "JDBC ResultSet object can't be null");
Preconditions.checkNotNull(root, "Vector Schema cannot be null");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:)

@mikepigott
Copy link
Contributor Author

Thanks for the feedback! I'll try to get the changes to you by tomorrow.

@mikepigott
Copy link
Contributor Author

@praveenbingo - I just added the builder you requested. Please let me know if it looks like what you were expecting.

@codecov-io
Copy link

Codecov Report

Merging #3133 into master will increase coverage by 1.14%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3133      +/-   ##
==========================================
+ Coverage   88.03%   89.17%   +1.14%     
==========================================
  Files         635      410     -225     
  Lines       80173    51217   -28956     
  Branches     1069        0    -1069     
==========================================
- Hits        70582    45675   -24907     
+ Misses       9476     5542    -3934     
+ Partials      115        0     -115
Impacted Files Coverage Δ
cpp/src/arrow/csv/reader.h 0% <0%> (-100%) ⬇️
cpp/src/arrow/csv/reader.cc 0.53% <0%> (-90.38%) ⬇️
cpp/src/arrow/adapters/orc/adapter.cc 0.26% <0%> (-73.04%) ⬇️
cpp/src/plasma/eviction_policy.cc 59.01% <0%> (-40.99%) ⬇️
cpp/src/plasma/events.cc 52.5% <0%> (-35%) ⬇️
cpp/src/arrow/csv/options.h 66.66% <0%> (-33.34%) ⬇️
cpp/src/plasma/thirdparty/ae/ae.c 41.7% <0%> (-29.39%) ⬇️
cpp/src/parquet/column_scanner.cc 59.09% <0%> (-22.73%) ⬇️
cpp/src/arrow/vendored/xxhash/xxhash.c 51.26% <0%> (-22.34%) ⬇️
cpp/src/plasma/client.cc 77.99% <0%> (-17.55%) ⬇️
... and 273 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 072200f...be95426. Read the comment docs.

Copy link
Contributor

@praveenbingo praveenbingo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 LGTM. Thanks for doing this.

@praveenbingo
Copy link
Contributor

@wesm @siddharthteotia - could you please help Mike with merging this..

Copy link
Member

@wesm wesm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. Thanks @praveenbingo for reviewing

@wesm wesm closed this in 8e19532 Jan 30, 2019
xhochy pushed a commit that referenced this pull request Feb 6, 2019
https://issues.apache.org/jira/browse/ARROW-3966

This change includes #3133, and supports a new configuration item called "Include Metadata."  If true, metadata from the JDBC ResultSetMetaData object is pulled along to the Schema Field Metadata.  For now, this includes:
* Catalog Name
* Table Name
* Column Name
* Column Type Name

Author: Mike Pigott <mpigott@gmail.com>
Author: Michael Pigott <mikepigott@users.noreply.github.com>

Closes #3134 from mikepigott/jdbc-column-metadata and squashes the following commits:

02f2f34 <Mike Pigott> ARROW-3966: Picking up lost change to support null calendars.
7049c36 <Mike Pigott> Merge branch 'master' into jdbc-column-metadata
e9a9b2b <Michael Pigott> Merge pull request #6 from apache/master
65741a9 <Mike Pigott> ARROW-3966: Code review feedback
cc6cc88 <Mike Pigott> ARROW-3966: Using a 1:N loop instead of a 0:N-1 loop for fewer index offsets in code.
cfb2ba6 <Mike Pigott> ARROW-3966: Using a helper method for building a UTC calendar with root locale.
2928513 <Mike Pigott> ARROW-3966: Moving the metadata flag assignment into the builder.
69022c2 <Mike Pigott> ARROW-3966: Fixing merge.
4a6de86 <Mike Pigott> Merge branch 'master' into jdbc-column-metadata
509a1cc <Michael Pigott> Merge pull request #5 from apache/master
789c8c8 <Michael Pigott> Merge pull request #4 from apache/master
e5b19ee <Michael Pigott> Merge pull request #3 from apache/master
3b17c29 <Michael Pigott> Merge pull request #2 from apache/master
d847ebc <Mike Pigott> Fixing file location
1ceac9e <Mike Pigott> Merge branch 'master' into jdbc-column-metadata
881c6c8 <Michael Pigott> Merge pull request #1 from apache/master
03091a8 <Mike Pigott> Unit tests for including result set metadata.
72d64cc <Mike Pigott> Affirming the field metadata is empty when the configuration excludes field metadata.
7b4527c <Mike Pigott> Test for the include-metadata flag in the configuration.
7e9ce37 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata
bb3165b <Mike Pigott> Updating the function calls to use the JdbcToArrowConfig versions.
a6fb1be <Mike Pigott> Fixing function call
5bfd6a2 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata
68c91e7 <Mike Pigott> Modifying the jdbcToArrowSchema and jdbcToArrowVectors methods to receive JdbcToArrowConfig objects.
b5b0cb1 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata
8d6cf00 <Mike Pigott> Documentation for public static VectorSchemaRoot sqlToArrow(Connection connection, String query, JdbcToArrowConfig config)
4f1260c <Mike Pigott> Adding documentation for public static VectorSchemaRoot sqlToArrow(ResultSet resultSet, JdbcToArrowConfig config)
e34a9e7 <Mike Pigott> Fixing formatting.
fe097c8 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata
df632e3 <Mike Pigott> Updating the SQL tests to include JdbcToArrowConfig versions.
b270044 <Mike Pigott> Updated validaton & documentation, and unit tests for the new JdbcToArrowConfig.
da77cbe <Mike Pigott> Creating a configuration class for the JDBC-to-Arrow converter.
a78c770 <Mike Pigott> Updating Javadocs.
523387f <Mike Pigott> Updating the API to support an optional 'includeMetadata' field.
5af1b5b <Mike Pigott> Separating out the field-type creation from the field creation.
xhochy pushed a commit that referenced this pull request Feb 8, 2019
https://issues.apache.org/jira/browse/ARROW-3966

This change includes #3133, and supports a new configuration item called "Include Metadata."  If true, metadata from the JDBC ResultSetMetaData object is pulled along to the Schema Field Metadata.  For now, this includes:
* Catalog Name
* Table Name
* Column Name
* Column Type Name

Author: Mike Pigott <mpigott@gmail.com>
Author: Michael Pigott <mikepigott@users.noreply.github.com>

Closes #3134 from mikepigott/jdbc-column-metadata and squashes the following commits:

02f2f34 <Mike Pigott> ARROW-3966: Picking up lost change to support null calendars.
7049c36 <Mike Pigott> Merge branch 'master' into jdbc-column-metadata
e9a9b2b <Michael Pigott> Merge pull request #6 from apache/master
65741a9 <Mike Pigott> ARROW-3966: Code review feedback
cc6cc88 <Mike Pigott> ARROW-3966: Using a 1:N loop instead of a 0:N-1 loop for fewer index offsets in code.
cfb2ba6 <Mike Pigott> ARROW-3966: Using a helper method for building a UTC calendar with root locale.
2928513 <Mike Pigott> ARROW-3966: Moving the metadata flag assignment into the builder.
69022c2 <Mike Pigott> ARROW-3966: Fixing merge.
4a6de86 <Mike Pigott> Merge branch 'master' into jdbc-column-metadata
509a1cc <Michael Pigott> Merge pull request #5 from apache/master
789c8c8 <Michael Pigott> Merge pull request #4 from apache/master
e5b19ee <Michael Pigott> Merge pull request #3 from apache/master
3b17c29 <Michael Pigott> Merge pull request #2 from apache/master
d847ebc <Mike Pigott> Fixing file location
1ceac9e <Mike Pigott> Merge branch 'master' into jdbc-column-metadata
881c6c8 <Michael Pigott> Merge pull request #1 from apache/master
03091a8 <Mike Pigott> Unit tests for including result set metadata.
72d64cc <Mike Pigott> Affirming the field metadata is empty when the configuration excludes field metadata.
7b4527c <Mike Pigott> Test for the include-metadata flag in the configuration.
7e9ce37 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata
bb3165b <Mike Pigott> Updating the function calls to use the JdbcToArrowConfig versions.
a6fb1be <Mike Pigott> Fixing function call
5bfd6a2 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata
68c91e7 <Mike Pigott> Modifying the jdbcToArrowSchema and jdbcToArrowVectors methods to receive JdbcToArrowConfig objects.
b5b0cb1 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata
8d6cf00 <Mike Pigott> Documentation for public static VectorSchemaRoot sqlToArrow(Connection connection, String query, JdbcToArrowConfig config)
4f1260c <Mike Pigott> Adding documentation for public static VectorSchemaRoot sqlToArrow(ResultSet resultSet, JdbcToArrowConfig config)
e34a9e7 <Mike Pigott> Fixing formatting.
fe097c8 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata
df632e3 <Mike Pigott> Updating the SQL tests to include JdbcToArrowConfig versions.
b270044 <Mike Pigott> Updated validaton & documentation, and unit tests for the new JdbcToArrowConfig.
da77cbe <Mike Pigott> Creating a configuration class for the JDBC-to-Arrow converter.
a78c770 <Mike Pigott> Updating Javadocs.
523387f <Mike Pigott> Updating the API to support an optional 'includeMetadata' field.
5af1b5b <Mike Pigott> Separating out the field-type creation from the field creation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants