This repository has been archived by the owner on Dec 20, 2018. It is now read-only.

Spark 2.2.0 Support #242

Merged
merged 8 commits into from
Aug 17, 2017

Conversation

@squito (Contributor) commented Jul 18, 2017

This adds support for Spark 2.2.0. Primarily this addresses the API change introduced by SPARK-19085 (apache/spark@b3d3962). It fixes the issue in the simplest way: it copies the old conversion from InternalRow to Row. A better implementation would do something more efficient with InternalRow directly.

This keeps both write() methods, so it should be compatible with all Spark 2.x versions.

It also fixes some dependency conflicts when running tests -- it seems curator pulls in a version of guava that conflicts with hadoop's, but we don't actually need curator for tests.
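A dependency fix of that kind might look roughly like the following build.sbt fragment. This is a hypothetical sketch, not the actual change from this PR -- the artifact name and version are assumptions for illustration:

```scala
// Hypothetical build.sbt sketch: exclude curator from the hadoop test
// dependency so its bundled guava version cannot conflict with hadoop's.
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.6.0" % Test excludeAll (
  ExclusionRule(organization = "org.apache.curator")
)
```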

Tested by running unit tests locally.

Fixes #240
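The dual write() approach described above can be sketched as follows. The class and helper names here are hypothetical; the InternalRow-to-Row conversion mirrors what Spark 2.1's OutputWriter did internally before SPARK-19085, assuming the Catalyst converter API:

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.{CatalystTypeConverters, InternalRow}
import org.apache.spark.sql.types.StructType

// Hypothetical sketch of an output writer supporting both call sites:
// Spark <= 2.1 calls write(Row); Spark 2.2+ (SPARK-19085) calls write(InternalRow).
class AvroOutputWriterSketch(dataSchema: StructType) {
  // Converts a Catalyst InternalRow to an external Row -- the same conversion
  // Spark 2.1's OutputWriter.writeInternal performed before delegating.
  private lazy val toExternalRow =
    CatalystTypeConverters.createToScalaConverter(dataSchema)
      .asInstanceOf[InternalRow => Row]

  // Pre-2.2 entry point: serialize the external Row to Avro (body omitted).
  def write(row: Row): Unit = {
    // ... existing Avro serialization logic ...
  }

  // 2.2+ entry point: convert, then reuse the old path. This is the
  // "simplest way" mentioned in the description; a more efficient version
  // would consume InternalRow directly.
  def write(row: InternalRow): Unit = write(toExternalRow(row))
}
```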

@squito (Contributor, Author) commented Jul 19, 2017

Hmm, I guess the Travis configuration needs to be changed to build with JDK 8.

@codecov-io commented Jul 19, 2017

Codecov Report

Merging #242 into master will increase coverage by 0.05%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #242      +/-   ##
==========================================
+ Coverage    90.4%   90.46%   +0.05%     
==========================================
  Files           5        5              
  Lines         323      325       +2     
  Branches       49       51       +2     
==========================================
+ Hits          292      294       +2     
  Misses         31       31

@marcintustin

Would be wonderful if this could be merged

@marcintustin

For now I'm running this with jitpack :(

@rxin (Contributor) commented Aug 4, 2017

It is currently failing codecov ...

@ritesh-dineout

Is there any timeline for when this will be merged and released? We started facing this issue in production because AWS EMR upgraded to Spark 2.2.

@squito (Contributor, Author) commented Aug 15, 2017

Code coverage is now passing.

The added test is pretty lame -- it's there just to satisfy code coverage. The right thing to do is probably to start testing against Spark 2.2, but that requires reconfiguring Travis to use Java 1.8, and I don't have permissions for that.

@vanzin commented Aug 15, 2017

Isn't all the Travis configuration in https://github.com/databricks/spark-avro/blob/branch-3.2/.travis.yml?

@squito (Contributor, Author) commented Aug 15, 2017

Oh, good point. I actually meant the code coverage configuration, but taking a closer look at the build process, I guess that is automatic from Travis. Let's see what coverage I get from the latest changes.

@rxin (Contributor) commented Aug 17, 2017

Alright, thanks.

@rxin rxin merged commit 204864b into databricks:master Aug 17, 2017
@vara-bonthu

Just checking if this is expected to be released as 3.3.0 some time today? Thanks

@tgravescs

Just curious, when is 3.3 expected to be released?

@praneetsharma commented Aug 29, 2017

Hi, I have a naive question about the InternalRow to Row conversion. Since this conversion happens for every row, can it cause performance degradation for millions of records, compared to spark-avro 3.2.1 (with Spark 2.1.0)?

@rxin rxin mentioned this pull request Sep 1, 2017
@febinsathar

When will 3.3 be released? Thank you.

@gengliangwang (Contributor)

@squito @marcintustin @ritesh-dineout @febinsathar spark-avro 4.0.0 is released :)
