Error while using spark-redshift jar #315
Comments
Which version of Spark are you using? If you're using 2.1.x then I suspect that changes to internal APIs may have broken things.
Actually, looking a little more closely, this problem relates to an API that changed between Spark versions. Thus: are you using a newer version of spark-avro?
I'm getting the same exception with a different stack trace, and only when I switch from Spark 2.0.1 to Spark 2.1.0 (Hadoop 2.7, Mesos, spark-redshift_2.11-2.0.1.jar, RedshiftJDBC41-1.1.17.1017.jar).
I'm getting this error as well with Spark 2.1.0. I've also tried using 3.0.0-preview1 of this library; previously I was using 2.0.0.
Edit: Here's a slightly bigger stack trace that may help.
@JoshRosen Any plans to make a new release soon? It seems like one is needed to use this with 2.1.0.
@JoshRosen Hit the same issue: after upgrading from Spark 2.0.2 to Spark 2.1.0, our pipeline started throwing exceptions with the same cause.
We are using spark-redshift 2.0.1 with https://s3.amazonaws.com/redshift-downloads/drivers/RedshiftJDBC41-1.1.17.1017.jar
@elyast I hit the same issue using Spark 2.1.0, and asked this question on Stack Overflow. Do you have the same issue with Spark 2.0.2? I'm not able to make spark-redshift work in 2.0.2 either; any help would be useful.
Found the root cause: Spark 2.1 added a new method (getFileExtension on OutputWriterFactory, per the stack traces in this thread) which is not implemented in spark-avro, hence the AbstractMethodError.
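For context on why this only blows up at runtime, here is a purely illustrative Scala sketch. The class and method names are invented and do not match Spark's real API; the point is that a jar compiled against an older abstract class never gets bytecode for a method added later, so the JVM raises AbstractMethodError only when that new method is actually invoked.

```scala
// Illustrative sketch -- names are invented and do NOT match Spark's API.
// In "version 2.0" of a framework the abstract class had only one method:
//   abstract class WriterFactory { def newWriter(path: String): String }
// "Version 2.1" adds a second abstract method:
abstract class WriterFactory {
  def newWriter(path: String): String
  def fileExtension(): String // newly added abstract method
}

// A plugin recompiled against "2.1" implements both methods and works fine.
// A plugin jar built against "2.0" contains no fileExtension() bytecode, so a
// "2.1" framework calling factory.fileExtension() throws AbstractMethodError
// at that call site -- which is the mixed-binaries situation in this issue.
class AvroWriterFactory extends WriterFactory {
  def newWriter(path: String): String = s"writing Avro to $path"
  def fileExtension(): String = ".avro"
}

object Demo extends App {
  val factory: WriterFactory = new AvroWriterFactory
  println(factory.newWriter("s3://bucket/tmp"))
  println(factory.fileExtension())
}
```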
Ran into the same issue with Spark 2.1.0. Is there a workaround (besides bumping the Spark version down)?
@apurva-sharma You can build this patch, databricks/spark-avro#206, and replace the spark-avro dependency with the custom version; at least it worked for us.
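For anyone who hasn't built it locally before, something along these lines should work; the PR ref, branch name, and publishing step are assumptions, so adjust to taste:

```
# Sketch of building the patched spark-avro locally (commands are indicative only)
git clone https://github.com/databricks/spark-avro.git
cd spark-avro
git fetch origin pull/206/head:pr-206   # fetch the patch from PR 206
git checkout pr-206
sbt +publishLocal                       # publish the patched build to the local Ivy repo
# then depend on the locally published spark-avro version in your own build
```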
@elyast Thanks for that. I can verify that monkey-patching spark-avro as above worked for me with Spark 2.1.0.
Looks like spark-avro was fixed. Any updates here?
Any updates on when this issue will be fixed?
Atm this driver is completely unusable...
Fixed mine by adding a spark-avro override line to my sbt project's build.sbt, something along the lines of the sketch below.
I am using spark-redshift 3, btw. Hopefully this library can be actively supported in the long run; it looks like it has not been updated for several months.
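A sketch of the kind of build.sbt override being described, assuming the spark-avro 3.2.0 fix discussed later in this thread; the version number is an assumption:

```scala
// build.sbt -- sketch only; version is an assumption based on this thread.
// Force sbt to resolve a spark-avro release that implements the new method.
dependencyOverrides += "com.databricks" %% "spark-avro" % "3.2.0"
```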
I've tried what @hnfmr suggests, but I am still running into this issue. |
@mrdmnd To be specific, I am using spark-redshift v3.0.0-preview1, and my build.sbt includes the spark-avro override.
BTW, I am using Spark 2.1.0... hope this helps |
@elyast Can you please describe what you did? My guess is that you replaced the spark-avro dependency with the patched build, but I'd like to confirm.
Thank you!
Also seeing this issue here. @hnfmr's fix is working for me now, but it would be nice to have this properly fixed; Spark is a popular tool and Redshift usage is only going to grow. The exact workaround was to add a spark-avro override to my build.sbt file, along the lines of the sketch below:
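Again a hedged sketch rather than a verbatim snippet; an equivalent approach is to exclude the transitive spark-avro pulled in by spark-redshift and declare the fixed version explicitly. Versions are assumptions taken from this thread:

```scala
// build.sbt -- sketch only; versions are assumptions.
libraryDependencies ++= Seq(
  ("com.databricks" %% "spark-redshift" % "3.0.0-preview1")
    .exclude("com.databricks", "spark-avro_2.11"), // drop the transitive spark-avro
  "com.databricks" %% "spark-avro" % "3.2.0"       // use the fixed release instead
)
```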
Yeah, I had a minor typo. Can confirm that this works. |
I use Zeppelin to do ETL to Redshift and encountered the same AbstractMethodError. Configuring the Spark interpreter to exclude the old spark-avro and add the fixed version made the error go away (see the sketch below). Thanks a lot!
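A sketch of the kind of Zeppelin Spark-interpreter dependency settings that comment seems to describe; the coordinates and versions are assumptions from this thread, and the exact fields depend on your Zeppelin version:

```
# Zeppelin > Interpreter > spark > Dependencies (illustrative, not verbatim)
artifact: com.databricks:spark-redshift_2.11:3.0.0-preview1
exclude:  com.databricks:spark-avro_2.11           # keep the old transitive avro off the classpath
artifact: com.databricks:spark-avro_2.11:3.2.0     # add the fixed version explicitly
```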
Yes! Just update or replace spark-avro_2.11-3.1.0.jar with spark-avro_2.11-3.2.0.jar and this problem should be solved now. https://mvnrepository.com/artifact/com.databricks/spark-avro_2.11/3.2.0 |
Hi, I have got the same problem: java.lang.AbstractMethodError: org.apache.spark.sql.execution.datasources.OutputWriterFactory.getFileExtension(Lorg/apache/hadoop/mapreduce/T$
I have the same problem, and I am using code from the Spark branch-2.2. spark-avro was already spark-avro_2.11-3.2.0.jar.
Any updates on this one? It seems that the underlying dependency (spark-avro_2.11-3.2.0) has resolved this issue. Instead of having everyone depend on the workaround, could the owner release a version that depends on spark-avro 3.2.0?
It seems this issue and repo are getting stale; I would love to have this updated. @JoshRosen would it be possible to open this up to new contributors?
Any updates on this? I'm using this through PySpark and am unable to try the workarounds suggested.
Looks like this issue is going to be fixed in the next version of the spark-avro lib: databricks/spark-avro#242. It was merged to master 8 days ago.
Thanks for the hint on updating the spark-avro dependency version. I resolved this issue with a spark-submit command along the lines of the sketch below, in an AWS EMR environment:
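A sketch of what such a spark-submit invocation could look like on EMR; the package versions are taken from this thread, while the JDBC jar path and job file name are invented:

```
spark-submit \
  --packages com.databricks:spark-redshift_2.11:3.0.0-preview1,com.databricks:spark-avro_2.11:3.2.0 \
  --jars /home/hadoop/RedshiftJDBC41-1.1.17.1017.jar \
  my_redshift_etl.py
```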
I've updated to spark-avro 4.0.0 and I still have this issue. |
I have faced the same exception with spark_2.11 v2.1.1. |
I wouldn't expect this to be properly fixed; it seems Databricks has decided not to update this library anymore outside of their own Databricks Runtime, which as far as I can tell requires you to be using their entire platform: 717a4ad#diff-04c6e90faac2675aa89e2176d2eec7d8
@hnfmr Thanks for the hint. I faced the same issue while running an ALS model in GCP and storing the output using com.databricks.spark.csv. Initially I was using com.databricks.spark.csv 1.2.0 with Spark 2.2.0 and the issue occurred. I updated to the latest version, 1.5.0, and that solved my issue.
I was using an older version of spark-redshift_2.11; I changed to 3.0.0-preview1 and it started working.
If anyone is still having issues and wants to collab on this, I forked both this connector and spark-avro; we can get a working group around fixing these.

I think this library & spark-avro are dead from an open-source continued-support/contributor perspective (outside of the Databricks Runtime, 717a4ad#diff-04c6e90faac2675aa89e2176d2eec7d8), as the last commits are 2+ years old aside from README updates. We would be required to adhere to the licensing (Apache 2.0), via NOTICE and some other means.

Love the library @databricks, but many people use this connector and library. I asked if this could be opened up to collaborators and didn't hear a response in over a year. It seems like it was quietly moved to closed source, which is understandable from a business perspective. Thanks for all the initial work on this; it has helped a ton of people out, and I personally have built many ETLs and data analysis tools using this connector.

Cheers,
Hi,
Getting the below error while using the jar to integrate Redshift with Spark locally.
I find that the prepareRead method is not in RedshiftFileFormat.
Thanks & Regards,
Ravi