[BUG] Spark job with 0.8.0 maven jar fails with error commons-logging:commons-logging download failed. #705
Comments
Assigning to @Yuqing-cat as she is working with Azure Support on this.
Hello, please provide a solution to overcome this error.
Hi @Uditanshu1612 ,
Hi @blrchen, can you please tell me how you ran the Feathr feature store POC notebook locally in your PySpark environment?
It looks like a config format problem. Could you share your config file (with sensitive info removed) so that I can help check the root cause?
The local Spark provider is introduced here: https://feathr-ai.github.io/feathr/how-to-guides/local-spark-provider.html
This is the config file I'm using. I'm running it in local PySpark, but it says only Synapse and Databricks are supported when creating a FeathrClient object.
import tempfile
I'm not providing the configuration for S3 and Snowflake because I'm using a SQL DB as the offline store.
Thanks for your info @Uditanshu1612.
Your config seems correct. Please double-check the indentation; you could also print out the parsed config to verify it. To improve the user experience here, a PR has been submitted to refine the error message: #755. Your feedback is important to us :)
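One quick way to catch indentation mistakes is to parse the YAML locally before handing it to the client. A minimal sketch (the section and key names below are illustrative, not the full Feathr config schema):

```python
import yaml  # pip install pyyaml

# Stand-in for the contents of your feathr config file.
config_text = """
project_config:
  project_name: "feathr_getting_started"
spark_config:
  spark_cluster: "databricks"
"""

# yaml.safe_load raises a YAMLError on malformed indentation,
# so a successful parse confirms the file is well-formed YAML.
config = yaml.safe_load(config_text)
print(config["spark_config"]["spark_cluster"])  # → databricks
```

Printing the parsed dictionary also makes it easy to see whether a key ended up nested under the wrong section.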
Thank you @Yuqing-cat, I'll try this, and if I face any issue I'll ping here.
Hello @Yuqing-cat, my azure-core is currently at version 1.26.0, and executing the module with this version throws an error. From the error, it appears the module requires azure-core <= 1.22.1. But if I downgrade to 1.22.1 or lower, I face another error. How can I resolve this and successfully import the module?
Hi @Uditanshu1612 , the azure-core dependency has some known issues that may fail in certain environments, e.g. AML or an Ubuntu VM. We have separate PRs to make it more robust on different platforms, like #763. I highly recommend joining the Feathr Slack channel, where you will get more immediate help and stay updated with the latest notifications: https://join.slack.com/t/feathrai/shared_invite/zt-1hy8m4def-w8w6SYNFxvTAuuihTvohVw
Closing as fixed in 0.9.0 |
Willingness to contribute
Yes. I can contribute a fix for this bug independently.
Feathr version
0.8.0
System information
Describe the problem
Running the NYC driver sample notebook with the 0.8.0 maven jar, the Spark job fails with a DRIVER_LIBRARY_INSTALLATION_FAILURE error.
Note: this error only happens on Azure Databricks. Using the maven jar on Synapse or local PySpark works fine.
This failure occurs because the Databricks runtime's pre-built packages conflict with the Elasticsearch dependencies.
To make it work, users need to exclude the following packages in Databricks:
commons-logging:commons-logging,org.slf4j:slf4j-api,com.google.protobuf:protobuf-java,javax.xml.bind:jaxb-api
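When installing the Feathr jar through the Databricks Libraries API, these exclusions can be expressed in the Maven library spec. A sketch of such a request body (the artifact coordinates shown are an assumption for 0.8.0; adjust them to the jar you actually install):

```json
{
  "maven": {
    "coordinates": "com.linkedin.feathr:feathr_2.12:0.8.0",
    "exclusions": [
      "commons-logging:commons-logging",
      "org.slf4j:slf4j-api",
      "com.google.protobuf:protobuf-java",
      "javax.xml.bind:jaxb-api"
    ]
  }
}
```

The same exclusion list can also be entered in the Databricks UI when adding a Maven library to the cluster.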
Workaround
In the YAML config, add a line for `feathr_runtime_location`; this makes the Spark cluster use runtime jars from Azure Storage (see the last line in the following example).
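A hedged sketch of what that config can look like (the exact section layout depends on your Feathr version, and the storage URL below is a placeholder, not a real artifact location):

```yaml
spark_config:
  spark_cluster: "databricks"
  databricks:
    work_dir: "dbfs:/feathr_getting_started"
    # Point the cluster at a runtime jar hosted on Azure Storage
    # instead of resolving it from Maven at job submission time.
    feathr_runtime_location: "https://<storage-account>.blob.core.windows.net/jars/feathr-assembly-0.8.0.jar"
```

Because the jar is fetched from storage rather than resolved through Maven, the conflicting transitive dependencies are never pulled in by the Databricks library installer.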
Code to reproduce bug
No response
What component(s) does this bug affect?
- Python Client: the client users use to interact with most of our APIs. Mostly written in Python.
- Computation Engine: the computation engine that executes the actual feature join and generation work. Mostly in Scala and Spark.
- Feature Registry API: the frontend API layer supporting SQL and Purview (Atlas) as storage. The API layer is in Python (FastAPI).
- Feature Registry Web UI: the web UI for the feature registry. Written in React.