Commit

Fix protobuf check jar script
Remove config logging from spark plugin
treff7es committed Jul 26, 2024
1 parent 1717a30 commit 242f846
Showing 3 changed files with 11 additions and 7 deletions.
13 changes: 8 additions & 5 deletions metadata-integration/java/acryl-spark-lineage/README.md
@@ -24,15 +24,15 @@ When running jobs using spark-submit, the agent needs to be configured in the co

```text
#Configuring DataHub spark agent jar
-spark.jars.packages io.acryl:acryl-spark-lineage:0.2.15
+spark.jars.packages io.acryl:acryl-spark-lineage:0.2.16
spark.extraListeners datahub.spark.DatahubSparkListener
spark.datahub.rest.server http://localhost:8080
```

## spark-submit command line

```sh
-spark-submit --packages io.acryl:acryl-spark-lineage:0.2.15 --conf "spark.extraListeners=datahub.spark.DatahubSparkListener" my_spark_job_to_run.py
+spark-submit --packages io.acryl:acryl-spark-lineage:0.2.16 --conf "spark.extraListeners=datahub.spark.DatahubSparkListener" my_spark_job_to_run.py
```

### Configuration Instructions: Amazon EMR
@@ -41,7 +41,7 @@ Set the following spark-defaults configuration properties as it
stated [here](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark-configure.html)

```text
-spark.jars.packages io.acryl:acryl-spark-lineage:0.2.15
+spark.jars.packages io.acryl:acryl-spark-lineage:0.2.16
spark.extraListeners datahub.spark.DatahubSparkListener
spark.datahub.rest.server https://your_datahub_host/gms
#If you have authentication set up then you also need to specify the Datahub access token
@@ -56,7 +56,7 @@ When running interactive jobs from a notebook, the listener can be configured wh
spark = SparkSession.builder
.master("spark://spark-master:7077")
.appName("test-application")
-.config("spark.jars.packages", "io.acryl:acryl-spark-lineage:0.2.15")
+.config("spark.jars.packages", "io.acryl:acryl-spark-lineage:0.2.16")
.config("spark.extraListeners", "datahub.spark.DatahubSparkListener")
.config("spark.datahub.rest.server", "http://localhost:8080")
.enableHiveSupport()
@@ -79,7 +79,7 @@ appName("test-application")
config("spark.master","spark://spark-master:7077")
.

-config("spark.jars.packages","io.acryl:acryl-spark-lineage:0.2.13")
+config("spark.jars.packages","io.acryl:acryl-spark-lineage:0.2.16")
.

config("spark.extraListeners","datahub.spark.DatahubSparkListener")
@@ -356,6 +356,9 @@ Use Java 8 to build the project. The project uses Gradle as the build tool. To b
+
## Changelog
### Version 0.2.16
- Remove logging DataHub config into logs
### Version 0.2.15
- Add Kafka emitter to emit lineage to kafka
- Add File emitter to emit lineage to file
@@ -257,7 +257,6 @@ private synchronized SparkLineageConf loadDatahubConfig(
this.appContext.setDatabricksTags(databricksTags.orElse(null));
}

-log.info("Datahub configuration: {}", datahubConf.root().render());
Optional<DatahubEmitterConfig> emitterConfig = initializeEmitter(datahubConf);
SparkLineageConf sparkLineageConf =
SparkLineageConf.toSparkLineageConf(datahubConf, appContext, emitterConfig.orElse(null));
@@ -39,7 +39,9 @@ jar -tvf $jarFile |\
grep -v "darwin" |\
grep -v "MetadataChangeProposal.avsc" |\
grep -v "aix" |\
-grep -v "com/sun/"
+grep -v "com/sun/" |\
+grep -v "VersionInfo.java" |\
+grep -v "mime.types"

if [ $? -ne 0 ]; then
echo "✅ No unexpected class paths found in ${jarFile}"
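
The `$?` check in this script relies on `grep -v` exiting non-zero when it prints no lines, so a non-zero status from the last stage of the pipeline means every jar entry matched an allowed pattern. A minimal sketch of that behaviour, assuming `pipefail` is not set; the `check_entries` helper and the entry names are made up for illustration and are not part of the real script:

```shell
#!/bin/sh
# Hypothetical helper: reads jar entry names on stdin and drops the allowed ones.
# The function's status is that of the last grep in the pipeline, which is 1
# when nothing survives the filters and 0 when at least one line is printed.
check_entries() {
  grep -v "com/sun/" |
    grep -v "VersionInfo.java" |
    grep -v "mime.types"
}

# Every entry is on the allow list: nothing is printed, status is non-zero.
printf '%s\n' "com/sun/Foo.class" "mime.types" | check_entries
all_allowed_status=$?

# One unexpected entry survives the filters: it is printed, status is zero.
unexpected=$(printf '%s\n' "com/sun/Foo.class" "org/example/Leak.class" | check_entries)
unexpected_status=$?
```

This is why adding `VersionInfo.java` and `mime.types` to the filter chain fixes the check: entries that match them no longer reach the end of the pipeline and no longer count as unexpected.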
