[Bug]: Can not use Avro over 1.8.2 with Beam 2.52.0 #29413

Closed
1 of 16 tasks
bvolpato opened this issue Nov 13, 2023 · 6 comments

Comments

@bvolpato
Contributor

What happened?

After #27851, user code that depends on Avro versions newer than 1.8.2 fails to run on Dataflow.

For example, in https://github.com/GoogleCloudPlatform/DataflowTemplates, where we moved to Avro 1.11.3, we saw incompatibility errors:

Caused by: java.io.InvalidClassException: org.apache.avro.specific.SpecificRecordBase; local class incompatible: stream classdesc serialVersionUID = -1463700717714793795, local class serialVersionUID = 189988654766568477

and

Caused by: java.lang.NoSuchMethodError: 'boolean org.apache.avro.generic.GenericRecord.hasField(java.lang.String)'
    com.google.cloud.teleport.v2.transforms.FormatDatastreamRecordToJson.getMetadataIsDeleted(FormatDatastreamRecordToJson.java:258)
    com.google.cloud.teleport.v2.transforms.FormatDatastreamRecordToJson.apply(FormatDatastreamRecordToJson.java:123)
    com.google.cloud.teleport.v2.transforms.FormatDatastreamRecordToJson.apply(FormatDatastreamRecordToJson.java:51)
    org.apache.beam.sdk.extensions.avro.io.AvroSource$AvroBlock.readNextRecord(AvroSource.java:610)
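
Both errors above can in principle be detected before a pipeline runs by probing the classes that are actually visible on the classpath. The following is a minimal sketch, not part of Beam or Avro: `AvroCompatProbe`, `localSerialVersionUid`, and `hasPublicMethod` are hypothetical names, and only standard JDK reflection calls are used. The Avro class and method names are the ones taken from the error messages.

```java
import java.io.ObjectStreamClass;

/** Hypothetical startup probe (illustration only, not a Beam or Avro API). */
class AvroCompatProbe {

  /**
   * Returns the serialVersionUID of the named class as resolved on the local
   * classpath, or null if the class is missing or not serializable.
   */
  static Long localSerialVersionUid(String className) {
    try {
      // Load without running static initializers; we only need the class metadata here.
      Class<?> cls = Class.forName(className, false, AvroCompatProbe.class.getClassLoader());
      ObjectStreamClass desc = ObjectStreamClass.lookup(cls);
      return desc == null ? null : desc.getSerialVersionUID();
    } catch (ClassNotFoundException e) {
      return null;
    }
  }

  /** Returns true if the named class is present and exposes the given public method. */
  static boolean hasPublicMethod(String className, String method, Class<?>... params) {
    try {
      Class<?> cls = Class.forName(className, false, AvroCompatProbe.class.getClassLoader());
      cls.getMethod(method, params);
      return true;
    } catch (ClassNotFoundException | NoSuchMethodException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Compare this value with the "stream classdesc serialVersionUID" in the exception.
    System.out.println("SpecificRecordBase serialVersionUID: "
        + localSerialVersionUid("org.apache.avro.specific.SpecificRecordBase"));
    // False means the Avro version that won on the classpath lacks hasField(String).
    System.out.println("GenericRecord.hasField(String) present: "
        + hasPublicMethod("org.apache.avro.generic.GenericRecord", "hasField", String.class));
  }
}
```

Running this with the same classpath as the worker should show which Avro version is effectively in use; without Avro on the classpath it prints `null` and `false`.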

The root cause is that Avro classes are now being shipped inside /opt/apache/beam/jars/beam-sdks-java-harness.jar, which wasn't the case before.

I tried to relocate the Avro classes in #29407, but got some test failures.
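
One way to confirm where a class is coming from is to ask the JVM for its code source. This is a small diagnostic sketch using only standard JDK calls; `ClassOrigin` and `locate` are hypothetical names introduced for illustration, and the Avro class name is the one from the stack trace above.

```java
import java.security.CodeSource;
import java.security.ProtectionDomain;

/** Hypothetical diagnostic helper (illustration only). */
class ClassOrigin {

  /** Returns the JAR or directory a class was loaded from, or a placeholder if unknown. */
  static String locate(Class<?> cls) {
    ProtectionDomain domain = cls.getProtectionDomain();
    CodeSource source = domain == null ? null : domain.getCodeSource();
    if (source == null || source.getLocation() == null) {
      // Core JDK classes such as java.lang.String have no code source.
      return "<bootstrap or unknown>";
    }
    return source.getLocation().toString();
  }

  /** Same as above, but resolves the class by name first. */
  static String locate(String className) {
    try {
      return locate(Class.forName(className, false, ClassOrigin.class.getClassLoader()));
    } catch (ClassNotFoundException e) {
      return "<not on classpath>";
    }
  }

  public static void main(String[] args) {
    // On an affected worker this would be expected to point at the harness JAR
    // rather than the user's Avro JAR.
    System.out.println(locate("org.apache.avro.generic.GenericRecord"));
  }
}
```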

Next step is marking Avro as provided in that JAR, since it's apparently not used.

Issue Priority

Priority: 1 (data loss / total loss of function)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
@bvolpato
Contributor Author

bvolpato commented Nov 13, 2023

@damccorm I would like to make this a blocker for 2.52.0 (sent an email to dev list https://lists.apache.org/thread/mryy2x4m3f0n1dy3k22fj03rom8x0pot)

@damccorm damccorm added this to the 2.52.0 Release milestone Nov 13, 2023
@aromanenko-dev
Contributor

aromanenko-dev commented Nov 13, 2023

I'm wondering if there are any tests that are failing because of this?

#27851 was merged about a month ago, and I didn't notice any complaints or issues about it even though it was announced on the dev@ mailing list.

@bvolpato
Contributor Author

I think we should be able to introduce an integration test that relies on a newer Avro version. I can look into it.

The problem here is that within Beam itself everything is on Avro 1.8.2, so there's no conflict. The issues appeared when we ran user code that depends on 1.11.3 alongside the harness JAR containing 1.8.2, and the two versions conflicted (on the Dataflow Runner this always fails, because the runner adds the harness at the beginning of the java -cp).
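
To see this kind of conflict directly, one can list every copy of a class that is visible on the classpath. The sketch below assumes a flat application classpath and uses only standard JDK calls; `ClasspathDuplicates` and `copiesOf` are hypothetical names. With a plain `java -cp` setup the first entry listed would normally be the copy that gets loaded, though that depends on the class loader in use.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

/** Hypothetical diagnostic helper (illustration only). */
class ClasspathDuplicates {

  /** Returns the URLs of all copies of the named class visible to the system class loader. */
  static List<URL> copiesOf(String className) {
    // Class files are looked up as resources, e.g. org/apache/avro/generic/GenericRecord.class
    String resource = className.replace('.', '/') + ".class";
    try {
      return Collections.list(ClassLoader.getSystemClassLoader().getResources(resource));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    List<URL> copies = copiesOf("org.apache.avro.generic.GenericRecord");
    // More than one entry means several JARs ship the same class.
    for (URL url : copies) {
      System.out.println(url);
    }
    System.out.println(copies.size() + " copy/copies found");
  }
}
```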

@tvalentyn
Contributor

tvalentyn commented Nov 15, 2023

Can this be closed or moved to next milestone?

@bvolpato
Contributor Author

There's an action item to add testing that would have caught this regression, but it's no longer a blocker for 2.52.0.

@tvalentyn tvalentyn removed this from the 2.52.0 Release milestone Nov 15, 2023
@damccorm damccorm added P2 and removed P1 labels Jan 22, 2024
@Abacn
Contributor

Abacn commented Jul 1, 2024

Obsolete, as Beam has upgraded to Avro 1.11.x.

@Abacn Abacn closed this as not planned Jul 1, 2024