[Bug]: Cassandra spark-dependencies seems to be broken #2508

Open
iblancasa opened this issue Mar 15, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@iblancasa
Collaborator

What happened?

When running the cassandra-spark E2E test, the pod from the spark job fails:

k logs test-spark-deps-spark-dependencies-28508319-z4cmv
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/app/jaeger-spark-dependencies-0.0.1-SNAPSHOT.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Exception in thread "main" java.io.IOException: Failed to open native connection to Cassandra at {10.244.0.19}:9042
	at com.datastax.spark.connector.cql.CassandraConnector$.createSession(CassandraConnector.scala:168)
	at com.datastax.spark.connector.cql.CassandraConnector$.$anonfun$sessionCache$1(CassandraConnector.scala:154)
	at com.datastax.spark.connector.cql.RefCountedCache.createNewValueAndKeys(RefCountedCache.scala:32)
	at com.datastax.spark.connector.cql.RefCountedCache.syncAcquire(RefCountedCache.scala:69)
	at com.datastax.spark.connector.cql.RefCountedCache.acquire(RefCountedCache.scala:57)
	at com.datastax.spark.connector.cql.CassandraConnector.openSession(CassandraConnector.scala:79)
	at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:111)
	at com.datastax.spark.connector.cql.CassandraConnector.withClusterDo(CassandraConnector.scala:122)
	at com.datastax.spark.connector.cql.Schema$.fromCassandra(Schema.scala:332)
	at com.datastax.spark.connector.cql.Schema$.tableFromCassandra(Schema.scala:352)
	at com.datastax.spark.connector.rdd.CassandraTableRowReaderProvider.tableDef(CassandraTableRowReaderProvider.scala:50)
	at com.datastax.spark.connector.rdd.CassandraTableRowReaderProvider.tableDef$(CassandraTableRowReaderProvider.scala:50)
	at com.datastax.spark.connector.rdd.CassandraTableScanRDD.tableDef$lzycompute(CassandraTableScanRDD.scala:63)
	at com.datastax.spark.connector.rdd.CassandraTableScanRDD.tableDef(CassandraTableScanRDD.scala:63)
	at com.datastax.spark.connector.rdd.CassandraTableRowReaderProvider.verify(CassandraTableRowReaderProvider.scala:137)
	at com.datastax.spark.connector.rdd.CassandraTableRowReaderProvider.verify$(CassandraTableRowReaderProvider.scala:136)
	at com.datastax.spark.connector.rdd.CassandraTableScanRDD.verify(CassandraTableScanRDD.scala:63)
	at com.datastax.spark.connector.rdd.CassandraTableScanRDD.getPartitions(CassandraTableScanRDD.scala:263)
	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:294)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:290)
	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:294)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:290)
	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:294)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:290)
	at org.apache.spark.Partitioner$.$anonfun$defaultPartitioner$4(Partitioner.scala:78)
	at org.apache.spark.Partitioner$.$anonfun$defaultPartitioner$4$adapted(Partitioner.scala:78)
	at scala.collection.immutable.List.map(List.scala:293)
	at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:78)
	at org.apache.spark.rdd.PairRDDFunctions.$anonfun$groupByKey$6(PairRDDFunctions.scala:636)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:410)
	at org.apache.spark.rdd.PairRDDFunctions.groupByKey(PairRDDFunctions.scala:636)
	at org.apache.spark.api.java.JavaPairRDD.groupByKey(JavaPairRDD.scala:561)
	at io.jaegertracing.spark.dependencies.cassandra.CassandraDependenciesJob.run(CassandraDependenciesJob.java:169)
	at io.jaegertracing.spark.dependencies.DependenciesSparkJob.run(DependenciesSparkJob.java:60)
	at io.jaegertracing.spark.dependencies.DependenciesSparkJob.main(DependenciesSparkJob.java:40)
Caused by: java.lang.NoClassDefFoundError: com/codahale/metrics/JmxReporter
	at com.datastax.driver.core.Metrics.<init>(Metrics.java:146)
	at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1501)
	at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:451)
	at com.datastax.spark.connector.cql.CassandraConnector$.createSession(CassandraConnector.scala:161)
	... 41 more
Caused by: java.lang.ClassNotFoundException: com.codahale.metrics.JmxReporter
	at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(Unknown Source)
	at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Unknown Source)
	at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
	... 45 more
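
The root cause in the trace above is `NoClassDefFoundError: com/codahale/metrics/JmxReporter`, which means the class was not on the classpath when the Cassandra driver initialised its metrics. A quick way to confirm this is to look inside the jar named in the log. The sketch below is only an illustration: `jar_contains_class` is a hypothetical helper, and the idea that the image now ships a newer Dropwizard Metrics release (where `JmxReporter` lives in the `com.codahale.metrics.jmx` package) is an assumption that this thread does not confirm.

```python
import sys
import zipfile


def jar_contains_class(jar_path: str, class_name: str) -> bool:
    """Return True if the jar (a zip archive) has an entry for class_name."""
    # A class a.b.C is stored in a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python check_jar.py jaeger-spark-dependencies-0.0.1-SNAPSHOT.jar
    for name in (
        "com.codahale.metrics.JmxReporter",      # location the driver in the trace expects
        "com.codahale.metrics.jmx.JmxReporter",  # location in Metrics 4.x (assumed cause)
    ):
        found = jar_contains_class(sys.argv[1], name)
        print(f"{name}: {'present' if found else 'missing'}")
```

If only the second name is present, the jar bundles a Metrics version that the driver in the trace cannot use for JMX reporting.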

Steps to reproduce

Run the test

Expected behavior

.

Relevant log output

No response

Screenshot

No response

Additional context

No response

Jaeger backend version

No response

SDK

No response

Pipeline

No response

Storage backend

No response

Operating system

No response

Deployment model

No response

Deployment configs

No response

@iblancasa added the "bug" label Mar 15, 2024
@iblancasa changed the title from "[Bug]:" to "[Bug]: Cassandra spark-dependencies seems to be broken" Mar 15, 2024
iblancasa added a commit to iblancasa/jaeger-operator that referenced this issue Mar 15, 2024
Signed-off-by: Israel Blancas <iblancasa@gmail.com>
@iblancasa
Collaborator Author

It seems something changed recently in the image, and that is breaking the operator integration.

rubenvp8510 pushed a commit that referenced this issue Mar 16, 2024
* Fix the CI. Change the implementation of the method to make it easy to test

Signed-off-by: Israel Blancas <iblancasa@gmail.com>

* Fix E2E

Signed-off-by: Israel Blancas <iblancasa@gmail.com>

* Disable test #2508

Signed-off-by: Israel Blancas <iblancasa@gmail.com>

---------

Signed-off-by: Israel Blancas <iblancasa@gmail.com>
@rriverak

Same issue here. We use jaeger-operator v1.49.0.

The last successful Spark job ran 20 days ago:

NAME                                                 COMPLETIONS   DURATION   AGE
jaeger-operator-jaeger-cassandra-schema-job          1/1           37s        601d
jaeger-operator-jaeger-spark-dependencies-28491835   1/1           32s        22d
jaeger-operator-jaeger-spark-dependencies-28493275   1/1           36s        21d
jaeger-operator-jaeger-spark-dependencies-28494715   1/1           37s        20d
jaeger-operator-jaeger-spark-dependencies-28523515   0/1           12h        12h

The operator's Spark job sets no tag on the image, so "latest" is used as a fallback:

  Containers:
   jaeger-operator-jaeger-spark-dependencies:
    Image:      ghcr.io/jaegertracing/spark-dependencies/spark-dependencies
    Port:       <none>
    Host Port:  <none>
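
The fallback described above can be made concrete with a small sketch. Container runtimes treat an image reference that has neither a tag nor a digest as `:latest`. The function name `effective_tag` is hypothetical, and the parsing is deliberately simplified (it is not a full implementation of the image reference grammar).

```python
from typing import Optional


def effective_tag(image: str) -> Optional[str]:
    """Return the tag that would be pulled, or None when pinned by digest."""
    if "@" in image:
        # A digest reference such as repo/name@sha256:... ignores tags.
        return None
    # Only the last path component can carry a tag; a colon earlier in the
    # string belongs to a registry port such as localhost:5000.
    last_component = image.rsplit("/", 1)[-1]
    if ":" in last_component:
        return last_component.split(":", 1)[1]
    return "latest"


image = "ghcr.io/jaegertracing/spark-dependencies/spark-dependencies"
print(effective_tag(image))  # prints "latest"
```

Because the reference resolves to a mutable tag, every new push to that tag changes what the cron job runs.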

The image that is used can be found here:

https://github.com/jaegertracing/spark-dependencies/pkgs/container/spark-dependencies%2Fspark-dependencies/versions?filters%5Bversion_type%5D=tagged
Sadly, there is only one tag, and it was overwritten 20 days ago, which makes any workaround impossible 😔
Any ideas? In my opinion, at least the old image should still be provided.

@iblancasa
Collaborator Author

We might open an issue in that repository.

@rriverak

rriverak commented Apr 2, 2024

As a workaround, we pin spark-dependencies to the old untagged image sha256:683963b95bafb0721f3261a49c368c7bdce4ddcb04a23116c45068d254c5ec11

We use the Helm values of the jaeger-operator to override the Docker image of the dependencies job in the storage section:

jaeger:
  create: true
  spec:
    strategy: production
    storage:
      type: cassandra
      options:
        cassandra:
          servers: xxx
          keyspace: jaeger
          username: xxx
          password: xxx
      dependencies:
        image: ghcr.io/jaegertracing/spark-dependencies/spark-dependencies@sha256:683963b95bafb0721f3261a49c368c7bdce4ddcb04a23116c45068d254c5ec11

However, the current image is broken and should be fixed. In my opinion, the jaeger-operator itself should also pin its own dependencies to avoid this kind of error in production.
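
To illustrate what "pinned" means here, the following sketch checks whether an image reference ends in a full `sha256` digest, as the override above does. The helper name `is_digest_pinned` is hypothetical; nothing like it is known to exist in the operator, and the pattern is a simplification of the real reference grammar.

```python
import re

# Any non-empty repository name followed by @sha256: and 64 lowercase hex chars.
_DIGEST_REF = re.compile(r"[^@\s]+@sha256:[0-9a-f]{64}")


def is_digest_pinned(image: str) -> bool:
    """Return True if the image reference is pinned to an immutable digest."""
    return _DIGEST_REF.fullmatch(image) is not None


pinned = (
    "ghcr.io/jaegertracing/spark-dependencies/spark-dependencies"
    "@sha256:683963b95bafb0721f3261a49c368c7bdce4ddcb04a23116c45068d254c5ec11"
)
unpinned = "ghcr.io/jaegertracing/spark-dependencies/spark-dependencies"

print(is_digest_pinned(pinned))    # prints True
print(is_digest_pinned(unpinned))  # prints False
```

A digest reference always resolves to the same image content, so a later push to the registry cannot silently change the job.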

@iblancasa
Collaborator Author

@rriverak would you like to send a PR?

@rriverak

rriverak commented Apr 2, 2024

@iblancasa I'm not sure. We can solve this on several levels. Which solution are we looking for? Then I could provide a PR accordingly.

What is our path?

I would be happy if spark-dependencies showed initiative here, fixed the problems with the image, and switched to proper versioning. If that does not happen, one of the remaining two solutions must do the job.

@iblancasa
Collaborator Author

I would prefer the "Fix Spark dependencies" option. After that, set the version in the Jaeger operator. The third one is not a real solution, since many people do not use Helm to install the operator.
