Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.camel.kafkaconnector.CamelSinkConnectorConfig #251
Can you please show your configuration file?
@oscerd
Here is my KafkaConnector deployment file:
@saranyaeu2987 can you please try with this Docker file:
Worked, thanks a lot! BTW, I have this:
The simple language is not resolved in the sink.url.
The S3 sink connector doesn't provide a batch size option because it's mapped onto the Camel producer options. The producer doesn't work as a batch job.
So, what about this property, CamelBatchSize, in https://camel.apache.org/manual/latest/batch-consumer.html?
Yes, because you're mixing up the concepts: a consumer in Camel is used when you're using a source connector, so you'll have the consumer options in that case. camel.source.endpoint.maxMessagesPerPoll defines how many messages will be part of the consumed batch. On the sink side you'll be using the producer options. Source options: https://camel.apache.org/camel-kafka-connector/latest/connectors/camel-aws-s3-kafka-source-connector.html You're using a sink connector (in Camel terms, a producer).
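For illustration, a source connector is where consumer options such as `camel.source.endpoint.maxMessagesPerPoll` would apply. A minimal sketch of a Strimzi `KafkaConnector` resource, assuming hypothetical bucket, topic, and cluster names (not taken from this thread):

```yaml
apiVersion: kafka.strimzi.io/v1alpha1
kind: KafkaConnector
metadata:
  name: s3-source-connector
  labels:
    strimzi.io/cluster: my-connect-cluster
spec:
  class: org.apache.camel.kafkaconnector.CamelSourceConnector
  tasksMax: 1
  config:
    topics: my-topic
    # consumer (source) options live here, e.g. batching per poll:
    camel.source.url: aws-s3://my-bucket?maxMessagesPerPoll=25
```

A sink connector config, by contrast, only accepts the producer-side endpoint options.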
Got it. Here is my YAML file for your reference:
I don't think it's possible with the simple language. So I guess you'll need to add something dynamic there.
@oscerd Why isn't it possible in the simple language? Pardon me if I'm again mixing up concepts.
It's not possible because you're declaring this stuff in a static file; the simple language should be used in a real Camel route. The Kafka configuration for the connector is not dynamic, it's static.
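For contrast, a sketch of where the simple language would be evaluated: inside an actual Camel route, where the expression is resolved per exchange at runtime. The endpoint URIs below are illustrative, not from this thread; note that the dynamic `toD` endpoint is what makes the per-message resolution happen:

```xml
<route>
  <from uri="kafka:my-topic?brokers=localhost:9092"/>
  <!-- resolved for every message: each exchange gets its own key name -->
  <toD uri="aws-s3://my-bucket?keyName=${date:now:yyyyMMdd}/${exchangeId}"/>
</route>
```

In the static connector properties file there is no exchange to evaluate against, so the expression stays a literal string.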
If that's the case, how is the file variable resolved?
Basically, I want to preserve all Kafka topic data consumed via the sink connector in S3 without overwriting. Any suggestions?
Current state: the S3 file is overwritten after every consumption.
We are working on enabling the dynamic resolver with #252.
@oscerd Any suggestions on code changes for a batch consume mode for the S3 sink?
It's not supported on the sink side. The Camel producer is not designed for batch.
@saranyaeu2987 I am closing this one for the moment since work is tracked in #252. Feel free to reopen this one or open another issue if something still needs to be clarified.
@saranyaeu2987 Did you manage to dump Kafka data onto S3 in batches? With your examples, I managed to get data moving from a Kafka topic onto S3; however, for every new Kafka message it creates an object inside the S3 bucket, which is highly inefficient. So I am wondering if you managed to fix this using some camel.sink properties. Any help on this is much appreciated.
I know it's inefficient. I checked with the Camel team and they mentioned that the feature is not available. Let me know if you find a suitable solution.
It's how the component has been designed.
@oscerd Any plan to add it as a new feature request?
I don't see how it would be useful. In Camel, when you send a message to S3 through a producer, it will be written as an S3 object directly. In S3 there is no append operation, so I really don't see why we should change the behavior. Also, batch operation in what sense? Writing multiple lines to the same file, or writing multiple files in one shot? There is no batch support in the S3 SDK v1 as far as I know.
@oscerd Looking for an option to reduce the number of files generated in S3, like grouping all messages for a timeframe (say 5 seconds) before writing, and generating one S3 file for that timeframe. When we add a Hive schema on top of the S3 location, query performance degrades with a large number of small files.
One more question: is there a way to autocreate a folder in S3 using simple (yymmdd) given in camel.sink.url?
camel.sink.url: aws-s3://selumalai-kafka-s3?keyName=${date:now:yyyyMMdd}/${exchangeId}
There is no plan for this at the moment. Also, the aws-s3 component is old, so we won't introduce new features there; it's easier on aws2-s3. For the second question, no: there is no folder concept in S3, and we didn't focus on it.
@oscerd First of all, thanks for building this connector; I am happy that it works \o/ I'd like to share a use case of Kafka messages ===Batching===> S3 (inspired by the RHT Open Hybrid Edge Computing project).

Suppose we want to ingest live sensor data from a fleet of thousands of IoT devices (e.g. connected smart cars, industrial sensors) into Kafka, and persist the data for long-term processing and big data analytics, so it should be moved to object storage (S3). Say we have 10K devices and each device sends 1 message every 10 seconds (chosen for simple calculation; in reality the rate is much higher). In 24 hours this generates 10000 x 8640 = 86,400,000 messages per day.

If these 86.4 million messages occupy 1 object per message, we end up with 86.4 million objects in S3 each day, which is overwhelming. Ideally, if we can batch this so that, say, messages generated over a window are stored in 1 S3 object (file), we can drastically reduce the number of objects in the S3 bucket (from 86.4M to 144K objects): 86,400,000 messages / 600 = 144K S3 objects.

Answering your previous comment: with batch, we mean writing multiple lines (messages) to the same file. BTW, this is a feature request, not a bug :) Happy to discuss more on this topic.
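The arithmetic in the comment above checks out; here it is spelled out, using the 600-messages-per-object batch size from the comment:

```python
devices = 10_000
messages_per_device_per_day = 24 * 60 * 60 // 10  # one message every 10 seconds -> 8640

messages_per_day = devices * messages_per_device_per_day
print(messages_per_day)  # 86400000, i.e. 86.4 million S3 objects without batching

batch_size = 600  # messages grouped into a single S3 object
objects_per_day = messages_per_day // batch_size
print(objects_per_day)  # 144000
```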
Directly in the AWS S3 SDK, it's not possible to do this. In the Camel component it's not something we could do directly either, because S3 doesn't support appending to an existing object.
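Since appending to an S3 object is impossible, the batching people ask about here has to happen before the upload: buffer messages on the client side and flush one object per time window or message count. A minimal sketch of that idea, where the `MessageBatcher` class and the upload callback are hypothetical and not part of camel-kafka-connector (a real implementation might pass a boto3 `put_object` call as the uploader):

```python
import time

class MessageBatcher:
    """Buffers messages and flushes them as one blob per batch.

    A batch is flushed when max_messages is reached, or when
    max_age_seconds has elapsed since the first buffered message.
    """

    def __init__(self, uploader, max_messages=600, max_age_seconds=600):
        self.uploader = uploader          # callable taking (key, body)
        self.max_messages = max_messages
        self.max_age_seconds = max_age_seconds
        self.buffer = []
        self.first_ts = None

    def add(self, message, now=None):
        now = time.time() if now is None else now
        if not self.buffer:
            self.first_ts = now
        self.buffer.append(message)
        if (len(self.buffer) >= self.max_messages
                or now - self.first_ts >= self.max_age_seconds):
            self.flush(now)

    def flush(self, now=None):
        if not self.buffer:
            return
        now = time.time() if now is None else now
        key = f"batch-{int(now)}.txt"     # one S3 object per flushed batch
        self.uploader(key, "\n".join(self.buffer))
        self.buffer = []
        self.first_ts = None
```

With this shape, 600 messages become one uploaded object instead of 600; the trade-off is that an unflushed buffer can be lost on a crash.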
I'll have a look at Kinesis Firehose in combination with S3.
I am trying to run camel-kafka-connector in minikube with Strimzi, but I'm getting:
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.camel.kafkaconnector.CamelSinkConnectorConfig
Here is my image's Docker file:
Here is my plugins folder, which contains the following jars:
When I look at the pod logs, this is what I see:
I don't see org.apache.camel.kafkaconnector.CamelSinkConnectorConfig getting loaded from camel-kafka-connector-0.2.0.jar, which I feel caused the error above.
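As a side note on diagnosing this kind of error: a jar is just a zip archive, so one way to confirm whether a class is actually packaged in the connector jar is to list its entries. A small sketch (the helper name and the example path are illustrative, not from this thread):

```python
import zipfile

def class_in_jar(jar_path: str, class_name: str) -> bool:
    """Return True if the fully qualified class is packaged in the jar."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()

# Example (path is hypothetical):
# class_in_jar("plugins/camel-kafka-connector-0.2.0.jar",
#              "org.apache.camel.kafkaconnector.CamelSinkConnectorConfig")
```

Note that `NoClassDefFoundError: Could not initialize class` usually means the class was found but its static initializer failed (often because of a missing transitive dependency), so the other jars on the plugin path matter as much as this one.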
**Full log from connector pod:**
logs-from-my-connect-cluster-connect-in-my-connect-cluster-connect-8866c5d89-rd4zs.txt
Any suggestions?