High-level ZooKeeper consumer #9

Closed
mumrah opened this issue Nov 19, 2012 · 7 comments

@mumrah
Collaborator

mumrah commented Nov 19, 2012

No description provided.

@anentropic
Contributor

I'm very new to Kafka... I am currently developing against a single-node Kafka cluster. I will only ever have a single topic but I see currently in the code here I have to manually supply the partition.

It's working at the moment hard-coding partition=0 in my code but I'm worried it will break in future. I had trouble finding a description of exactly how partitioning works but I'm using the kafka-producer-shell.sh as a producer which I think ends up using https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/producer/DefaultPartitioner.scala

...and I think that means eventually my messages are going to get automatically distributed across a number of partitions?

Is it correct that I won't be able to assume a single partition in production, and the ZooKeeper consumer will become necessary?

thanks!

@mumrah
Collaborator Author

mumrah commented Nov 22, 2012

The number of partitions is configured with "num.partitions" in
server.properties. See the Kafka config docs:
http://incubator.apache.org/kafka/configuration.html

For consumption, you must specify the partition you wish to consume from.
For production, you can specify -1 and the message will be sent to a random
partition.
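
To illustrate the -1 convention described above, here is a minimal sketch. It is not code from this library: `resolve_partition` and `RANDOM_PARTITION` are hypothetical names, and the sketch only models the choice itself, regardless of whether the client or the broker makes it.

```python
import random

# Sentinel meaning "pick any partition"; the constant name is hypothetical.
RANDOM_PARTITION = -1


def resolve_partition(requested, num_partitions, rng=random):
    """Return the partition a message would end up in.

    requested      -- an explicit partition id, or -1 for a random one
    num_partitions -- how many partitions the topic has
    rng            -- source of randomness (injectable for testing)
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if requested == RANDOM_PARTITION:
        # Any partition in [0, num_partitions) may be chosen.
        return rng.randrange(num_partitions)
    if not 0 <= requested < num_partitions:
        raise ValueError("partition %d is out of range" % requested)
    return requested
```

With a single partition, `resolve_partition(-1, 1)` can only return 0, which is why hard-coding partition=0 works on a one-partition setup and stops being sufficient once "num.partitions" is raised.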

It's generally advised to have at least a few partitions so you can have
multiple consumer threads reading data in parallel.
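
A rough sketch of that idea, one consumer thread per partition. The in-memory `partitions` dict is a hypothetical stand-in for a real topic, and `consume` / `consume_all` are illustrative names rather than anything this library provides.

```python
import threading

# Hypothetical stand-in for one topic with three partitions.
partitions = {
    0: ["a0", "a1"],
    1: ["b0", "b1", "b2"],
    2: ["c0"],
}


def consume(partition_id, messages, results, lock):
    """Read one partition in order, recording (partition, offset, message)."""
    for offset, message in enumerate(messages):
        with lock:
            results.append((partition_id, offset, message))


def consume_all(partitions):
    """Start one thread per partition and wait for all of them to finish."""
    results = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=consume, args=(pid, msgs, results, lock))
        for pid, msgs in partitions.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Ordering is preserved within each partition but not across partitions, which is the usual trade-off when reading partitions in parallel.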

I definitely recommend the quickstart guide
http://incubator.apache.org/kafka/quickstart.html

Good luck!

@anentropic
Contributor

thanks very much... I could not find where it states that you can't have multiple consumers consuming from the same partition... or is it just that in that case you'll process messages more than once due to difficulty synchronising offsets between consumers?

does this lib roughly correspond to the 'low-level' consumer API in the design doc, while the proposed ZooKeeper consumer would correspond to the 'high-level' API?

@mumrah
Collaborator Author

mumrah commented Nov 22, 2012

That's correct. So far I've implemented the protocol (FetchRequest,
ProduceRequest, etc.) plus a few higher-level things (KafkaQueue, consumer
generator).

The "high-level" consumer uses zookeeper to automatically coordinate which
consumers read from which partitions (this is detailed down towards the
bottom of the design doc http://incubator.apache.org/kafka/design.html).

You can definitely have multiple consumers read from the same
topic+partition, but they will all receive the same data. Distributing
messages among consumers is done by "smart" clients (like the ZooKeeper
consumer).
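
As a sketch of what such a "smart" client has to decide, the function below splits a topic's partitions among the consumers in a group so that each partition has exactly one reader. `assign_partitions` is a hypothetical name, the remainder handling is an assumption, and the ZooKeeper side (consumer registration, watches, rebalancing on membership changes) is left out entirely.

```python
def assign_partitions(partition_ids, consumer_ids):
    """Give each consumer a contiguous range of partitions.

    Both inputs are sorted first so that every consumer running this
    function independently arrives at the same answer.
    """
    partitions = sorted(partition_ids)
    consumers = sorted(consumer_ids)
    if not consumers:
        raise ValueError("at least one consumer is required")

    # Each consumer gets `base` partitions; the first `extra` get one more.
    base, extra = divmod(len(partitions), len(consumers))

    assignment = {}
    start = 0
    for index, consumer in enumerate(consumers):
        count = base + (1 if index < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment
```

For example, `assign_partitions(range(5), ["c1", "c2"])` gives `c1` partitions 0 to 2 and `c2` partitions 3 and 4. With more consumers than partitions, the surplus consumers get an empty list and sit idle.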

@anentropic
Contributor

ah, so basically the job is to convert these Scala files (https://github.com/apache/kafka/tree/0.7.2/core/src/main/scala/kafka/consumer) to Python... :)

I see a bunch of things are changing in 0.8 too... maintaining a Kafka client seems a thankless task!

@mumrah
Collaborator Author

mumrah commented Nov 22, 2012

Yeah, supporting 0.8.x is another story. It should be backwards compatible,
though I haven't tested this.

@mumrah
Collaborator Author

mumrah commented Feb 20, 2013

Won't be doing this, since 0.8 removes the ZK dependency from the client.
