
CloudEvents support #252

Merged: 13 commits from cloudevents into master, Feb 26, 2020
Conversation

@bsideup (Owner) commented Jan 30, 2020

CloudEvents is an essential choice for Liiklus, since it is the standard that was missing when Liiklus was created.

It introduces a clear notion of id, type, and other fields that are helpful for applying schemas, RBAC, metrics, and many other features.

At this stage, CloudEvents v1 is used as the internal representation, and LiiklusEvent at the API level, since there is no Protobuf format for CloudEvents yet:
cloudevents/spec#504
LiiklusEvent follows v1 of the spec. Once Protobuf definitions are available, we will deprecate LiiklusEvent and start using the CloudEvent message type. The oneof and format fields are there for a smooth migration.

The implementation is mostly backward compatible, but if you have events in the old, binary format, you will need to provide a plugin to "upcast" them (use RecordPreProcessor/RecordPostProcessor).
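As a rough illustration of what such an "upcasting" plugin might do, the sketch below wraps a legacy binary record into a CloudEvent-style attribute map. The class and attribute values here are hypothetical, not Liiklus's actual RecordPreProcessor/RecordPostProcessor API:

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical upcaster sketch: a legacy binary value gets wrapped into
// a CloudEvents v1-shaped envelope with synthesized required attributes.
class LegacyUpcaster {

    static Map<String, Object> upcast(ByteBuffer legacyValue) {
        Map<String, Object> event = new HashMap<>();
        event.put("specversion", "1.0");
        event.put("id", UUID.randomUUID().toString()); // synthesized id
        event.put("type", "legacy.binary.record");     // assumed type name
        event.put("source", "/legacy");                // assumed source
        event.put("data", legacyValue);                // original bytes, untouched
        return event;
    }
}
```

A real plugin would perform this mapping inside the pre/post-processor hooks mentioned above.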

@bsideup bsideup added the enhancement New feature or request label Jan 30, 2020
.withHeaders(() -> (Map) headers)
.withPayload(() -> (String) valueBuffer)
.unmarshal(),
it -> ByteBuffer.wrap(Json.binaryEncode(it)).asReadOnlyBuffer()


I have yet to fully review what this PR does exactly, but the usage of JSON seems wrong. The in-memory storage should just store events as object references, shouldn't it?
(And we don't want to use JSON anywhere, in-memory or Kafka, as a matter of fact.)

@bsideup (Owner, Author) replied:

This Function converts the in-memory representation to a ByteBuffer when a pre/post-processor is unable to handle the in-memory format, mostly for legacy reasons.

Eventually, the processors API will be changed to use CloudEvent as the type instead of Object, and these functions will go away.

In other words, we use CloudEvent as the in-memory type and fall back to the JSON representation when bytes are requested (either the BINARY consumer format or pre/post-processors).
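The lazy-fallback idea described above can be sketched as follows. This is an illustrative stand-in, not Liiklus's actual Envelope class: the value stays an in-memory object, and a conversion function produces bytes only when something explicitly asks for them:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

// Illustrative envelope: serialization is deferred until rawValue() is
// called, so the happy path (in-memory consumers) never pays for it.
class LazyEnvelope<T> {

    final T value;
    final Function<T, ByteBuffer> toBytes;

    LazyEnvelope(T value, Function<T, ByteBuffer> toBytes) {
        this.value = value;
        this.toBytes = toBytes;
    }

    // Only invoked on the BINARY path or by a legacy pre/post-processor.
    ByteBuffer rawValue() {
        return toBytes.apply(value);
    }
}
```

In the PR, the second constructor argument is where `it -> ByteBuffer.wrap(Json.binaryEncode(it)).asReadOnlyBuffer()` plugs in.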

@bsideup (Owner, Author) replied:

D'oh! Of course, one second after replying I got what you mean 😂

The in-memory storage forces serialization to avoid hard references and non-serializable objects that would otherwise fail with Kafka/Pulsar/any other real storage. JSON was selected as the easiest option.

request.getKey().asReadOnlyByteBuffer(),
it -> (ByteBuffer) it,
cloudEventBuilder.build(),
it -> ByteBuffer.wrap(Json.binaryEncode(it)).asReadOnlyBuffer()


Again, this feels weird, for reasons slightly different from the in-memory case (assuming I understand what this code does from a very quick glance). We don't want to use JSON as the preferred way of storing data.

@bsideup (Owner, Author) replied:

The first part of this answer explains it:
#252 (comment)

@bsideup (Owner, Author) replied:

tl;dr: this function is only executed when some pre-processor asks for the ByteBuffer of this envelope (aka "serialize it for me"). It is there for backward compatibility and will be removed once CloudEvent becomes the only type of Envelope's value.

@bsideup bsideup modified the milestones: 0.9.1, next Feb 1, 2020

if (!(rawValue instanceof CloudEvent)) {
    // TODO Add Envelope#event and make CloudEvent a first-class citizen
    throw new IllegalArgumentException("Must be a CloudEvent!");
}


Trying this PR in the context of projectriff, it seems I'm always hitting this line. I'm sure I'm doing something wrong, but I don't know what yet.

@bsideup (Owner, Author) replied:

Make sure you publish with the new protocol (and set the liiklusEvent field).

@ericbottard commented:

Successfully tested this in the context of riff, using Kafka:

kafkacat -b kafkabroker:9092 -t default_out -C \
  -f '\nKey (%K bytes): %k
  Value (%S bytes): %s
  Timestamp: %T
  Partition: %p
  Offset: %o
  Headers: %h\n'
% Reached end of topic default_out [0] at offset 0

Key (0 bytes):
  Value (2 bytes): 10
  Timestamp: 1580910284299
  Partition: 0
  Offset: 0
  Headers: ce-source=proc,ce-specversion=1.0,ce-type=riff-event,ce-id=431ed262-4f61-4ced-9115-341e9f2473b3,Content-Type=application/json

@bsideup (Owner, Author) commented Feb 5, 2020

@ericbottard thanks for checking! 👍

FYI, I discovered that I was using the wrong mappers (from the HTTP binding) and adjusted the implementation to follow the Kafka protocol binding described here:
https://github.com/cloudevents/spec/blob/v1.0/kafka-protocol-binding.md
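Per the Kafka protocol binding (binary content mode), each CloudEvents context attribute and extension is mapped to a Kafka header named `ce_` plus the attribute name, while the event data becomes the record value. A minimal sketch of that header mapping, with illustrative names rather than Liiklus's actual internals:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the Kafka binding's binary-mode header mapping: every
// non-null attribute and extension becomes a "ce_"-prefixed header.
class KafkaHeaderMapper {

    static Map<String, String> toKafkaHeaders(Map<String, String> attributes,
                                              Map<String, String> extensions) {
        Map<String, String> headers = new HashMap<>();
        attributes.forEach((name, value) -> {
            if (value != null) {
                headers.put("ce_" + name, value);
            }
        });
        // Extensions get the same prefix.
        extensions.forEach((name, value) -> {
            if (value != null) {
                headers.put("ce_" + name, value);
            }
        });
        return headers;
    }
}
```

This matches the `ce_specversion`, `ce_id`, etc. headers visible in the kafkacat output above.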

) {
Map<String, String> result = new HashMap<>();

extensions.forEach((key, value) -> {


This is a NO-OP

@bsideup (Owner, Author) replied:

There is a result.put for all non-null values, unless I am missing something :)
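For context, the pattern under discussion looks roughly like this sketch (class and method names are illustrative): entries are copied into a fresh result map, skipping null values. Writing to a new map rather than mutating the one being iterated is also what sidesteps the concurrent-modification concern raised in the next reply:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: copy extension entries into a new map, dropping nulls.
class ExtensionCopy {

    static Map<String, String> copyNonNull(Map<String, Object> extensions) {
        Map<String, String> result = new HashMap<>();
        extensions.forEach((key, value) -> {
            if (value != null) {
                result.put(key, String.valueOf(value));
            }
        });
        return result;
    }
}
```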


Ah, my bad, I thought this was acting in place (which would cause concurrent modification anyway). So my other comment is moot too.

@bsideup (Owner, Author) commented Feb 10, 2020

@ericbottard I remember that you needed the extensions support.
Could you please try the latest state and verify that it also works well for you?
Thanks!

@ericbottard commented:

This is what I see when I set a Riff-Rox: true extension; I don't think this is mapped correctly (according to this):

kafkacat -b kafkabroker:9092 -t default_out -C \
  -f '\nKey (%K bytes): %k
  Value (%S bytes): %s
  Timestamp: %T
  Partition: %p
  Offset: %o
  Headers: %h\n'
% Reached end of topic default_out [0] at offset 12

Key (0 bytes):
  Value (1 bytes): 4
  Timestamp: 1582130904180
  Partition: 0
  Offset: 12
  Headers: ce_specversion=1.0,ce_id=54dcd19f-d73b-436f-b975-4e7626810eb9,ce_type=riff-event,Riff-Rox=true,ce_source=source-todo,ce_datacontenttype=application/json
% Reached end of topic default_out [0] at offset 13

@bsideup (Owner, Author) commented Feb 19, 2020

@ericbottard that's something I wanted to fix. Even in the official SDK, custom extensions are not prefixed:
cloudevents/sdk-java#93

@bsideup (Owner, Author) commented Feb 19, 2020

@ericbottard keep in mind that Riff-Rox is not a valid extension key, AFAIK.

Btw, I haven't decided yet whether Liiklus should validate CloudEvent values (for performance reasons). WDYT?

@ericbottard commented:

> keep in mind that Riff-Rox is not a valid extension key AFAIK.

Using riffrox as a key (all lowercase, no dash) ends up in the same situation: the header is riffrox, and not ce-riffrox. Would you say this is expected?

When you say

> that's something I wanted to fix

are you implying that liiklus writes the header, or are you relying on the (currently broken) sdk?

In any case, the values seem to propagate "correctly" from publication to Kafka.

@bsideup (Owner, Author) commented Feb 21, 2020

@ericbottard I will fix it in Liiklus later today (make it prefix all extension headers with ce-).
I've decided not to use the SDK, since it is not optimized for allocations and does not yet implement the spec correctly.

@bsideup (Owner, Author) commented Feb 23, 2020

@ericbottard I fixed the extensions mapping issue :)

@bsideup bsideup marked this pull request as ready for review February 23, 2020 16:07
@ericbottard commented:

I can confirm that I saw our headers transit all the way through, using extensions, in Kafka.
To me this PR is ready to go!

@bsideup (Owner, Author) commented Feb 24, 2020

@ericbottard thanks a lot for checking! 👍

@lanwen (Collaborator) left a comment:

well, lgtm at this moment :) we'll see how it goes when implementing a plugin! :D

@bsideup bsideup merged commit e5620a2 into master Feb 26, 2020
@delete-merged-branch delete-merged-branch bot deleted the cloudevents branch February 26, 2020 09:02
@cbraynor commented Mar 3, 2020

I'm curious to see if cloudevents/spec#576 meets your needs?

@bsideup (Owner, Author) commented Mar 3, 2020

@drtriumph thanks! I will definitely have a look next week when I am back from vacation 👍
