Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-12522: Cast SMT should allow null value records to pass through #10375

Merged
merged 1 commit into from
May 21, 2021

Conversation

dosvath
Copy link
Contributor

@dosvath dosvath commented Mar 23, 2021

Problem
The current Cast SMT fails on a null record value (or a null record key), which is problematic for tombstone records. When a tombstone record reaches the transformation the error below is thrown:

With schema:

Cannot list fields on non-struct type
org.apache.kafka.connect.errors.DataException: Cannot list fields on non-struct type
	at org.apache.kafka.connect.data.ConnectSchema.fields(ConnectSchema.java:179)
	at org.apache.kafka.connect.transforms.Cast.getOrBuildSchema(Cast.java:190)

For schemaless:

Caused by: org.apache.kafka.connect.errors.DataException: Only Map objects supported in absence of schema for [cast types], found: null
at org.apache.kafka.connect.transforms.util.Requirements.requireMap(Requirements.java:38)

Solution
Null value records should instead be allowed to pass through as there is no cast transformation to be done, with the benefit of allowing the connector to handle the tombstone records as intended.

Testing
Added SMT unit tests to verify the records pass through when the keys or values are null.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@dongjinleekr
Copy link
Contributor

LGTM.

@vvcephei @ewencp @ijuma Could you have a look? IMHO this PR can be included in 2.8.

@kkonstantine
Copy link
Contributor

To be more equipped to say whether this issue can be considered a release blocker or not (which is something that should be called out on the dev mailing list for any releases that are in progress) it would be good to know whether this is a regression or bug that has escaped several releases. @dosvath do you happen to know?

@dosvath
Copy link
Contributor Author

dosvath commented Mar 23, 2021

To be more equipped to say whether this issue can be considered a release blocker or not (which is something that should be called out on the dev mailing list for any releases that are in progress) it would be good to know whether this is a regression or bug that has escaped several releases. @dosvath do you happen to know?

It seems it's a bug that has escaped several releases I didn't see any point in the history where null is handled to make this a regression.

@dosvath
Copy link
Contributor Author

dosvath commented Mar 23, 2021

LGTM.

@vvcephei @ewencp @ijuma Could you have a look? IMHO this PR can be included in 2.8.

Thanks @dongjinleekr I will update the branch.

@dosvath dosvath changed the base branch from trunk to 2.8 March 23, 2021 15:51
@kkonstantine
Copy link
Contributor

Thanks for checking @dosvath.
That means that this fix probably won't cut it for a release blocker at this late stage in the release process of 2.8 (or other branches that are in the process of generating release candidates). We'll either merge to trunk and wait before we backport, or wait a bit more altogether.

Finally, in this project we always target trunk on our PRs and then we cherry-pick (unless we explicitly need to backport only to a release branch). So your target was correctly pointing to trunk. I think @dongjinleekr was suggesting that we should cherry-pick.

@dosvath dosvath changed the base branch from 2.8 to trunk March 24, 2021 18:50
@dosvath
Copy link
Contributor Author

dosvath commented Mar 24, 2021

Thanks @kkonstantine I updated the branch.

@dosvath
Copy link
Contributor Author

dosvath commented Apr 15, 2021

@kkonstantine @dongjinleekr how should we proceed?

@dosvath
Copy link
Contributor Author

dosvath commented May 6, 2021

@ewencp @mjsax one more ping on this

@mjsax
Copy link
Member

mjsax commented May 14, 2021

Not familiar with Connect code. I guess @rhauch @kkonstantine @mimaison @C0urante @wicknicks should be able to review.

Copy link
Contributor

@C0urante C0urante left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice simple fix and great test coverage. Thanks Daniel!

Copy link
Contributor

@dongjinleekr dongjinleekr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good!!

Copy link
Member

@mimaison mimaison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the PR

@mimaison mimaison merged commit 0c707b1 into apache:trunk May 21, 2021
ijuma added a commit to ijuma/kafka that referenced this pull request May 26, 2021
…e-allocations-lz4

* apache-github/trunk: (43 commits)
  KAFKA-12800: Configure generator to fail on trailing JSON tokens (apache#10717)
  MINOR: clarify message ordering with max in-flight requests and idempotent producer (apache#10690)
  MINOR: Add log identifier/prefix printing in Log layer static functions (apache#10742)
  MINOR: update java doc for deprecated methods (apache#10722)
  MINOR: Fix deprecation warnings in SlidingWindowedCogroupedKStreamImplTest (apache#10703)
  KAFKA-12499: add transaction timeout verification (apache#10482)
  KAFKA-12620 Allocate producer ids on the controller (apache#10504)
  MINOR: Kafka Streams code samples formating unification (apache#10651)
  KAFKA-12808: Remove Deprecated Methods under StreamsMetrics (apache#10724)
  KAFKA-12522: Cast SMT should allow null value records to pass through (apache#10375)
  KAFKA-12820: Upgrade maven-artifact dependency to resolve CVE-2021-26291
  HOTFIX: fix checkstyle issue in KAFKA-12697
  KAFKA-12697: Add OfflinePartitionCount and PreferredReplicaImbalanceCount metrics to Quorum Controller (apache#10572)
  KAFKA-12342: Remove MetaLogShim and use RaftClient directly (apache#10705)
  KAFKA-12779: KIP-740, Clean up public API in TaskId and fix TaskMetadata#taskId() (apache#10735)
  KAFKA-12814: Remove Deprecated Method StreamsConfig getConsumerConfigs (apache#10737)
  MINOR: Eliminate redundant functions in LogTest suite (apache#10732)
  MINOR: Remove unused maxProducerIdExpirationMs parameter in Log constructor (apache#10723)
  MINOR: Updating files with release 2.7.1 (apache#10660)
  KAFKA-12809: Remove deprecated methods of Stores factory (apache#10729)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants