[Flink] Optimize CDC sink serde with Fury #307

Merged
merged 6 commits into from
Aug 30, 2023


xuchen-plus
Contributor

@xuchen-plus xuchen-plus commented Aug 25, 2023

Fury is an open-source serialization library that uses JIT compilation to improve performance.

In LakeSoul's CDC sync job, each record has to carry the before and after RowData as well as the RowType. Flink's flame graph confirms that serializing and deserializing these objects causes excessive overhead.

With Fury, a single-core benchmark shows an ~80% improvement in end-to-end throughput (numRecordsInPerSecond rose from 5800 to 10400).
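For reference, the ~80% figure follows directly from the two numRecordsInPerSecond readings quoted above. The snippet below is only a sketch of that arithmetic; the class and method names are made up for illustration and are not part of this PR.

```java
public class ThroughputGain {
    // Relative gain in percent when throughput goes from `before` to `after`.
    // Hypothetical helper, used only to check the arithmetic in the description.
    static double percentGain(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        double before = 5800.0;   // numRecordsInPerSecond before the change (from the PR description)
        double after = 10400.0;   // numRecordsInPerSecond with Fury (from the PR description)

        // (10400 - 5800) / 5800 is roughly 0.79, i.e. about 79%, which rounds to the ~80% quoted.
        System.out.printf("gain: %.1f%%, speedup: %.2fx%n",
                percentGain(before, after), after / before);
    }
}
```

Put differently, the job processes about 1.8 times as many records per second on a single core.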

Before:

img_v2_c9f87b2a-667c-43f9-b970-f378f6fd006g

After using Fury:

img_v2_757278f5-424d-4b8d-8359-dca3a151eddg

@xuchen-plus xuchen-plus added enhancement New feature or request flink flink support into lakesoul labels Aug 25, 2023
Signed-off-by: chenxu <chenxu@dmetasoul.com>
@xuchen-plus xuchen-plus merged commit c3271e1 into lakesoul-io:main Aug 30, 2023
@xuchen-plus xuchen-plus deleted the flink_serde_opt branch August 30, 2023 07:02
@chaokunyang

This is great! Glad to see Fury speed up LakeSoul's performance.

Labels
enhancement New feature or request flink flink support into lakesoul
Projects
Status: Done
4 participants