Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase maxLocalExchangeBufferSize to 128MB #18212

Merged
merged 1 commit into from
Jul 14, 2023

Conversation

gaurav8297
Copy link
Member

This change will help increase the performance of writes when local exchange is used either due to partitioning the data or scaling the writers.

Benchmarks for unpartitioned write:

Before:ku
Input Size: 6B rows
Time: 4:46 mins
Peak Mem: 59.9GB

After:
Input Size: 6B rows
Time: 3:45 mins
Peak Mem: 61.2GB

Benchmarks for partitioned write:

Before:
Input Size: 3B rows (3 skewed partitions)
Time: 2:33 mins
Peak Mem: 89.8GB

After:
Input Size: 3B rows (3 skewed partitions)
Time: 1:57 mins
Peak Mem: 92.3GB

Description

Additional context and related issues

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Jul 10, 2023
@gaurav8297 gaurav8297 marked this pull request as ready for review July 10, 2023 12:54
@gaurav8297 gaurav8297 requested a review from sopel39 July 10, 2023 12:54
@gaurav8297 gaurav8297 force-pushed the higher_buffer_size_for_write branch from ca39f8c to 812a42b Compare July 11, 2023 00:46
@github-actions github-actions bot added tests:hive hive Hive connector labels Jul 11, 2023
Copy link
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % comment. Please also add doc for new config

@sopel39
Copy link
Member

sopel39 commented Jul 11, 2023

there is test failure. Is it relevant?

@gaurav8297
Copy link
Member Author

there is test failure. Is it relevant?

It's this issue #18217

@ebyhr ebyhr mentioned this pull request Jul 11, 2023
Copy link
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % @raunaqmorarka comment. Maybe we can just increase local exchange buffer overall

This change will help increase the performance of writes
when local exchange is used either because of
partitioning the data or scaling the writers.

Benchmarks for unpartitioned write:

Before:ku
Input Size: 6B rows
Time: 4:46 mins
Peak Mem: 59.9GB

After:
Input Size: 6B rows
Time: 3:45 mins
Peak Mem: 61.2GB

Benchmarks for partitioned write:

Before:
Input Size: 3B rows (3 skewed partitions)
Time: 2:33 mins
Peak Mem: 89.8GB

After:
Input Size: 3B rows (3 skewed partitions)
Time: 1:57 mins
Peak Mem: 92.3GB
@gaurav8297 gaurav8297 force-pushed the higher_buffer_size_for_write branch from 812a42b to 77a5b90 Compare July 13, 2023 10:15
@gaurav8297
Copy link
Member Author

gaurav8297 commented Jul 13, 2023

I update the default value to 128MB as @raunaqmorarka mentioned. Also, here is a benchmark report for tpcds.
tpcds_128_mb_buffer_size.pdf

@gaurav8297 gaurav8297 requested a review from raunaqmorarka July 13, 2023 10:19
@gaurav8297 gaurav8297 changed the title Increase maxLocalExchangeBufferSize to 128MB for writes Increase maxLocalExchangeBufferSize to 128MB Jul 13, 2023
@sopel39 sopel39 merged commit b9f94ae into trinodb:master Jul 14, 2023
@sopel39 sopel39 mentioned this pull request Jul 14, 2023
@github-actions github-actions bot added this to the 423 milestone Jul 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

3 participants