docs/design: Support Spilling Unparalleled HashAgg #25792

wshwsh12 · 2021-06-28T06:06:05Z

What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

What is changed and how it works?

Proposal: xxx

What's Changed: Design docs.

How it Works:

Related changes

PR to update pingcap/docs / pingcap/docs-cn:

Release note

No release note

docs/design/2021-06-23-spilled-unparallel-hashagg.md

mmyj · 2021-07-07T04:45:53Z

docs/design/2021-06-23-spilled-unparallel-hashagg.md

+
+## Impacts & Risks
+
+* Memory will still grow without increasing the number of new tuples in HashMap for distinct aggregate function.


I don't understand this. There seems to be a contradiction between "memory will still grow" and "without increasing the number of new tuples".

wshwsh12 · Jul 7, 2021

In fact, there is no contradiction. A distinct aggregate function has to record every value that has already appeared, and we use a set to hold this information. This set keeps growing during aggregation even when no new tuples are added to aggPartialResultMapper.
For example,

type partialResult4CountDistinctInt struct { valSet set.Int64SetWithMemoryUsage }
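
To make the point above concrete, here is a minimal sketch of a count-distinct partial result whose value set grows while the number of groups in the map stays at one. The names `int64Set`, `countDistinct`, and `Simulate`, and the 8-bytes-per-entry cost, are assumptions made for illustration; they are not TiDB's actual types or memory accounting.

```go
package main

import "fmt"

// int64Set is a hypothetical stand-in for set.Int64SetWithMemoryUsage.
type int64Set map[int64]struct{}

// countDistinct mirrors the shape of partialResult4CountDistinctInt:
// each group owns one set of the values seen so far.
type countDistinct struct {
	valSet int64Set
}

// update records v and returns the assumed number of bytes it added.
// A value that was already seen adds nothing.
func (p *countDistinct) update(v int64) int {
	if _, ok := p.valSet[v]; ok {
		return 0
	}
	p.valSet[v] = struct{}{}
	return 8 // assumed cost of one int64 entry, for illustration only
}

// Simulate feeds n distinct values into a single group. It returns the
// number of groups in the mapper and the bytes tracked for the value set.
func Simulate(n int) (groups int, tracked int) {
	// The mapper holds exactly one group for the whole run.
	mapper := map[string]*countDistinct{
		"g1": &countDistinct{valSet: int64Set{}},
	}
	for v := 0; v < n; v++ {
		tracked += mapper["g1"].update(int64(v))
	}
	return len(mapper), tracked
}

func main() {
	groups, tracked := Simulate(1000)
	fmt.Printf("groups=%d trackedBytes=%d\n", groups, tracked)
}
```

In this sketch the group count never changes, yet the tracked bytes grow linearly with the number of distinct values, which is the risk the design doc describes.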

mmyj

lgtm

tisonkun · 2021-07-11T03:24:24Z

docs/design/2021-06-23-spilled-unparallel-hashagg.md

+# Proposal: Support Spilling Unparalleled HashAgg
+
+- Author(s): [@wshwsh12](https://github.com/wshwsh12)
+- Discussion PR: N/A


Suggested change

- Discussion PR: N/A

- Discussion PR: https://github.com/pingcap/tidb/pull/25792

docs/design/2021-06-23-spilled-unparallel-hashagg.md

XuHuaiyu · 2021-07-12T07:20:46Z

docs/design/2021-06-23-spilled-unparallel-hashagg.md

+* When the unparallel-agg exceeds the memory quota, this feature helps reduce memory usage and run the sql successfully.
+* When the parallel-agg exceeds the memory quota, the SQL will be canceled before. After the agg-concurrency args are set to 1, the SQL can run successfully.
+* When the ndv of the data is low, the SQL contains distinct function will be canceled before. After the agg-concurrency args are set to 1, the SQL can run successfully.
+* When the ndv of the data is high, the SQL contains distinct function will be canceled before. After the agg-concurrency args are set to 1, the SQL can be canceled successfully if there is insufficient memory.


Why do we need to set the concurrency-related args when an aggregate function uses the DISTINCT keyword?
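
For readers unfamiliar with the idea behind the quoted bullets, the following is a rough sketch of how a single-threaded hash aggregation could spill: once a quota is reached it stops adding new groups to the map, sets aside rows that belong to unseen groups, and processes them in a later round. This is an assumption about the general technique, not the implementation in this PR. `row`, `SumWithSpill`, and `maxGroups` are hypothetical names, `maxGroups` stands in for the memory quota, and an in-memory slice stands in for the on-disk container.

```go
package main

import "fmt"

// row is a hypothetical input tuple: a group key and a value to sum.
type row struct {
	key string
	val int64
}

// SumWithSpill computes SUM(val) GROUP BY key while keeping at most
// maxGroups groups in the in-memory map at any time. It returns the
// final sums and the number of rounds needed to consume all rows.
func SumWithSpill(input []row, maxGroups int) (map[string]int64, int) {
	if maxGroups < 1 {
		maxGroups = 1 // at least one group per round, so every round makes progress
	}
	result := make(map[string]int64)
	rounds := 0
	for len(input) > 0 {
		rounds++
		groups := make(map[string]int64)
		var spilled []row // stands in for rows written to disk
		for _, r := range input {
			if _, ok := groups[r.key]; ok {
				// Existing group: keep aggregating in memory.
				groups[r.key] += r.val
			} else if len(groups) < maxGroups {
				// Still under the quota: admit a new group.
				groups[r.key] = r.val
			} else {
				// Quota reached: defer rows of unseen groups.
				spilled = append(spilled, r)
			}
		}
		// Every group in this round has seen all of its rows, so it is final.
		for k, v := range groups {
			result[k] = v
		}
		input = spilled // re-read the deferred rows in the next round
	}
	return result, rounds
}

func main() {
	input := []row{{"a", 1}, {"b", 2}, {"a", 3}, {"c", 4}, {"b", 5}, {"c", 6}}
	sums, rounds := SumWithSpill(input, 2)
	fmt.Println(sums, rounds)
}
```

Because a group is either admitted on first sight or deferred for the entire round, each group's rows are all processed in the same round. Note that this sketch bounds only the number of groups; state kept inside a group, such as the value set of a distinct aggregate, can still grow.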

XuHuaiyu

LGTM

ti-chi-bot · 2021-07-12T07:53:55Z

[REVIEW NOTIFICATION]

This pull request has been approved by:

XuHuaiyu
mmyj

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

XuHuaiyu · 2021-07-12T07:56:37Z

/merge

ti-chi-bot · 2021-07-12T07:56:40Z

This pull request has been accepted and is ready to merge.

Commit hash: d78daac

design docs

1b2799e

ti-chi-bot added the size/M label Jun 28, 2021

wshwsh12 mentioned this pull request Jun 28, 2021

executor: support spill intermediate data for unparalleled hash agg #25714

Merged

wshwsh12 requested a review from XuHuaiyu June 28, 2021 08:35

wshwsh12 mentioned this pull request Jul 2, 2021

Support spill HashAgg to disk #25882

Closed

5 tasks

mmyj reviewed Jul 7, 2021

fix typo

264ad68

mmyj approved these changes Jul 7, 2021

ti-chi-bot added the status/LGT1 label Jul 7, 2021

tisonkun reviewed Jul 11, 2021

XuHuaiyu reviewed Jul 12, 2021

address comments

2b43e6c

wshwsh12 requested a review from XuHuaiyu July 12, 2021 06:36

XuHuaiyu reviewed Jul 12, 2021

address comments

d78daac

wshwsh12 requested a review from XuHuaiyu July 12, 2021 07:46

XuHuaiyu approved these changes Jul 12, 2021

ti-chi-bot added status/LGT2 and removed status/LGT1 labels Jul 12, 2021

ti-chi-bot added the status/can-merge label Jul 12, 2021

XuHuaiyu added component/docs and removed status/can-merge labels Jul 12, 2021

ti-chi-bot added the status/can-merge label Jul 12, 2021

ti-chi-bot added 4 commits July 12, 2021 16:55

Merge branch 'master' into design-docs

c6a6f52

Merge branch 'master' into design-docs

54cf7d5

Merge branch 'master' into design-docs

4795564

Merge branch 'master' into design-docs

759caa0

ti-chi-bot merged commit 52f1e0e into pingcap:master Jul 12, 2021

Mini256 mentioned this pull request Jul 12, 2021

Tars plugin has redundant merge base operations ti-community-infra/tichi#643

Closed

1 task

wshwsh12 deleted the design-docs branch January 29, 2022 07:20
