[enhancement](load) avoid schema copy to reduce cpu usage #16034

Merged 4 commits on Jan 28, 2023

Conversation

zhannngchen

Proposed changes

Issue Number: close #xxx

Problem summary

Some users report that CPU usage is quite high during data loading.
The flame graph shows that copying the schema consumes a large share of the CPU time.
[image: flame graph]

Checklist(Required)

  1. Does it affect the original behavior:
    • Yes
    • No
    • I don't know
  2. Has unit tests been added:
    • Yes
    • No
    • No Need
  3. Has document been added or modified:
    • Yes
    • No
    • No Need
  4. Does it need to update dependencies:
    • Yes
    • No
  5. Are there any changes that cannot be rolled back:
    • Yes (If Yes, please explain WHY)
    • No

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

be/src/olap/delta_writer.cpp (outdated, resolved)
be/src/olap/tablet_schema.h (resolved)

hello-stephen commented Jan 17, 2023

TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 36.76 seconds
load time: 509 seconds
storage size: 17123237211 Bytes
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230119030741_clickbench_pr_83804.html

@github-actions

clang-tidy review says "All clean, LGTM! 👍"

@github-actions

clang-tidy review says "All clean, LGTM! 👍"

@@ -242,7 +242,7 @@ Status TabletsChannel::_open_all_writers(const PTabletWriterOpenRequest& request
 wrequest.tuple_desc = _tuple_desc;
 wrequest.slots = index_slots;
 wrequest.is_high_priority = _is_high_priority;
-wrequest.ptable_schema_param = request.schema();
+wrequest.table_schema_param = _schema;
Contributor Author

_schema is initialized from request.schema(), and its life cycle is longer than that of the delta writers, so we can use _schema directly and avoid copying the schema twice (the first copy happens here; the second happens in the constructor of DeltaWriter).
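
To illustrate the reasoning in this comment, here is a minimal, self-contained sketch. It is not the Doris code: `SchemaParam`, `CopyingRequest`, `CopyingWriter`, `SharingRequest`, `SharingWriter`, and both `open_writers_*` functions are hypothetical stand-ins, and using `std::shared_ptr` to share the schema is an assumption made for the example. It only counts how many deep copies each approach performs when the owner of the schema outlives the writers.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed table schema. It counts deep copies.
struct SchemaParam {
    std::vector<std::string> columns;

    // Global counter of deep copies (copy construction or copy assignment).
    static std::size_t& copies() {
        static std::size_t count = 0;
        return count;
    }

    SchemaParam() = default;
    SchemaParam(const SchemaParam& other) : columns(other.columns) { ++copies(); }
    SchemaParam& operator=(const SchemaParam& other) {
        columns = other.columns;
        ++copies();
        return *this;
    }
};

// "Before" shape: the request holds the schema by value and the writer
// stores its own copy, so each writer costs two deep copies.
struct CopyingRequest {
    SchemaParam schema;
};
struct CopyingWriter {
    explicit CopyingWriter(const CopyingRequest& req) : schema(req.schema) {}
    SchemaParam schema;
};

// "After" shape: request and writer only hold a pointer to a schema that
// is owned elsewhere and outlives them.
struct SharingRequest {
    std::shared_ptr<const SchemaParam> schema;
};
struct SharingWriter {
    explicit SharingWriter(const SharingRequest& req) : schema(req.schema) {}
    std::shared_ptr<const SchemaParam> schema;
};

// Opens n writers the copying way; returns the number of deep copies made.
std::size_t open_writers_copying(const SchemaParam& source, std::size_t n) {
    const std::size_t before = SchemaParam::copies();
    std::vector<CopyingWriter> writers;
    writers.reserve(n);  // avoid reallocation so only the intended copies count
    for (std::size_t i = 0; i < n; ++i) {
        CopyingRequest req;
        req.schema = source;        // first copy: into the request
        writers.emplace_back(req);  // second copy: in the writer constructor
    }
    return SchemaParam::copies() - before;
}

// Opens n writers the sharing way; returns the number of deep copies made.
std::size_t open_writers_sharing(const std::shared_ptr<const SchemaParam>& schema,
                                 std::size_t n) {
    assert(schema != nullptr);
    const std::size_t before = SchemaParam::copies();
    std::vector<SharingWriter> writers;
    writers.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        SharingRequest req;
        req.schema = schema;        // copies the pointer, not the schema
        writers.emplace_back(req);  // again only a pointer copy
    }
    return SchemaParam::copies() - before;
}
```

In this sketch, opening 4 writers with `open_writers_copying` performs 8 deep copies, while `open_writers_sharing` performs none. The sharing variant is only safe under the condition the comment states: the schema's owner must live at least as long as every writer that points at it.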

@dataroaring dataroaring left a comment

LGTM

github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) on Jan 28, 2023
@github-actions

PR approved by at least one committer and no changes requested.

@github-actions

PR approved by anyone and no changes requested.

Labels
approved (Indicates a PR has been approved by one committer.), dev/1.2.2-merged, reviewed
4 participants