infoschema: optimize 'select count(*) from information_schema.tables' for v2 #55574

Merged
merged 9 commits into from
Sep 3, 2024

Conversation

tiancaiamao
Contributor

@tiancaiamao tiancaiamao commented Aug 21, 2024

What problem does this PR solve?

Issue Number: close #55515
ref #50959

Problem Summary:

mysql> select count(*) from information_schema.tables;
ERROR 9006 (HY000): GC life time is shorter than transaction duration, transaction starts at 2024-08-20 07:13:02.343 +0800 CST, GC safe point is 2024-08-20 07:13:24.893 +0800 CST

What changed and how does it work?

Add a special-case optimization for reading INFORMATION_SCHEMA.TABLES: if the query only accesses table_name or table_schema, it is served as a pure in-memory operation.

I tried to support select count(*), but found that the aggregation always uses the first column for *, which for INFORMATION_SCHEMA.TABLES is table_catalog. Changing select count(*) to use another column turned out to be more difficult than I expected.
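
To illustrate the idea, here is a minimal sketch, not the code in this PR. It assumes a hypothetical in-memory `Catalog` with an `IterateAllTableItems` method shaped like the one visible in the review diff; `TableItem`, `Catalog`, `CanServeFromMemory`, and `CountTables` are names invented for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// TableItem is a hypothetical stand-in for infoschema.TableItem.
type TableItem struct {
	DBName    string
	TableName string
}

// Catalog is a hypothetical in-memory schema cache (an assumption for this sketch).
type Catalog struct{ items []TableItem }

// IterateAllTableItems calls visit for each cached table until visit returns false.
func (c *Catalog) IterateAllTableItems(visit func(TableItem) bool) {
	for _, it := range c.items {
		if !visit(it) {
			return
		}
	}
}

// memOnlyColumns lists the columns that can be answered from the cache alone.
var memOnlyColumns = map[string]bool{"table_schema": true, "table_name": true}

// CanServeFromMemory reports whether every requested column is in memOnlyColumns.
// An empty column list also returns true in this sketch.
func CanServeFromMemory(cols []string) bool {
	for _, c := range cols {
		if !memOnlyColumns[strings.ToLower(c)] {
			return false
		}
	}
	return true
}

// CountTables counts cached tables without reading from storage.
func CountTables(c *Catalog) int {
	n := 0
	c.IterateAllTableItems(func(TableItem) bool { n++; return true })
	return n
}

func main() {
	c := &Catalog{items: []TableItem{{"test", "t"}, {"test", "t1"}, {"mysql", "user"}}}
	if CanServeFromMemory([]string{"TABLE_NAME"}) {
		fmt.Println(CountTables(c))
	}
}
```

In this sketch a query that touches any other column, such as table_catalog, fails the check and would have to fall back to the slower path.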

Check List

Tests

  • Unit test
  • Integration test

Test with 1M tables:

mysql> select count(table_name) from tables;
+-------------------+
| count(table_name) |
+-------------------+
|           1007314 |
+-------------------+
1 row in set (1.54 sec)

Without this optimization:

mysql> select count(*) from tables;
+----------+
| count(*) |
+----------+
|  1007315 |
+----------+
1 row in set (9 min 50.39 sec)

  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 21, 2024
tiprow bot commented Aug 21, 2024

Hi @tiancaiamao. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

codecov bot commented Aug 22, 2024

Codecov Report

Attention: Patch coverage is 93.68421% with 6 lines in your changes missing coverage. Please review.

Project coverage is 56.4415%. Comparing base (0583e84) to head (0c33bf2).
Report is 1 commits behind head on master.

Additional details and impacted files
@@                Coverage Diff                @@
##             master     #55574         +/-   ##
=================================================
- Coverage   72.7958%   56.4415%   -16.3543%     
=================================================
  Files          1588       1716        +128     
  Lines        443469     621216     +177747     
=================================================
+ Hits         322827     350624      +27797     
- Misses       100772     246943     +146171     
- Partials      19870      23649       +3779     
Flag Coverage Δ
integration 37.2600% <84.2105%> (?)
unit 71.9841% <93.6842%> (+0.0776%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 52.9567% <ø> (ø)
parser ∅ <ø> (∅)
br 51.8398% <ø> (+6.4086%) ⬆️

@tiancaiamao
Contributor Author

/retest

tiprow bot commented Aug 22, 2024

@tiancaiamao: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@tiancaiamao tiancaiamao changed the title infoschema: optimize 'select count(table_name) from information_schema.tables' for v2 infoschema: optimize 'select count(*) from information_schema.tables' for v2 Aug 22, 2024
@tiancaiamao
Contributor Author

/retest


@tiancaiamao
Contributor Author

/test check-dev


@tiancaiamao
Contributor Author

/ok-to-test

@ti-chi-bot ti-chi-bot bot added the ok-to-test Indicates a PR is ready to be tested. label Aug 23, 2024
@bb7133
Member

bb7133 commented Aug 23, 2024

Is select count(*) from information_schema.tables; widely used? Just curious.

IterateAllTableItems(visit func(infoschema.TableItem) bool)
})
if ok {
if x := ctx.Value("cover-check"); x != nil {
Contributor

maybe use a failpoint

Contributor Author

IMO, failpoint is sometimes not as convenient.

  • It works globally, so there's no good way to control it precisely for a specific query.
  • It requires make failpoint-enable, an extra step to enable the test.

Contributor

It requires make failpoint-enable, an extra step to enable the test

I have added this feature to failpoint. It can be configured alongside our other testing command-line arguments such as --tags=intest or -race: https://github.com/pingcap/failpoint?tab=readme-ov-file#quick-start-use-failpoint-toolexec

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Aug 26, 2024
select table_name from information_schema.tables where table_name = 't';
explain format='brief' select table_name, table_schema from information_schema.tables;
select count(*) from information_schema.tables where table_name = 't';
select count(table_name) from information_schema.tables where table_name = 't';
Contributor

Suggested change
select count(table_name) from information_schema.tables where table_name = 't';
select count(table_name) from information_schema.tables where table_name = 't';

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Aug 27, 2024
ti-chi-bot bot commented Aug 27, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-08-26 01:49:11.122248659 +0000 UTC m=+748546.256698769: ☑️ agreed by lance6716.
  • 2024-08-27 14:12:25.522595228 +0000 UTC m=+879540.657045346: ☑️ agreed by tangenta.

Contributor

@fixdb fixdb left a comment

+1

ti-chi-bot bot commented Sep 3, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fixdb, lance6716, tangenta

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the approved label Sep 3, 2024
Labels
approved lgtm ok-to-test Indicates a PR is ready to be tested. release-note-none Denotes a PR that doesn't merit a release note. sig/planner SIG: Planner size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
5 participants