Retry for range exceed error #2774

Merged
shiyuhang0 merged 8 commits into pingcap:release-3.2 from pinterest_debug_version on Mar 28, 2024

shiyuhang0 (Member) commented Mar 20, 2024

What problem does this PR solve?

TiSpark may send an incorrect key range to TiKV when using FetchHandleRDD. We have two hypotheses about the cause:

  1. TiSpark has a bug when splitting ranges for an index scan. The bug only occurs with certain data.
  2. TiSpark supports clustered indexes but client-java does not, so the two do not coordinate correctly.

What is changed and how it works?

Because the root cause is hard to find, this PR logs the offending range and retries once when the error occurs. The retry uses client-java's splitRangeByRegion method to split the range so that it no longer exceeds the region bound. This method appears to split the range correctly.
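The log-and-retry-once idea described above can be sketched as follows. This is only an illustration under stated assumptions, not the code in this PR: `KeyRange`, `RangeExceedException`, `splitByRegion`, and `scanWithOneRetry` are hypothetical names, keys are modelled as `long` values rather than byte strings, and `splitByRegion` is a stand-in for client-java's splitRangeByRegion, whose real signature is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: none of these names are TiSpark or client-java APIs.
class RangeRetrySketch {
    // Half-open key range [start, end). Real TiKV keys are byte strings;
    // longs are an assumption that keeps the sketch short.
    static final class KeyRange {
        final long start;
        final long end;
        KeyRange(long start, long end) { this.start = start; this.end = end; }
        @Override public String toString() { return "[" + start + ", " + end + ")"; }
    }

    // Stand-in for the "range exceeds region bound" error.
    static class RangeExceedException extends RuntimeException {
        RangeExceedException(String message) { super(message); }
    }

    // Clip the range against every region so that no piece crosses a region boundary.
    static List<KeyRange> splitByRegion(KeyRange range, List<KeyRange> regions) {
        List<KeyRange> pieces = new ArrayList<>();
        for (KeyRange region : regions) {
            long start = Math.max(range.start, region.start);
            long end = Math.min(range.end, region.end);
            if (start < end) {
                pieces.add(new KeyRange(start, end));
            }
        }
        return pieces;
    }

    // Run the scan; on a range-exceed error, log the range, re-split it by region,
    // and retry exactly once. A failure during the retry propagates to the caller.
    static <T> List<T> scanWithOneRetry(KeyRange range, List<KeyRange> regions,
                                        Function<KeyRange, T> scan) {
        List<T> results = new ArrayList<>();
        try {
            results.add(scan.apply(range));
        } catch (RangeExceedException e) {
            System.err.println("range exceeded region bound, retrying once: " + range);
            for (KeyRange piece : splitByRegion(range, regions)) {
                results.add(scan.apply(piece));
            }
        }
        return results;
    }
}
```

The retry is deliberately not a loop: if the re-split pieces still fail, the error surfaces instead of being masked, which matches the "retry once" wording of the description.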

Spark Plan

== Physical Plan ==
*(1) ColumnarToRow
+- TiSpark RegionTaskExec{downgradeThreshold=1000000000,downgradeFilter=[]
   +- RowToColumnar
      +- TiKV FetchHandleRDD{[table: items] IndexLookUp, Columns: item_primary_key@BYTES, item_id@VARCHAR(45), item_set_id@VARCHAR(45), product_id@VARCHAR(45), product_set_id@VARCHAR(45), point_of_sale_country@VARCHAR(2), merchant_id@LONG, merchant_item_id@VARCHAR(127), merchant_item_set_id@VARCHAR(127), domains@JSON, product_sources@JSON, image_signatures@JSON, normalized_short_link_clusters@JSON, canonical_links@JSON, feed_item_ids@JSON, feed_profile_ids@JSON, reconciled_data@JSON, source_data@JSON, cdc_change_indicator@JSON, cdc_new_values@JSON, cdc_old_values@JSON, created_time@LONG, arrival_time@LONG, updated_time@LONG, timestamp_data@JSON: { {IndexRangeScan(Index:item_id(item_id)): { RangeFilter: [], Range: [([t\200\000\000\000\000\000\023\226_i\200\000\000\000\000\000\000\003\000], [t\200\000\000\000\000\000\023\226_i\200\000\000\000\000\000\000\003\372])] }}; {TableRowIDScan} }, startTs: 448636486137151521}
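The two bounds in the Range field of this plan are raw TiKV keys printed with octal escapes. As a reading aid, the sketch below decodes the table ID and index ID from such a key. It assumes TiDB's usual index-key layout (the byte `t`, an 8-byte table ID, the bytes `_i`, an 8-byte index ID, then the encoded index values, with integers stored big-endian and the sign bit flipped). The class and method names are hypothetical, and the unescaping only handles three-digit octal escapes as they appear in this plan.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Hypothetical helper for reading keys printed in a plan; not part of TiSpark.
class IndexKeySketch {
    // Convert the printed form (e.g. \200 for byte 0x80) back into raw bytes.
    // Assumption: every backslash starts a three-digit octal escape.
    static byte[] unescape(String printed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < printed.length(); i++) {
            char c = printed.charAt(i);
            if (c == '\\' && i + 3 < printed.length()) {
                out.write(Integer.parseInt(printed.substring(i + 1, i + 4), 8));
                i += 3; // skip the three octal digits just consumed
            } else {
                out.write(c);
            }
        }
        return out.toByteArray();
    }

    // Read 8 big-endian bytes and flip the sign bit back to recover the integer.
    static long decodeInt(byte[] key, int offset) {
        return ByteBuffer.wrap(key, offset, 8).getLong() ^ Long.MIN_VALUE;
    }

    // Layout assumed: 't' at byte 0, table ID in bytes 1..8.
    static long tableId(byte[] key) {
        return decodeInt(key, 1);
    }

    // Layout assumed: "_i" in bytes 9..10, index ID in bytes 11..18.
    static long indexId(byte[] key) {
        return decodeInt(key, 11);
    }
}
```

Applied to the start key in the plan, this yields table ID 5014 and index ID 3. The two bounds share that prefix and differ only in the final byte (`\000` versus `\372`).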

@ti-chi-bot ti-chi-bot bot added the size/M label Mar 20, 2024
@shiyuhang0 shiyuhang0 force-pushed the pinterest_debug_version branch from eec9a08 to 46c5bd3 Compare March 20, 2024 04:46
@shiyuhang0 shiyuhang0 force-pushed the pinterest_debug_version branch from b7c05f2 to 1e38743 Compare March 25, 2024 06:33
This reverts commit 1e38743.
@shiyuhang0 shiyuhang0 force-pushed the pinterest_debug_version branch from adc7e2d to 248892a Compare March 26, 2024 02:21
Review thread on assembly/pom.xml (outdated, resolved)
v01dstar (Contributor) left a comment:

lgtm


ti-chi-bot bot commented Mar 28, 2024

@v01dstar: adding LGTM is restricted to approvers and reviewers in OWNERS files.

In response to this:

lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


ti-chi-bot bot commented Mar 28, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: v01dstar
Once this PR has been reviewed and has the lgtm label, please ask for approval from shiyuhang0, ensuring that each of them provides their approval before proceeding. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

shiyuhang0 (Member, Author) commented:

/run-all-tests tidb=release-6.1 tikv=release-6.1 pd=release-6.1

@shiyuhang0 shiyuhang0 changed the title Pinterest debug version Retry for range exceed error Mar 28, 2024
shiyuhang0 (Member, Author) commented:

/run-all-tests tidb=release-6.1 tikv=release-6.1 pd=release-6.1

@shiyuhang0 shiyuhang0 merged commit d892c36 into pingcap:release-3.2 Mar 28, 2024
2 of 3 checks passed
shiyuhang0 (Member, Author) commented:

/cherry-pick master

ti-chi-bot (Member) commented:

@shiyuhang0: new pull request created to branch master: #2777.

In response to this:

/cherry-pick master

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

ti-chi-bot pushed a commit to ti-chi-bot/tispark that referenced this pull request Mar 28, 2024
* print range

* update version

* retry once

* RC2

* Revert "RC2"

This reverts commit 1e38743.

* opt

* revert version
shiyuhang0 added a commit that referenced this pull request Mar 28, 2024
* print range

* update version

* retry once

* RC2

* Revert "RC2"

This reverts commit 1e38743.

* opt

* revert version

Co-authored-by: shi yuhang <52435083+shiyuhang0@users.noreply.github.com>
3 participants