Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix scatter range receive npe when get store info of non-eixst store #1439

Merged
merged 2 commits into from
Feb 27, 2019

Conversation

nanne007
Copy link
Contributor

What problem does this PR solve?

Fix a npe when get store info of a non-exist stores in scatter range

What is changed and how it works?

We got a stack error in our online env.

2019/02/22 15:28:53.887 log.go:274: [fatal] panic: runtime error: invalid memory address or nil pointer dereference, stack: goroutine 6
2180022 [running]:
runtime/debug.Stack(0xc42caa76d8, 0xfe4e40, 0x18cf570)
        /data/program/tistack/tistack-release/local/go/src/runtime/debug/stack.go:24 +0xa7
github.com/pingcap/pd/pkg/logutil.LogPanic()
        /data/program/tistack/tistack-release/go/src/github.com/pingcap/pd/pkg/logutil/log.go:274 +0x5a
panic(0xfe4e40, 0x18cf570)
        /data/program/tistack/tistack-release/local/go/src/runtime/panic.go:505 +0x229
github.com/pingcap/pd/server/schedule.(*RangeCluster).updateStoreInfo(0xc421d0bb20, 0x0)
        /data/program/tistack/tistack-release/go/src/github.com/pingcap/pd/server/schedule/range_cluster.go:62 +0x26
github.com/pingcap/pd/server/schedule.(*RangeCluster).GetStore(0xc421d0bb20, 0x11f4b5, 0xc4277ca010)
        /data/program/tistack/tistack-release/go/src/github.com/pingcap/pd/server/schedule/range_cluster.go:81 +0x5d
github.com/pingcap/pd/server/schedulers.(*balanceLeaderScheduler).transferLeaderIn(0xc42619b060, 0xc4282f7030, 0x12af300, 0xc421d0bb20,
 0xc4282facc0, 0xc4282facf0, 0x0, 0x0, 0x0)
        /data/program/tistack/tistack-release/go/src/github.com/pingcap/pd/server/schedulers/balance_leader.go:139 +0x10b
github.com/pingcap/pd/server/schedulers.(*balanceLeaderScheduler).Schedule(0xc42619b060, 0x12af300, 0xc421d0bb20, 0xc4282facc0, 0xc4282
facf0, 0xc4282facc0, 0xc4282facf0, 0x20)
        /data/program/tistack/tistack-release/go/src/github.com/pingcap/pd/server/schedulers/balance_leader.go:101 +0x5e0
github.com/pingcap/pd/server/schedulers.(*scatterRangeScheduler).Schedule(0xc42a29c850, 0x12af180, 0xc42a3ce700, 0xc42ec8e6f0, 0xc42ec8
e720, 0xa, 0xc42a3ce700, 0x0)
        /data/program/tistack/tistack-release/go/src/github.com/pingcap/pd/server/schedulers/scatter_range.go:101 +0x205
github.com/pingcap/pd/server.scheduleByNamespace(0x12af000, 0xc424294960, 0x12a4380, 0xc421adf040, 0x12a62a0, 0xc42a29c850, 0xc42ec8e6f
0, 0xc42ec8e720, 0xc42ec8e720, 0x11f1b3, ...)
        /data/program/tistack/tistack-release/go/src/github.com/pingcap/pd/server/namespace_cluster.go:142 +0x13e
github.com/pingcap/pd/server.(*scheduleController).Schedule(0xc42ad3ed20, 0x12af000, 0xc424294960, 0xc42ec8e6f0, 0xc42ec8e720, 0xc42ec8
e6f0, 0xc42ec8e720, 0x0)
        /data/program/tistack/tistack-release/go/src/github.com/pingcap/pd/server/coordinator.go:689 +0xa3
github.com/pingcap/pd/server.(*coordinator).runScheduler(0xc42002f680, 0xc42ad3ed20)
        /data/program/tistack/tistack-release/go/src/github.com/pingcap/pd/server/coordinator.go:433 +0x2be
created by github.com/pingcap/pd/server.(*coordinator).addScheduler
        /data/program/tistack/tistack-release/go/src/github.com/pingcap/pd/server/coordinator.go:394 +0x16a

Check List

Tests

  • No code

Related changes

  • Need to cherry-pick to the release branch

@sre-bot
Copy link
Contributor

sre-bot commented Feb 22, 2019

Hi contributor, thanks for your PR.

This patch needs to be approved by someone of admins. They should reply with "/ok-to-test" to accept this PR for running test automatically.

1 similar comment
@sre-bot
Copy link
Contributor

sre-bot commented Feb 22, 2019

Hi contributor, thanks for your PR.

This patch needs to be approved by someone of admins. They should reply with "/ok-to-test" to accept this PR for running test automatically.

@codecov-io
Copy link

codecov-io commented Feb 22, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@c996046). Click here to learn what that means.
The diff coverage is 50%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #1439   +/-   ##
=========================================
  Coverage          ?   67.73%           
=========================================
  Files             ?      158           
  Lines             ?    15182           
  Branches          ?        0           
=========================================
  Hits              ?    10283           
  Misses            ?     3978           
  Partials          ?      921
Impacted Files Coverage Δ
server/schedule/range_cluster.go 90.24% <50%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c996046...309a4de. Read the comment docs.

@siddontang
Copy link
Contributor

Thank you @lerencao

@siddontang
Copy link
Contributor

@nolouch

how can we meet this? seem we need to add a test.

@nanne007
Copy link
Contributor Author

after pd server restarts, not all region infos are updated, and region info's default store id is 0.
So NPE can appear when the code get store info of id 0.

@nolouch
Copy link
Contributor

nolouch commented Feb 22, 2019

@lerencao can you add this situation in the test?

@nanne007
Copy link
Contributor Author

@nolouch Yes, will work on it.

@nanne007 nanne007 force-pushed the fix/scatter-range-npe branch from 3cd3fde to e951fcb Compare February 25, 2019 10:13
@nanne007
Copy link
Contributor Author

@nolouch PTAL

Test cases and another nil check are added

@nolouch
Copy link
Contributor

nolouch commented Feb 25, 2019

@lerencao ci failed, may related to gofmt.

@nanne007 nanne007 force-pushed the fix/scatter-range-npe branch from e951fcb to 8048c95 Compare February 26, 2019 02:31
@nanne007
Copy link
Contributor Author

@nolouch updated.

tc.Regions.SetRegion(regionInfo)
}

for i := 1; i <= 3; i++ {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we don't need it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the store info is not used in the underlining implementations, but it should be here to make things straight.

Copy link
Contributor

@nolouch nolouch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rleungx rleungx merged commit 0690cea into tikv:master Feb 27, 2019
@nanne007 nanne007 deleted the fix/scatter-range-npe branch February 27, 2019 06:49
@sre-bot sre-bot added the contribution This PR is from a community contributor. label Dec 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contribution This PR is from a community contributor.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants