
fix watchablestore runlock bug #13505

Merged: 1 commit, Dec 3, 2021
Conversation

@LeoYang90 commented Nov 25, 2021

```go
tx.RLock()
revs, vs := tx.UnsafeRange(keyBucketName, minBytes, maxBytes, 0)
tx.RUnlock()
```

First, note that the values (vs) returned by UnsafeRange are shallow copies: the slices point directly into bolt db's memory. So if bolt db re-mmaps, that memory is reclaimed, and dereferencing it triggers a SIGSEGV:

```
unexpected fault address 0x7ff096486221
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x1 addr=0x7ff096486221 pc=0x472ee5]

goroutine 335 [running]:
runtime.throw(0x1035d04, 0x5)
	/usr/local/go/src/runtime/panic.go:1117 +0x72 fp=0xc0f035fac0 sp=0xc0f035fa90 pc=0x4381b2
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:741 +0x268 fp=0xc0f035faf8 sp=0xc0f035fac0 pc=0x44f9c8
runtime.memmove(0xc0c597a000, 0x7ff09647f0b4, 0x71ed)
	/usr/local/go/src/runtime/memmove_amd64.s:332 +0x3c5 fp=0xc0f035fb00 sp=0xc0f035faf8 pc=0x472ee5
go.etcd.io/etcd/mvcc/mvccpb.(*KeyValue).Unmarshal(0xc0487ef3b0, 0x7ff09647f041, 0x7260, 0x7260, 0x5f, 0x0)
	/workspace/code-repo/mvcc/mvccpb/kv.pb.go:410 +0x8ae fp=0xc0f035fbb8 sp=0xc0f035fb00 pc=0x8bfeee
go.etcd.io/etcd/mvcc.kvsToEvents(0xc00015e0c0, 0xc0003242c0, 0xc17c5a4000, 0x4af7, 0x5000, 0xc17c222000, 0x4af7, 0x5000, 0x11, 0x12, ...)
	/workspace/code-repo/mvcc/watchable_store.go:412 +0xf0 fp=0xc0f035fd30 sp=0xc0f035fbb8 pc=0xa66930
go.etcd.io/etcd/mvcc.(*watchableStore).syncWatchers(0xc000324280, 0x0)
	/workspace/code-repo/mvcc/watchable_store.go:354 +0x8c9 fp=0xc0f035ff20 sp=0xc0f035fd30 pc=0xa66789
go.etcd.io/etcd/mvcc.(*watchableStore).syncWatchersLoop(0xc000324280)
	/workspace/code-repo/mvcc/watchable_store.go:215 +0x1e8 fp=0xc0f035ffd8 sp=0xc0f035ff20 pc=0xa65588
```

@serathius (Member)

Can you please add a test that would prevent future regression?

@LeoYang90 (Author) commented Nov 26, 2021

> Can you please add a test that would prevent future regression?

It is difficult to reproduce this bug in a test case, because the probability of triggering it is low: first we must force bolt db to re-mmap by filling it with a large amount of data.

I have added some comments in the code for now. Do you have any suggestions for a test case? @serathius

@serathius (Member)

Not my area of expertise, but we run our unit tests with data race detection. Maybe we could add a test case that triggers the race here and rely on the race detector to report the error.

Based on the code I think the fix is correct; I just want to make sure we are thinking long term about how to avoid making the same mistake again.

@ptabor (Contributor) commented Dec 2, 2021

Great finding. I wonder whether it could be the cause of #12845, i.e. a page being overwritten at the same time it is being parsed.

@wilsonwang371 (Contributor)

This looks like a correct bug fix. I have a patch, #13499, that logs more information in case of failure. Shall we check in both changes and then wait and see whether we can get this reproduced?

@ptabor @chaochn47 @hexfusion @gyuho

@wilsonwang371 (Contributor)

My only concern is that holding the lock here can significantly reduce throughput. Shall we make a copy of the data and then call kvsToEvents? syncWatchers() is not a time-sensitive operation. @ptabor

@wilsonwang371 (Contributor)

Another related issue: #13067

It makes sense: at our company we have observed this in multiple versions of etcd, but only recently have we started seeing it more frequently. The reason might be that we have improved etcd's performance, so there is a higher chance of this happening.

@wilsonwang371 added the type/bug and priority/important-soon (must be staffed and worked on either currently or very soon, ideally in time for the next release) labels on Dec 2, 2021
@ptabor (Contributor) commented Dec 2, 2021

> My only concern is that here it can significantly reduce the throughput. Shall we make a copy of the data and then call kvsToEvents? syncWatchers() is not a time-sensitive operation. @ptabor

Without benchmarking we will not know for sure (and your framework does not yet support RW transactions while watches are open, right?).

The current kvsToEvents in practice only parses protos. I wouldn't consider that an especially heavy operation compared to making an additional copy of the whole [][]byte just to release the shared lock faster.

@wilsonwang371 (Contributor)

> My only concern is that here it can significantly reduce the throughput. Shall we make a copy of the data and then call kvsToEvents? syncWatchers() is not a time-sensitive operation. @ptabor
>
> Without benchmarking we will not know for sure (and your framework does not support (yet) RW transactions while watches are open, right?)
>
> The current kvsToEvents is in practice only parsing protos. I wouldn't consider this especially heavy operation in comparison to making an additional copy of [][]byte just to release shared lock faster.

Correct, this part is not covered in the script yet. Also, syncWatchers() is called when there are lots of write operations and the watchers are out of sync, so holding the lock there might even alleviate some of the write workload.

I am okay with this; we can wait and see whether there is any feedback later in case of a performance issue.

@wilsonwang371 (Contributor)

Btw, we may also need to backport this change to earlier versions.

@ahrtr (Member) commented Mar 23, 2022

It's a great fix, but the changelog wasn't updated. I've just submitted a PR to update the changelog.
