[tests/integration] flake test case TestCompactHashCheckDetectCorruption #14823

Closed
fuweid opened this issue Nov 22, 2022 · 0 comments · Fixed by #14824

fuweid commented Nov 22, 2022

Which github workflows are flaking?

Tests

Which tests are flaking?

TestCompactHashCheckDetectCorruption

Github Action link

https://github.com/etcd-io/etcd/actions/runs/3356040544/jobs/5560786679

Reason for failure (if possible)

If the corrupted member has been elected leader, the MemberID in the alarm response is not the corrupted member's ID.
Instead, the alarm names a follower, the one with the smaller ID, because raftCluster.Members always returns members sorted by ID.

    logger.go:130: 2022-10-30T15:35:17.268Z	INFO	m0.raft	2a4384d1c9ba4399 received MsgVoteResp from 27f6db18bed0afb9 at term 3	{"member": "m0"}
    logger.go:130: 2022-10-30T15:35:17.268Z	INFO	m0.raft	2a4384d1c9ba4399 has received 2 MsgVoteResp votes and 0 vote rejections	{"member": "m0"}
    logger.go:130: 2022-10-30T15:35:17.268Z	INFO	m0.raft	2a4384d1c9ba4399 became leader at term 3	{"member": "m0"}
    logger.go:130: 2022-10-30T15:35:17.268Z	INFO	m0.raft	raft.node: 2a4384d1c9ba4399 elected leader 2a4384d1c9ba4399 at term 3	{"member": "m0"}
    logger.go:130: 2022-10-30T15:35:17.269Z	INFO	m2.raft	27f6db18bed0afb9 no leader at term 3; dropping index reading msg	{"member": "m2"}
    logger.go:130: 2022-10-30T15:35:17.269Z	INFO	m2.raft	raft.node: 27f6db18bed0afb9 elected leader 2a4384d1c9ba4399 at term 3	{"member": "m2"}
    logger.go:130: 2022-10-30T15:35:17.269Z	INFO	m1.raft	ea2efb1261ea4b7e [term: 2] received a MsgApp message with higher term from 2a4384d1c9ba4399 [term: 3]	{"member": "m1"}
    logger.go:130: 2022-10-30T15:35:17.269Z	INFO	m1.raft	ea2efb1261ea4b7e became follower at term 3	{"member": "m1"}
    logger.go:130: 2022-10-30T15:35:17.269Z	INFO	m1.raft	raft.node: ea2efb1261ea4b7e changed leader from 27f6db18bed0afb9 to 2a4384d1c9ba4399 at term 3	{"member": "m1"}
    logger.go:130: 2022-10-30T15:35:17.273Z	INFO	m0	starting compact hash check	{"member": "m0", "local-member-id": "2a4384d1c9ba4399", "timeout": "5.2s"}
    logger.go:130: 2022-10-30T15:35:17.273Z	ERROR	m0	failed compaction hash check	{"member": "m0", "revision": 5, "leader-compact-revision": -1, "leader-hash": 2217896924, "follower-compact-revision": -1, "follower-hash": 2552317791, "follower-peer-id": "27f6db18bed0afb9"}
    logger.go:130: 2022-10-30T15:35:17.300Z	INFO	m0	set message encoder	{"member": "m0", "from": "2a4384d1c9ba4399", "to": "ea2efb1261ea4b7e", "stream-type": "stream Message"}
    logger.go:130: 2022-10-30T15:35:17.300Z	INFO	m0	established TCP streaming connection with remote peer	{"member": "m0", "stream-writer-type": "stream Message", "local-member-id": "2a4384d1c9ba4399", "remote-peer-id": "ea2efb1261ea4b7e"}
    logger.go:130: 2022-10-30T15:35:17.301Z	INFO	m0	set message encoder	{"member": "m0", "from": "2a4384d1c9ba4399", "to": "27f6db18bed0afb9", "stream-type": "stream Message"}
    logger.go:130: 2022-10-30T15:35:17.301Z	INFO	m0	established TCP streaming connection with remote peer	{"member": "m0", "stream-writer-type": "stream Message", "local-member-id": "2a4384d1c9ba4399", "remote-peer-id": "27f6db18bed0afb9"}
    logger.go:130: 2022-10-30T15:35:17.301Z	INFO	m0	set message encoder	{"member": "m0", "from": "2a4384d1c9ba4399", "to": "27f6db18bed0afb9", "stream-type": "stream MsgApp v2"}
    logger.go:130: 2022-10-30T15:35:17.301Z	INFO	m0	established TCP streaming connection with remote peer	{"member": "m0", "stream-writer-type": "stream MsgApp v2", "local-member-id": "2a4384d1c9ba4399", "remote-peer-id": "27f6db18bed0afb9"}
    logger.go:130: 2022-10-30T15:35:17.301Z	INFO	m1	established TCP streaming connection with remote peer	{"member": "m1", "stream-reader-type": "stream MsgApp v2", "local-member-id": "ea2efb1261ea4b7e", "remote-peer-id": "2a4384d1c9ba4399"}
    logger.go:130: 2022-10-30T15:35:17.301Z	INFO	m1	established TCP streaming connection with remote peer	{"member": "m1", "stream-reader-type": "stream Message", "local-member-id": "ea2efb1261ea4b7e", "remote-peer-id": "2a4384d1c9ba4399"}
    logger.go:130: 2022-10-30T15:35:17.301Z	INFO	m2	established TCP streaming connection with remote peer	{"member": "m2", "stream-reader-type": "stream Message", "local-member-id": "27f6db18bed0afb9", "remote-peer-id": "2a4384d1c9ba4399"}
    logger.go:130: 2022-10-30T15:35:17.301Z	INFO	m2	established TCP streaming connection with remote peer	{"member": "m2", "stream-reader-type": "stream MsgApp v2", "local-member-id": "27f6db18bed0afb9", "remote-peer-id": "2a4384d1c9ba4399"}
    logger.go:130: 2022-10-30T15:35:17.301Z	INFO	m0	set message encoder	{"member": "m0", "from": "2a4384d1c9ba4399", "to": "ea2efb1261ea4b7e", "stream-type": "stream MsgApp v2"}
    logger.go:130: 2022-10-30T15:35:17.302Z	INFO	m0	established TCP streaming connection with remote peer	{"member": "m0", "stream-writer-type": "stream MsgApp v2", "local-member-id": "2a4384d1c9ba4399", "remote-peer-id": "ea2efb1261ea4b7e"}
    corrupt_test.go:173: 
        	Error Trace:	corrupt_test.go:173
        	Error:      	Not equal: 
        	            	expected: []*etcdserverpb.AlarmMember{(*etcdserverpb.AlarmMember)(0xc00110f1a0)}
        	            	actual  : []*etcdserverpb.AlarmMember{(*etcdserverpb.AlarmMember)(0xc00110f0b0)}
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -2,3 +2,3 @@
        	            	  (*etcdserverpb.AlarmMember)({
        	            	-  MemberID: (uint64) 3045423809600045977,
        	            	+  MemberID: (uint64) 2879729911077056441,
        	            	   Alarm: (etcdserverpb.AlarmType) 2,
        	Test:       	TestCompactHashCheckDetectCorruption
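
The log and diff above fit a simple model of the failure: the leader walks its peers in ascending ID order and raises the alarm against the first peer whose hash differs from its own. The sketch below is a simplified illustration of that model, not etcd's actual check. The `member` type and `firstMismatch` function are hypothetical names, the IDs and the two hashes come from the log, and the hash for m1 is an assumption (the log does not show it).

```go
package main

import (
	"fmt"
	"sort"
)

// member is a hypothetical stand-in for a cluster member and its compaction hash.
type member struct {
	ID   uint64
	Hash uint32
}

// firstMismatch models a leader-side hash check: it visits peers in ascending
// ID order and returns the ID of the first peer whose hash differs from the
// leader's. The bool is false when every peer agrees with the leader.
func firstMismatch(leader member, peers []member) (uint64, bool) {
	// Copy before sorting so the caller's slice is left untouched.
	sorted := append([]member(nil), peers...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].ID < sorted[j].ID })
	for _, p := range sorted {
		if p.Hash != leader.Hash {
			return p.ID, true
		}
	}
	return 0, false
}

func main() {
	// m0 is the corrupted member and, in the failing run, also the leader.
	leader := member{ID: 0x2a4384d1c9ba4399, Hash: 2217896924}
	peers := []member{
		{ID: 0xea2efb1261ea4b7e, Hash: 2552317791}, // m1: hash assumed equal to m2's
		{ID: 0x27f6db18bed0afb9, Hash: 2552317791}, // m2: follower-hash from the log
	}
	id, ok := firstMismatch(leader, peers)
	fmt.Printf("%x %v\n", id, ok) // prints: 27f6db18bed0afb9 true
}
```

Under this model the healthy follower 27f6db18bed0afb9 is reported, which matches the "follower-peer-id" in the ERROR log line, while the test expected the corrupted member's ID.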

Anything else we need to know?

No response

fuweid added a commit to fuweid/etcd that referenced this issue Nov 22, 2022
If the corrupted member has been elected leader, the memberID in the alarm
response won't be the corrupted one. It will be a smaller follower ID, since
raftCluster.Members always sorts by ID. We should check the leader
ID and decide which memberID to use.

Fixes: etcd-io#14823

Signed-off-by: Wei Fu <fuweid89@gmail.com>
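
One way a test could pick the expected alarm member based on the leader is sketched below. This is an illustration of the idea in the commit message, not the merged change. `expectedAlarmMemberID` is a hypothetical helper, and it assumes that a corrupted leader causes the follower with the smallest ID to be reported.

```go
package main

import "fmt"

// expectedAlarmMemberID is a hypothetical helper for a test assertion.
// If the corrupted member is not the leader, the alarm is expected to name the
// corrupted member. If it is the leader, the alarm is assumed to name the
// follower with the smallest ID. With no followers it falls back to corruptedID.
func expectedAlarmMemberID(corruptedID, leaderID uint64, memberIDs []uint64) uint64 {
	if corruptedID != leaderID {
		return corruptedID
	}
	smallest, found := uint64(0), false
	for _, id := range memberIDs {
		if id == leaderID {
			continue // skip the (corrupted) leader itself
		}
		if !found || id < smallest {
			smallest, found = id, true
		}
	}
	if !found {
		return corruptedID
	}
	return smallest
}

func main() {
	// Member IDs taken from the log in this issue.
	ids := []uint64{0x2a4384d1c9ba4399, 0xea2efb1261ea4b7e, 0x27f6db18bed0afb9}
	corrupted := uint64(0x2a4384d1c9ba4399)

	// Corrupted member is the leader, as in the flaky run.
	fmt.Printf("%x\n", expectedAlarmMemberID(corrupted, corrupted, ids)) // prints: 27f6db18bed0afb9

	// Corrupted member is a follower.
	fmt.Printf("%x\n", expectedAlarmMemberID(corrupted, 0xea2efb1261ea4b7e, ids)) // prints: 2a4384d1c9ba4399
}
```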