
corruption alarm #8420

Merged 11 commits on Aug 22, 2017

Conversation

heyitsanthony (Contributor)

Adds gRPC to peers to expose the HashKV RPC.

Fixes #7125

gyuho (Contributor) commented Aug 18, 2017

Would this cover #8313 as well?

heyitsanthony (Contributor, Author)

@gyuho no, but #8313 would depend on it. This only does periodic checks by the leader.
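For orientation, here is a minimal sketch of the shape of such a leader-driven periodic check. This is not the PR's actual code; all names (`peerHasher`, `monitorKVHash`, the callback parameters) are hypothetical stand-ins:

```go
// Package corruptcheck sketches a leader-driven periodic hash check.
package corruptcheck

import "time"

// peerHasher is a hypothetical stand-in for a peer that can serve the
// HashKV RPC this PR adds.
type peerHasher interface {
	HashKV(rev int64) (uint32, error)
}

// monitorKVHash hashes the local store, asks each peer for its hash at
// the same revision, and raises a CORRUPT alarm on any mismatch.
func monitorKVHash(
	interval time.Duration,
	isLeader func() bool,
	localHash func() (hash uint32, rev int64),
	peers func() []peerHasher,
	raiseCorruptAlarm func(),
) {
	for {
		time.Sleep(interval)
		if !isLeader() {
			continue // only the leader runs the periodic check
		}
		h, rev := localHash()
		for _, p := range peers() {
			ph, err := p.HashKV(rev)
			if err != nil {
				continue // unreachable peers are skipped, not flagged
			}
			if ph != h {
				raiseCorruptAlarm()
			}
		}
	}
}
```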

xiang90 (Contributor) commented Aug 18, 2017

@heyitsanthony We should probably put this behind an alpha feature flag initially?

@@ -577,9 +577,11 @@ func (a *applierV3backend) Alarm(ar *pb.AlarmRequest) (*pb.AlarmResponse, error)
break
}

plog.Warningf("alarm %v raised by peer %+v", m.Alarm, types.ID(m.MemberID))

Should this just be %s, because types.ID implements the String method?
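For reference, a minimal illustration of why %s suffices when the value implements fmt.Stringer (the ID type here mimics types.ID; the real one formats as %016x):

```go
package main

import "fmt"

// ID mimics types.ID: an integer with a String method.
type ID uint64

func (i ID) String() string { return fmt.Sprintf("%016x", uint64(i)) }

func main() {
	id := ID(0xdeadbeef)
	// Both verbs go through String() because ID implements
	// fmt.Stringer, so the plain %s verb is the idiomatic choice.
	fmt.Printf("%s\n", id)  // 00000000deadbeef
	fmt.Printf("%+v\n", id) // 00000000deadbeef
}
```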

for _, c := range cli.Endpoints() {
ctx, cancel := context.WithTimeout(context.Background(), s.Cfg.ReqTimeout())
resp, herr := cli.HashKV(ctx, c, rev)
fmt.Printf("got %+v %v\n", resp, err)
gyuho (Contributor) commented Aug 18, 2017

Log instead of print? Or just delete?
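One way the suggestion could look, rewriting the loop quoted above to reuse the plog logger already present in this diff (a sketch, not the committed fix; it also switches the print to the herr variable the loop actually declares and releases the timeout context):

```go
for _, c := range cli.Endpoints() {
	ctx, cancel := context.WithTimeout(context.Background(), s.Cfg.ReqTimeout())
	resp, herr := cli.HashKV(ctx, c, rev)
	cancel() // release the timeout context promptly
	if herr != nil {
		plog.Warningf("could not hashkv endpoint %s (%v)", c, herr)
		continue
	}
	plog.Debugf("hashkv from %s: %+v", c, resp)
}
```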

heyitsanthony force-pushed the corrupt-alarm branch 11 times, most recently from 13588d7 to b0dfe76 on August 20, 2017 08:07
heyitsanthony (Contributor, Author)

Modest thrashing on this one. Putting cmux on the peer port in the integration package caused trouble: connections hung in the matcher phase on shutdown. I vendored mainline cmux, since it has a read timeout that eventually cancels the read.
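For context, this is roughly how mainline cmux's read timeout gets wired up. A sketch of the general pattern only, not etcd's actual peer-listener code; the serve helpers are hypothetical:

```go
package main

import (
	"net"
	"time"

	"github.com/soheilhy/cmux"
)

func main() {
	l, err := net.Listen("tcp", "127.0.0.1:2380")
	if err != nil {
		panic(err)
	}
	m := cmux.New(l)
	// The read timeout is the piece that matters here: a connection
	// that never sends bytes would otherwise sit in the matcher
	// phase forever and hang shutdown.
	m.SetReadTimeout(5 * time.Second)
	grpcL := m.Match(cmux.HTTP2HeaderField("content-type", "application/grpc"))
	restL := m.Match(cmux.Any())
	go serveGRPC(grpcL) // hypothetical servers for each matched listener
	go serveHTTP(restL)
	if err := m.Serve(); err != nil {
		panic(err)
	}
}

func serveGRPC(l net.Listener) { /* hand l to a grpc.Server */ }
func serveHTTP(l net.Listener) { /* hand l to an http.Server */ }
```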

@@ -216,6 +216,9 @@ func newConfig() *config {
// auth
fs.StringVar(&cfg.AuthToken, "auth-token", cfg.AuthToken, "Specify auth token specific options.")

// experimental
fs.DurationVar(&cfg.ExperimentalCorruptCheckTime, "experimental-corrupt-check-time", cfg.ExperimentalCorruptCheckTime, "Duration of time between cluster corruption check passes.")
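For illustration, this is how a duration-valued opt-in flag like the one above typically gates a periodic feature: a zero duration leaves it off. A standalone sketch, not etcd's actual wiring:

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

func main() {
	corruptCheckTime := flag.Duration("experimental-corrupt-check-time", 0,
		"Duration of time between cluster corruption check passes.")
	flag.Parse()

	// A zero duration leaves the experimental check disabled, so
	// nothing changes for deployments that do not opt in.
	if *corruptCheckTime <= 0 {
		fmt.Println("corruption check disabled")
		return
	}
	t := time.NewTicker(*corruptCheckTime)
	defer t.Stop()
	for range t.C {
		fmt.Println("corruption check pass") // real code would hash and compare here
	}
}
```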
gyuho (Contributor) left a comment:

lgtm. thanks.
