[Elastic Agent] Handle 429 response from the server and adjust backoff #19918

Merged on Jul 14, 2020 (3 commits)
1 change: 1 addition & 0 deletions x-pack/elastic-agent/CHANGELOG.asciidoc
@@ -86,3 +86,4 @@
- Configuration cleanup {pull}19848[19848]
- Agent now sends its own logs to elasticsearch {pull}19811[19811]
- Add --insecure option to enroll command {pull}19900[19900]
- Will retry to enroll if the server returns a 429. {pull}19918[19918]
4 changes: 2 additions & 2 deletions x-pack/elastic-agent/pkg/agent/application/fleet_gateway.go
@@ -23,8 +23,8 @@ var defaultGatewaySettings = &fleetGatewaySettings{
Duration: 1 * time.Second, // time between successful calls
Jitter: 500 * time.Millisecond, // used as a jitter for duration
Backoff: backoffSettings{ // time after a failed call
-    Init: 5 * time.Second,
-    Max: 60 * time.Second,
+    Init: 60 * time.Second,
+    Max: 10 * time.Minute,
},
}

15 changes: 15 additions & 0 deletions x-pack/elastic-agent/pkg/agent/cmd/enroll.go
@@ -12,6 +12,7 @@ import (

"github.com/spf13/cobra"

"github.com/elastic/beats/v7/libbeat/common/backoff"
c "github.com/elastic/beats/v7/libbeat/common/cli"
"github.com/elastic/beats/v7/x-pack/elastic-agent/pkg/agent/application"
"github.com/elastic/beats/v7/x-pack/elastic-agent/pkg/agent/configuration"
@@ -20,6 +21,7 @@ import (
"github.com/elastic/beats/v7/x-pack/elastic-agent/pkg/cli"
"github.com/elastic/beats/v7/x-pack/elastic-agent/pkg/config"
"github.com/elastic/beats/v7/x-pack/elastic-agent/pkg/core/logger"
"github.com/elastic/beats/v7/x-pack/elastic-agent/pkg/fleetapi"
)

var defaultDelay = 1 * time.Second
@@ -116,6 +118,19 @@ func enroll(streams *cli.IOStreams, cmd *cobra.Command, flags *globalFlags, args
}

err = c.Execute()
signal := make(chan struct{})

backExp := backoff.NewExpBackoff(signal, 60*time.Second, 10*time.Minute)

for err == fleetapi.ErrTooManyRequests {
fmt.Fprintln(streams.Out, "Too many requests on the remote server, will retry in a moment.")
Review comment: Is it possible to print the wait time here? With a 10-minute max, someone might get impatient and think the Beat is deadlocked.

backExp.Wait()
fmt.Fprintln(streams.Out, "Retrying to enroll...")
err = c.Execute()
}

close(signal)

if err != nil {
return errors.New(err, "fail to enroll")
}
7 changes: 7 additions & 0 deletions x-pack/elastic-agent/pkg/fleetapi/enroll_cmd.go
@@ -21,6 +21,9 @@ import (
// EnrollType is the type of enrollment to do with the elastic-agent.
type EnrollType string

// ErrTooManyRequests is received when the remote server is overloaded.
var ErrTooManyRequests = errors.New("too many requests received (429)")

const (
// PermanentEnroll is the default enrollment type; by default an Agent is permanently enrolled to Agent.
PermanentEnroll = EnrollType("PERMANENT")
@@ -190,6 +193,10 @@ func (e *EnrollCmd) Execute(ctx context.Context, r *EnrollRequest) (*EnrollRespo
}
defer resp.Body.Close()

if resp.StatusCode == http.StatusTooManyRequests {
return nil, ErrTooManyRequests
}

if resp.StatusCode != http.StatusOK {
return nil, extract(resp.Body)
}