Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try to reconnect to Riemann if metrics upload failed. #1013

Closed
wants to merge 2 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 15 additions & 3 deletions plugins/outputs/riemann/riemann.go
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
package riemann

import (
"errors"
"fmt"
"os"
"sort"
Expand Down Expand Up @@ -33,6 +32,7 @@ func (r *Riemann) Connect() error {
c, err := raidman.Dial(r.Transport, r.URL)

if err != nil {
r.client = nil
return err
}

Expand All @@ -41,7 +41,11 @@ func (r *Riemann) Connect() error {
}

func (r *Riemann) Close() error {
if r.client == nil {
return nil
}
r.client.Close()
r.client = nil
return nil
}

Expand All @@ -58,6 +62,13 @@ func (r *Riemann) Write(metrics []telegraf.Metric) error {
return nil
}

if r.client == nil {
err := r.Connect()
if err != nil {
return fmt.Errorf("FAILED to (re)connect to Riemann. Error: %s\n", err)
}
}

var events []*raidman.Event
for _, p := range metrics {
evs := buildEvents(p, r.Separator)
Expand All @@ -68,8 +79,9 @@ func (r *Riemann) Write(metrics []telegraf.Metric) error {

var senderr = r.client.SendMulti(events)
if senderr != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in here I would say just do:

r.Close()
return fmt.Errorf("FAILED to send riemann message (will try to reconnect). Error: %s\n",
                senderr)

because r.Close will set the connection to nil, and then a connection will be attempted on the next write.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. But will Telegraf try to re-attempt to send metrics again? Or they would be dropped if send fails?

Anyway, I removed immediate reconnection attempt on send failure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the metrics will be cached, telegraf will re-attempt to send

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great, thanks for clarification.
should I squash PR into one commit, if it will be accepted for merge?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 commits is fine, I'll merge it

return errors.New(fmt.Sprintf("FAILED to send riemann message: %s\n",
senderr))
r.Close() // always retuns nil
return fmt.Errorf("FAILED to send riemann message (will try to reconnect). Error: %s\n",
senderr)
}

return nil
Expand Down