Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adding reopen_duration feature #18

Merged
merged 4 commits into from
Jan 25, 2019

Conversation

raikakhodadad1
Copy link
Contributor

A new feature that was necessary for my use case while trying to keep the original functionality in place.

Added functionality so the user can set the number of days to determine whether a ticket with same labels should be re-opened or recreated. If the new value introduced is set to 0, it will keep the original functionality of re-opening the same ticket.

I have tested this over couple of months and it is working as expected in my use case.

notify.go Outdated
log.V(1).Infof(" found: %+v", issues[0])
return &issues[0], false, nil

lastIndex := len(issues) - 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see a couple of possible issues with this approach.

One, the results are currently ordered by key (see the query, on line 178 above) and I'm not sure that's ideal: it could be that issue FOO-1234 gets sorted before issue FOO-23 (not 100% sure, haven't tested, but that's not great). And it could be that there exists a matching non-resolved issue, even though it's not the last issue by key. I'd rather not open a second one.

Two, apparently Jira will only return up to 50 results by default for a Jira issue search. This means that once you get to 50+ issues for the same alert, you'd keep creating new issues every time the alert fires because the 50th result will soon become older than your reopen duration.

I've tried to navigate my way through Jira states and resolutions, but I'm getting nowhere and I'm close to banging my head against the keyboard.

So what I suggest instead is that you use ORDER BY resolutionDate DESC in the query, set options.MaxResults to 2 (just so you can log the "more than one issue matched" bit) and then grab the first search result and run with it (reopen it if necessary etc.). Ordering by resolutionDate in descending order appears to sort unresolved issues first (at least in my case) so you'll first get any unresolved issue or, failing that, the most recently resolved one.

notify.go Outdated
return r.reopen(issue.Key)

reopenDuration, err := time.ParseDuration(r.conf.ReopenDuration)
if err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check should happen at configuration load time, successfully loading the config and later failing to create issues because of a broken configuration on is bound to cause a lot of pain.

At the very least do this very check in (*ReceiverConfig) UnmarshalYAML and fail there (then remove it from here, as it's useless). Ideally I would copy-pasta Prometheus' marshaling/unmarshaling of its model.Duration (maybe without including the custom parsing, it's up to you). (The marshaling is needed because JIRAlert displays its configuration in the UI.)

@free
Copy link
Member

free commented Nov 7, 2018

Oh, and one more issue: the Travis CI build is failing because of a deprecated (due to your updating the go-jira dependency) way of authenticating with JIRA. The fix appears to be straightforward, according to the go-jira readme.

And thanks for contributing. I should have started with that.

@raikakhodadad1 raikakhodadad1 force-pushed the reopen_duration branch 2 times, most recently from 79a37f1 to f738baa Compare December 3, 2018 18:39
implementing changes recommended by owner for a new PR

implementing changes recommended by owner for a new PR
@raikakhodadad1
Copy link
Contributor Author

Included the recommendations suggested. Also made couple of minor changes in other places:

  1. line 95 in notify.go - I needed the component to be read from the template so I modified the code to allow that
  2. line 219/223 on notify.go I made a new pointer and repointed because the value of the pointer was getting lost previously and the ticket number would never print out

Copy link
Member

@free free left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for "importing" Prometheus' Duration and resolving the deprecation issue.

I do have a couple more comments though: mostly nitpicking, but also a critical one, regarding sorting of the query results.

notify.go Outdated

resolutionTime := time.Time(issue.Fields.Resolutiondate)
if resolutionTime.Add(time.Duration(*r.conf.ReopenDuration)).After(time.Now()) {
log.Infof("Issue %s for %s was resolved on %s, reopening", issue.Key, issueLabel, resolutionTime.Format(time.UnixDate))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the time.UnixDate format to be quite confusing (the example given in the time documentation is "Mon Jan _2 15:04:05 MST 2006"). I'd prefer time.RFC3339 (e.g. "2006-01-02T15:04:05Z07:00"), the fields are better ordered and it's the standard most widely used.

If you don't want to do yet another change, let me know and I'll do it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's no problem I'll change this and resubmit - totally fair about it being confusing.

notify.go Outdated
@@ -166,8 +173,8 @@ func toIssueLabel(groupLabels alertmanager.KV) string {
func (r *Receiver) search(project, issueLabel string) (*jira.Issue, bool, error) {
query := fmt.Sprintf("project=\"%s\" and labels=%q order by key", project, issueLabel)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed earlier, the results should be sorted by resolutionDate, in descending order (so that the most recently resolved ones or the unresolved ones are returned first).

Suggested change
query := fmt.Sprintf("project=\"%s\" and labels=%q order by key", project, issueLabel)
query := fmt.Sprintf("project=\"%s\" and labels=%q order by resolutionDate desc", project, issueLabel)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally had this sorted by resolutiondate per your suggestion however in a use case where two tickets would be open simultaneously and they would get resolved not in order this wouldn't work (example ticket.123 opens first, ticket.234 opens second after reopen_duration has passed but ticket.234 gets closed before ticket.123). That's why I ordered based on the key. Although it should still be ordered by desc. In my original testing it seemed that jira always returned the higher key value but testing again today that seems to not be true. I will resubmit with "order by key desc". Let me know if you still think I should order by resolution date?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see one problem with the sequence of events you are describing: it sounds to me like ticket.234 was opened before ticket.123 was closed; JIRAlert should not do that by itself; if it did, then it's a bug that needs fixing separately.

If OTOH ticket.123 was opened then closed; reopen_duration passed; then ticket.234 was opened; ticket.123 was manually reopened; ticket.234 was closed; ticket.123 was closed. At this point it is totally reasonable for JIRAlert to open the last ticket to be closed, i.e. ticket.123.

To answer your question more clearly, I still believe the ordering should be by resolutionDate desc.

@@ -208,10 +216,11 @@ func (r *Receiver) reopen(issueKey string) (bool, error) {

func (r *Receiver) create(issue *jira.Issue) (bool, error) {
log.V(1).Infof("create: issue=%v", *issue)
issue, resp, err := r.client.Issue.Create(issue)
newIssue, resp, err := r.client.Issue.Create(issue)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason for this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually yep! Sorry I didn't explain in details. So in creating new issues, the ticket number would never print to stdout when I was testing. After tracing it I found out that it was a pointer issue. The old code was trying to change the memory address that the pointer pointed to which wasn't working. So to make sure the ticket number always gets printed out I just created a new value to store the data and then repointed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. You mean the issue key/ID would never get logged at the very end of Notify(), on line 115 above. Thanks for the fix, that was my bug and I apparently never noticed it.

Copy link
Member

@free free left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bearing with me, I still think you should use the sorting I suggested. Otherwise everything looks good.

@@ -208,10 +216,11 @@ func (r *Receiver) reopen(issueKey string) (bool, error) {

func (r *Receiver) create(issue *jira.Issue) (bool, error) {
log.V(1).Infof("create: issue=%v", *issue)
issue, resp, err := r.client.Issue.Create(issue)
newIssue, resp, err := r.client.Issue.Create(issue)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. You mean the issue key/ID would never get logged at the very end of Notify(), on line 115 above. Thanks for the fix, that was my bug and I apparently never noticed it.

notify.go Outdated
@@ -166,8 +173,8 @@ func toIssueLabel(groupLabels alertmanager.KV) string {
func (r *Receiver) search(project, issueLabel string) (*jira.Issue, bool, error) {
query := fmt.Sprintf("project=\"%s\" and labels=%q order by key", project, issueLabel)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see one problem with the sequence of events you are describing: it sounds to me like ticket.234 was opened before ticket.123 was closed; JIRAlert should not do that by itself; if it did, then it's a bug that needs fixing separately.

If OTOH ticket.123 was opened then closed; reopen_duration passed; then ticket.234 was opened; ticket.123 was manually reopened; ticket.234 was closed; ticket.123 was closed. At this point it is totally reasonable for JIRAlert to open the last ticket to be closed, i.e. ticket.123.

To answer your question more clearly, I still believe the ordering should be by resolutionDate desc.

@raikakhodadad1
Copy link
Contributor Author

Sorry it took me a while to respond but changed and tested locally with ordering by resolutiondate.

@free
Copy link
Member

free commented Jan 25, 2019

There are a couple of static check failures, but they're all mine, so I get to fix them. Thanks for contributing and putting up with my review comments.

@free free merged commit 879ed63 into prometheus-community:master Jan 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants