This repository has been archived by the owner on Nov 8, 2022. It is now read-only.

Fixed tests in distributed task #1577

Merged

Conversation

IzabellaRaulin
Contributor

@IzabellaRaulin IzabellaRaulin commented Mar 28, 2017

Fixes #1576

The following failures are fixed in scheduler/distributed_task_test.go:

[image: screenshot of the failing test output; not preserved]

The following failures are fixed in pkg/schedule/windowed_schedule_medium_test.go:

* /Users/egu/.gvm/pkgsets/go1.7/snap-1.7/src/github.com/intelsdi-x/snap/pkg/schedule/windowed_schedule_medium_test.go
  Line 471:
  Expected: '1'
  Actual:   '0'
  (Should be equal)

* /Users/egu/.gvm/pkgsets/go1.7/snap-1.7/src/github.com/intelsdi-x/snap/pkg/schedule/windowed_schedule_medium_test.go
  Line 507:
  Expected: '10'
  Actual:   '7'
  (Should be equal)

* /home/travis/gopath/src/github.com/intelsdi-x/snap/pkg/schedule/windowed_schedule_medium_test.go
  Line 553:
  Expected: '1'
  Actual:   '0'
  (Should be equal)

Summary of changes:

  • Added a PluginsUnsubscribed event and a listener for it in distributed_workflow_test.go. Reacting to the incoming PluginsUnsubscribed event before asserting fixes the flaky tests in that file.

  • Added methods that handle disabling a task. The same steps were repeated in several places, so this mainly improves readability; it does not change behavior.

  • Increased the interval to make sure that the calculated stop_timestamp (the interval multiplied by the count) does not pass before the test starts. This fixes the tests in windowed_schedule_medium_test.go.

Testing done:

  • manual tests
  • medium, legacy and small tests

@intelsdi-x/snap-maintainers

@@ -761,9 +761,6 @@ func (s *scheduler) HandleGomitEvent(e gomit.Event) {
"event-namespace": e.Namespace(),
"task-id": v.TaskID,
}).Debug("event received")
// We need to unsubscribe from deps when a task has ended
task, _ := s.getTask(v.TaskID)
task.UnsubscribePlugins()
Contributor Author

To reviewers:
Now this happens before emitting the event - see scheduler/task.go

Contributor

I think it will be nice if the HandleGomitEvent function handles the unsubscription of plugins when the task is ended. This way, we are ensuring that these happen in order: change state, emit the TaskEnded event, have the event handler call UnsubscribePlugins(), which seems more logical. Do you agree?

Contributor Author

Ok, I will revert that

@@ -773,9 +770,6 @@ func (s *scheduler) HandleGomitEvent(e gomit.Event) {
"task-id": v.TaskID,
"disabled-reason": v.Why,
}).Debug("event received")
// We need to unsubscribe from deps when a task goes disabled
task, _ := s.getTask(v.TaskID)
task.UnsubscribePlugins()
Contributor Author

To reviewers:
Now this happens before emitting the event - see scheduler/task.go

Contributor

Same comment as above.
To keep it consistent, all three cases (Ended, Disabled, and Stopped) should follow the same pattern: emit the event first, then handle the unsubscription of plugins.

Contributor Author

ok, reverted
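
The ordering the reviewer asks for (change state, emit the event, let the handler unsubscribe) can be sketched as follows. This is a simplified stand-in, not Snap's scheduler: the event types, `finish`, and `handleEvent` are hypothetical names, and the emitter is reduced to a direct call.

```go
package main

import "fmt"

// Hypothetical event types, named after the three cases in the review.
type taskEnded struct{ taskID string }
type taskDisabled struct{ taskID, why string }
type taskStopped struct{ taskID string }

type task struct {
	state      string
	subscribed bool
}

type scheduler struct {
	tasks map[string]*task
	trace []string // order of steps, recorded for illustration
}

// handleEvent is the single place where plugins are unsubscribed,
// whichever of the three events arrives.
func (s *scheduler) handleEvent(e interface{}) {
	var id string
	switch v := e.(type) {
	case taskEnded:
		id = v.taskID
	case taskDisabled:
		id = v.taskID
	case taskStopped:
		id = v.taskID
	default:
		return
	}
	t, ok := s.tasks[id]
	if !ok {
		return // unknown task: nothing to unsubscribe
	}
	t.subscribed = false
	s.trace = append(s.trace, "unsubscribe")
}

// finish changes the state first and emits the event second.
func (s *scheduler) finish(id, state string, event interface{}) {
	t, ok := s.tasks[id]
	if !ok {
		return
	}
	t.state = state
	s.trace = append(s.trace, "state:"+state)
	s.trace = append(s.trace, "emit")
	s.handleEvent(event) // a real emitter would dispatch this to its handlers
}

func main() {
	s := &scheduler{tasks: map[string]*task{"t1": &task{state: "running", subscribed: true}}}
	s.finish("t1", "ended", taskEnded{taskID: "t1"})
	fmt.Println(s.trace, s.tasks["t1"].subscribed)
}
```

Keeping the unsubscription in one handler means the Ended, Disabled, and Stopped paths cannot drift apart, which appears to be the consistency the review is after.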

return ErrTaskDisabledOnFailures
}
return nil
}
Contributor Author

@IzabellaRaulin IzabellaRaulin Mar 28, 2017

To reviewers
I see the reasoning of having checkTaskFailures() to avoid duplication of the same code, but the error log loses its value because this function might be called everywhere, so "_block" does not point to the exact place. So, I changed it to the following:

if t.stopOnFailure >= 0 && consecutiveFailures >= t.stopOnFailure {
	taskLogger.WithFields(log.Fields{
		"_block":               "stream",
		"task-id":              t.id,
		"task-name":            t.name,
		"consecutive failures": consecutiveFailures,
		"error":                t.lastFailureMessage,
	}).Error(ErrTaskDisabledOnFailures)
	// disable the task
	t.disable(t.lastFailureMessage)
	return
}

Contributor

@iza, could the "_block" value be passed in as a parameter so that the code can be reused? Of course, this pattern is everywhere already, so reusing it in one or two places is probably no big deal. Maybe we can refactor to standardize Snap's logs in the future.

Contributor Author

@IzabellaRaulin IzabellaRaulin Apr 5, 2017

@candysmurf, yes, I fully agree. There is still room for some small optimization in this area,
but if we decide to do it, it should be done across all files, I suppose. So, is it OK for you as it is now?

Contributor

totally, 👍
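
For reference, the suggestion to pass "_block" in could look roughly like the sketch below. This is not what the PR merged (the thread settles on the inlined version), and everything here is an assumption: the `fields` type and `logError` callback stand in for the structured logger, and the error message text is a placeholder.

```go
package main

import (
	"errors"
	"fmt"
)

// Placeholder error; the message text is an assumption.
var errTaskDisabledOnFailures = errors.New("task disabled due to consecutive failures")

type fields map[string]interface{}

type task struct {
	id, name           string
	stopOnFailure      int
	lastFailureMessage string
	disabled           bool
	logError           func(f fields, err error) // stand-in for a structured logger
}

// checkTaskFailures takes the caller's block name so the log entry still
// points at the place that detected the failures.
func (t *task) checkTaskFailures(block string, consecutiveFailures int) error {
	if t.stopOnFailure >= 0 && consecutiveFailures >= t.stopOnFailure {
		t.logError(fields{
			"_block":               block,
			"task-id":              t.id,
			"task-name":            t.name,
			"consecutive failures": consecutiveFailures,
			"error":                t.lastFailureMessage,
		}, errTaskDisabledOnFailures)
		t.disabled = true // stand-in for t.disable(t.lastFailureMessage)
		return errTaskDisabledOnFailures
	}
	return nil
}

func main() {
	t := &task{id: "42", name: "demo", stopOnFailure: 3, lastFailureMessage: "collector timeout"}
	t.logError = func(f fields, err error) { fmt.Println(f["_block"], err) }
	fmt.Println(t.checkTaskFailures("stream", 2)) // below the threshold
	fmt.Println(t.checkTaskFailures("stream", 3)) // threshold reached
}
```

The trade-off raised in the thread still applies: a shared helper removes duplication, but every call site has to supply an accurate block name for the log to stay useful.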

@IzabellaRaulin IzabellaRaulin force-pushed the fixed_tests_in_distributed_task branch from 35e8243 to 5c719dd Compare March 30, 2017 19:22

defer t.eventEmitter.Emit(event)

// We need to unsubscribe from deps when a task has ended
t.UnsubscribePlugins()
Contributor

Can we catch errors here?

Contributor

@candysmurf candysmurf Apr 4, 2017

Do we allow an ended task to be restarted? If so, do we subscribe plugins on restart? If there is no impact, please add error handling.

Contributor Author

As above

defer t.eventEmitter.Emit(event)

// We need to unsubscribe from deps when a task has disabled
t.UnsubscribePlugins()
Contributor

Maybe we should catch the errors here?

Contributor

@candysmurf candysmurf left a comment

@IzabellaRaulin, thanks for working on this one. The fundamental question is whether this change alters the Snap task flow for subscribing and unsubscribing plugins. Otherwise, I like the change.

defer t.eventEmitter.Emit(event)

// We need to unsubscribe from deps when a task has disabled
t.UnsubscribePlugins()
Contributor

@candysmurf candysmurf Apr 4, 2017

@iza, sounds like a good change. Only concern is that the disabled task can be enabled. Do we need to subscribe it before enabling it? Could you double-check? If there is no impact, please add error handling.

Contributor Author

@IzabellaRaulin IzabellaRaulin Apr 5, 2017

Answer to

Only concern is that the disabled task can be enabled. Do we need to subscribe it before enabling it?

A disabled task can be enabled, and yes, we need to subscribe it before enabling it (this is part of the enabling procedure). To be clear, this PR does not affect that behavior.

adding error handling

There was a discussion about what should happen first: emitting the event or unsubscribing deps. The decision is to keep it as it is, which means emitting the event first and then unsubscribing in HandleGomitEvent. I will revert my changes related to this aspect.

@candysmurf
Contributor

@IzabellaRaulin, I tried to test your fix on my laptop. It didn't fix the issue. Here are the errors from the medium tests.

Failures:

  * /Users/egu/.gvm/pkgsets/go1.7/snap-1.7/src/github.com/intelsdi-x/snap/pkg/schedule/windowed_schedule_medium_test.go
  Line 471:
  Expected: '1'
  Actual:   '0'
  (Should be equal)

  * /Users/egu/.gvm/pkgsets/go1.7/snap-1.7/src/github.com/intelsdi-x/snap/pkg/schedule/windowed_schedule_medium_test.go
  Line 507:
  Expected: '10'
  Actual:   '7'
  (Should be equal)


47 total assertions

- Adds event for when plugins are unsubscribed
- Listens for unsubscription event before asserting
@IzabellaRaulin IzabellaRaulin force-pushed the fixed_tests_in_distributed_task branch 4 times, most recently from 491c517 to c96203d Compare April 5, 2017 19:34
@IzabellaRaulin
Contributor Author

Update done; please also see the updated PR description. All identified flaky tests related to the schedule are fixed.
It's ready to be re-reviewed.

cc: @intelsdi-x/snap-maintainers

Collaborator

@jcooklin jcooklin left a comment

LGTM

@jcooklin jcooklin merged commit 7a929f5 into intelsdi-x:master Apr 6, 2017
Development

Successfully merging this pull request may close these issues.

Flaky tests in scheduler/distributed_task_test.go
4 participants