Attach metrics to ProcessExecutor instead of the simulation Coordinator #5

Closed
PetrosPapapa opened this issue Sep 22, 2018 · 2 comments

@PetrosPapapa

It would be useful to be able to track metrics for any executed workflow, not only simulated ones.

  • We can have ProcessExecutors emit events as things happen: workflow started, process called, process returned, workflow completed.
  • We can then register handlers (such as a MetricsAggregator) to record data from these events.

This would allow a real-time timeline of the workflows that are executing, as well as other visualisations and monitoring capabilities.
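
A minimal sketch of what this could look like, in Scala (the event names mirror the bullets above; everything else here is illustrative, not an agreed design):

sealed trait ExecutorEvent { def workflowId: String }
case class WorkflowStarted(workflowId: String) extends ExecutorEvent
case class ProcessCalled(workflowId: String, process: String) extends ExecutorEvent
case class ProcessReturned(workflowId: String, process: String) extends ExecutorEvent
case class WorkflowCompleted(workflowId: String) extends ExecutorEvent

trait EventHandler { def handle(event: ExecutorEvent): Unit }

// A MetricsAggregator as in the second bullet: record wall-clock
// durations per workflow from start/complete events.
class MetricsAggregator extends EventHandler {
  private val started = scala.collection.mutable.Map[String, Long]()
  val durations = scala.collection.mutable.Map[String, Long]()

  def handle(event: ExecutorEvent): Unit = event match {
    case WorkflowStarted(id)   => started(id) = System.currentTimeMillis()
    case WorkflowCompleted(id) =>
      started.remove(id).foreach(t0 => durations(id) = System.currentTimeMillis() - t0)
    case _                     => () // process-level events could be timed similarly
  }
}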

This is quite a significant refactoring that should not break the existing executors (Akka, Mongo, Kafka).

@PetrosPapapa PetrosPapapa added the feature New feature or request label Sep 22, 2018
@PetrosPapapa

This is much more complicated than it sounds. Generic workflows don't have any notion of Task or TaskResource or Workflow in them. Some implementations may just not need this!

One idea is to create a subclass of AtomicProcess (and perhaps CompositeProcess as well?) that is aware of persistent resources. The Executors will be responsible for reporting their activity to the metrics trackers.

If these are a subclass of AtomicProcess, we will preserve compatibility for execution of generic workflows. However, this will make the Executor implementation more complex and potentially slower. We definitely don't want multiple implementations of the same Executor.

We could have Executors emit more generic events that make sense in any workflow, and then introduce special handlers to deal with resource-aware processes. This might be the way to go: it keeps executors nice and modular without redundant implementations, and we can still provide some metrics for generic workflows as well.
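
For instance (a sketch of that direction; the metadata map and the "resource" key are one hypothetical mechanism, not an agreed design):

// Executors only ever emit generic events; resource-aware behaviour
// lives entirely in a handler.
sealed trait GenericEvent { def metadata: Map[String, String] }
case class ProcessCalledEvent(process: String,
                              metadata: Map[String, String] = Map.empty)
    extends GenericEvent

class ResourceMetricsHandler extends (GenericEvent => Unit) {
  def apply(e: GenericEvent): Unit =
    // Plain workflows never set this key, so the handler is a no-op for
    // them; resource-aware processes would attach it when they run.
    e.metadata.get("resource").foreach { r =>
      println(s"resource in use: $r")
    }
}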

@PetrosPapapa commented Oct 8, 2018

This naturally evolved into a much more generic problem of workflow monitoring. Here's an excerpt from the relevant team email:

In terms of how the system will be used:

I) Sometimes you run a workflow as a procedure and want to simply wait for a result. A listener will have to listen to events about a particular ID and fulfil a promise when done. You don't need persistence.

II) Sometimes you want to monitor a simulation, so everything that the executor is doing. You want to use custom listeners that capture additional metadata from the simulation. You don't need persistence, but it could help in the future.

III) Other times you want to monitor an executor over long periods of time, keep a log of all the events, but also be able to react to things happening. Here you might want both of the above types of listeners:

  • Some listeners just log what is happening. A user should be able to open a browser, connect to the executor via a listener and get a log and/or a visualisation in real time (or even back in history through the logs).
  • Other listeners capture results and perhaps make additional decisions (e.g. a business rule that runs a "maintenance" workflow after 5 "machine usage" workflows have executed).

You need persistence for all of this!
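
For instance, that business rule might look roughly like this (a sketch using the PiEvent/PiEventHandler types introduced below; the String IDs and runMaintenance are purely illustrative):

// Sketch of the "maintenance after 5 workflows" rule. Filtering to
// "machine usage" workflows specifically would need instance metadata.
class MaintenanceRule(runMaintenance: () => Unit) extends PiEventHandler[String] {
  private var completed = 0

  override def apply(e: PiEvent[String]): Unit = e match {
    case PiEventResult(_, _) =>           // a workflow produced a result
      completed += 1
      if (completed % 5 == 0) runMaintenance()
    case _ => ()                          // ignore non-terminal events
  }
}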
I was hoping to make this platform-independent. If you decide to use the AkkaExecutor for simulation, maybe Akka Streams (or simply the eventStream) is ideal. If you are using the KafkaExecutor, you might as well use Kafka Streams, which are made for this.

What I'm going for now is a pub/sub API included in the ProcessExecutor trait. Each executor can decide how to implement this and what type of stream (or simple observer pattern) internally.
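
Roughly the shape it could take (the method names here are guesses at the API, not the final design):

trait ProcessExecutor[KeyT] {
  // ... existing execution API elided ...

  /** Register a handler to receive all events from this executor. */
  def subscribe(handler: PiEventHandler[KeyT]): Unit

  /** Implementations call this internally as execution progresses,
    * backed by whatever mechanism suits them (Akka eventStream,
    * a Kafka topic, a plain observer list, ...). */
  protected def publish(event: PiEvent[KeyT]): Unit
}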

I then have events:

sealed trait PiEvent[KeyT] { ... }

(with a bunch of case classes extending this), and handlers/listeners:

trait PiEventHandler[KeyT] extends (PiEvent[KeyT] => Unit) { ... }

The user can decide how to implement a listener without worrying about the underlying infrastructure.

Here's a promise handler that can be used for scenario (I):
T is the type of ID for a PiInstance (also platform-dependent).

import scala.concurrent.Promise

class PromiseHandler[T](val id: T) extends PiEventHandler[T] {
  val promise = Promise[Any]()
  def future = promise.future

  // Only react to events for our own instance ID: complete the promise
  // on a result, fail it on any kind of failure, ignore everything else.
  override def apply(e: PiEvent[T]): Unit = if (e.id == this.id) e match {
    case PiEventResult(_, res)                 => promise.success(res)
    case PiEventFailure(_, reason)             => promise.failure(reason)
    case PiEventException(_, reason)           => promise.failure(reason)
    case PiEventProcessException(_, _, reason) => promise.failure(reason)
    case _ => () // non-terminal events (process calls, returns, ...) are ignored
  }
}
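
It could then be used along these lines for scenario (I) (assuming the subscribe method sketched above; executor and workflowId are placeholders):

import scala.concurrent.Await
import scala.concurrent.duration._

val handler = new PromiseHandler(workflowId) // workflowId: some PiInstance ID
executor.subscribe(handler)                  // registration as sketched above
val result = Await.result(handler.future, 1.minute)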

@JeVaughan raised the question of persistence for observers and the event stream. This is still under discussion.

PetrosPapapa added a commit that referenced this issue Oct 8, 2018
This is work in progress - code is broken

AkkaExecutor is now complying, but there is more to do and the tests are
broken.

See issue #5 for more information
@PetrosPapapa PetrosPapapa self-assigned this Nov 7, 2018