Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invoke mutations to hold, release, and stop workflows #261

Closed
kinow opened this issue Sep 25, 2019 · 10 comments · Fixed by #307
Closed

Invoke mutations to hold, release, and stop workflows #261

kinow opened this issue Sep 25, 2019 · 10 comments · Fixed by #307
Labels
good first issue Good for newcomers
Milestone

Comments

@kinow
Copy link
Member

kinow commented Sep 25, 2019

Part of #135 , but smaller scope. The PR #257 is updating the toolbar, and leaving only the pause and stop buttons.

The rationale is that the initial implementation is not quite matching the scope of #135. It had some empty/disabled buttons, text, elements, but not functional. #257 will hide the not functional items.

And then this PR will hook up the click of the two buttons left behind, and stop or pause the workflow.

Having a smaller / simpler issue like this one will help us to prepare the Vue.js + ApolloClient set up, without having to go through all the CSS/styling necessary for the black panels in the sketch (#135).

High Priority, expected for Q1 2020, but the earlier we finish it the merrier.

@kinow kinow added this to the 0.2 milestone Sep 25, 2019
@kinow
Copy link
Member Author

kinow commented Sep 25, 2019

@dwsutherland has already confirmed on Riot that the backend part is ready for the mutations (double thanks!), so we just need to configure ApolloClient and check what/if we need to modify in the WorkflowService.

@dwsutherland
Copy link
Member

GraphiQL Docs are the place to learn about the Mutation layout and arguments (Not set in stone, we can change things if desired), however, here are some examples:

Mutations

-----------------------------------------------

mutation {
  stopWorkflow(
      command: "set_stop_after_clock_time",
      workflows: ["*|*"],
      args: {datetimeString: "2019-05-30T21:24:06+12:00"}) {
    result
  }
}

-----------------------------------------------

mutation{
  holdWorkflow(workflows: ["*|*"]) {
    result
  }
}
mutation{
  holdWorkflow(
      command: "hold_after_point_string",
      workflows: ["baz"],
      pointString: "20171201T0000+12") {
    result
  }
}
mutation {
  releaseWorkflow(workflows: ["baz"]){
    result
  }
}

-----------------------------------------------

mutation{
  putMessages(
      workflows: ["baz"],
      ids: ["20170101T0000+12|foo|01"],
      messages: [["CRITICAL", "HELLO, IS THERE ANYONE OUT THERE?"]]) {
    result
  }
}

-----------------------------------------------

mutation {
  taskActions(
      command: "insert_tasks",
      workflows: ["baz"],
      ids: ["20180501T0000+12|foo"]) {
    result
  }
}
mutation {
  taskActions(
      command: "trigger_tasks",
      workflows: ["baz"],
      ids: ["2017*|foo:failed"]) {
    result
  }
}

-----------------------------------------------

mutation{
  putBroadcast(
    workflows: ["baz"],
    namespaces: ["foo"],
    settings: [{environment: {DANGER: "Dr Robertson"}}]) {
    result
  }
}
mutation{
  clearBroadcast(
      workflows: ["baz"],
      namespaces: ["foo"],
      cancelSettings: [{environment: {DANGER: "Dr Robertson"}}]) {
    result
  }
}

And, of course, you can combine them into one docstring/request (although not all combinations might make sense in the internal Cylc command queue).

@kinow
Copy link
Member Author

kinow commented Oct 8, 2019

Learned a bit more about mutations in our schema while working on the PR's for subscription with the help of @dwsutherland , and also learned about the client side and mutations.

I think this will be easier to implement after #280 is merged, as then we will have the WorkflowService with WebSockets. Otherwise we will have to update that PR to fix the mutation calls, risking regressions.

@kinow kinow added the good first issue Good for newcomers label Oct 31, 2019
@kinow
Copy link
Member Author

kinow commented Nov 11, 2019

@dwsutherland today I executed my first mutation in Cylc! It was in GraphiQL, but it's still valid 😬

Not sure if I did something wrong. I was running the workflow five, and was planning on holding it, to confirm that the status "held" was returned in the next call to the GraphQL query to fetch the workflows data.

However, instead of staying put, with all task held, my workflow shut down almost immediately after the command was received in the server?

image

I thought it would work like cylc hold, which holds the workflow, but without shutting it down:

image

Then I tried again, now stopped the hub, stopped the workflow. Restarted the hub, then cylc run --no-detach five again. Executed again in GraphiQL:

mutation{
  holdWorkflow(workflows: ["five"]) {
    result
  }
}

This time it worked! 😕 ❓ Not sure if you've ever seen anything like this. I couldn't find anything in the suite logs (trying to find something in the job logs, but I have way too many logs for this workflow [EDIT: actually looks like only workflow logs are rotated, the job logs for the same task/cyclepoint are re-created, so I think I only have the last run logs])

@dwsutherland
Copy link
Member

dwsutherland commented Nov 11, 2019

@kinow - I haven't seen this behaviour before (the shutdown) with mutations, and the log showed that the hold command was queued so IDK (and queries hit the UIS)... We'd need more info to pinpoint where the issue's coming from (?)

@kinow
Copy link
Member Author

kinow commented Nov 11, 2019

@kinow - I haven't seen this behaviour before (the shutdown), and the log showed that the hold command was queued so IDK... We'd need more info to pinpoint where the issue's coming from (?)

That's what I thought too. I'll probably start working on this issue this week, and will trigger this several other times (both from GraphiQL and via JS). As I use the same workflow over and over, I'm hoping to see the same behaviour again, and then won't execute anything, and try to look at logs/DB to see if there's anything useful. Thanks!!!!

@dwsutherland
Copy link
Member

dwsutherland commented Nov 11, 2019

At a guess, it might be a ZMQ thing? mutation happened at the same time as a data-store update (which will be solved with the sync PR).

@kinow
Copy link
Member Author

kinow commented Nov 11, 2019

Hmm, that would make sense. We could have a try/except block somewhere hiding the error, or maybe we need to do something so ZMQ tells us when we have some error like this.

@dwsutherland
Copy link
Member

dwsutherland commented Nov 11, 2019

Hmm, that would make sense. We could have a try/except block somewhere hiding the error, or maybe we need to do something so ZMQ tells us when we have some error like this.

Let's just revisit after the data-store sync is in... (as it may not be an issue after that (unless multiple mutations are sent at once))

@kinow kinow mentioned this issue Nov 11, 2019
6 tasks
@kinow kinow changed the title Invoke mutations for pause and stop Invoke mutations to hold, release, and stop workflows Nov 11, 2019
@kinow
Copy link
Member Author

kinow commented Nov 11, 2019

Draft PR ready. @dwsutherland tested three mutations (hold/release/stop), passing variables. Everything worked with no issues on the first tentative 👏 👏 👏 Thanks heaps! I just need to finish testing and polishing the code a bit more, then it should be ready for review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants