Update helloworld3.md #269

Merged
merged 5 commits into from
Jan 12, 2023
59 changes: 29 additions & 30 deletions docs/codelab/beginner.md
@@ -11,8 +11,7 @@ Please feel free to follow along using any of these resources:
Let's create a simple workflow that adds Netflix Idents to videos. We'll be mocking the adding Idents part and focusing on actually executing this process flow.

!!!info "What are Netflix Idents?"
Netflix Idents are those 4 second videos with Netflix logo, which appears at the beginning and end of shows.
Learn more about them [here](https://partnerhelp.netflixstudios.com/hc/en-us/articles/1500000260302-Overview-of-the-Netflix-Idents). You might have also noticed they're different for Animation and several other genres.
Netflix Idents are those 4-second videos with the Netflix logo that appear at the beginning and end of shows. Learn more about them [here](https://partnerhelp.netflixstudios.com/hc/en-us/articles/1500000260302-Overview-of-the-Netflix-Idents). You might have also noticed they're different for Animation and several other genres.

!!!warning "Disclaimer"
Obviously, this is not how Netflix adds Idents. Those Workflows are indeed very complex. But, it should give you an idea about how Conductor can be used to implement similar features.
@@ -24,14 +23,14 @@ The workflow in this lab will look like this:
This workflow contains the following:

* Worker Task `verify_if_idents_are_added` to verify if Idents are already added.
* [Switch Task](../reference-docs/switch-task/) that takes output from the previous task, and decides whether to schedule the `add_idents` task.
* `add_idents` task which is another worker Task.
* [Switch Task](../reference-docs/switch-task/) that takes the output from the previous task and decides whether to schedule the `add_idents` task.
* `add_idents` task, which is another worker Task.

### Creating Task definitions

Let's create the [task definition](/content/docs/getting-started/concepts/tasks-and-workers#task-definitions) for `verify_if_idents_are_added` in JSON. This task will be a *SIMPLE* task which is supposed to be executed by an Idents microservice. We'll be mocking the Idents microservice part.
Let's create the [task definition](/content/docs/getting-started/concepts/tasks-and-workers#task-definitions) for `verify_if_idents_are_added` in JSON. This task will be a *SIMPLE* task that is supposed to be executed by an Idents microservice. We'll be mocking the Idents microservice part.

**Note** that at this point, we don't have to specify whether it is a System task or Worker task. We are only specifying the required configurations for the task, like number of times it should be retried, timeouts etc. We shall start by using `name` parameter for task name.
**Note** that at this point, we don't have to specify whether it is a System task or a Worker task. We are only specifying the required configurations for the task, like the number of times it should be retried, timeouts, etc. We shall start by using the `name` parameter for the task name.
```json
{
"name": "verify_if_idents_are_added"
@@ -50,7 +49,7 @@ We'd like this task to be retried 3 times on failure.
```

And to timeout after 300 seconds.
i.e. if the task doesn't finish execution within this time limit after transitioning to `IN_PROGRESS` state, the Conductor server cancels this task and schedules a new execution of this task in the queue.
That is, if the task doesn't finish execution within this time limit after transitioning to the `IN_PROGRESS` state, the Conductor server cancels it and schedules a new execution of the task in the queue.

```json
{
@@ -77,7 +76,7 @@ And a [responseTimeout](/content/docs/how-tos/Tasks/task-lifecycle/#response-tim
}
```

We can define several other fields defined [here](/content/docs/getting-started/concepts/tasks-and-workers), but this is a good place to start with.
We can define several other fields, which are listed [here](/content/docs/getting-started/concepts/tasks-and-workers), but this is a good place to start.
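
As a rough illustration of the fields discussed so far, the task definition can also be assembled programmatically. This is a minimal Python sketch, not the codelab's own tooling: `make_task_def` is a hypothetical helper, and it assumes the `retryCount`, `timeoutSeconds`, and `responseTimeoutSeconds` field names from the Conductor task definition schema. The response timeout is left for the caller to supply because its value is not shown in this excerpt.

```python
import json


def make_task_def(name, retry_count=3, timeout_seconds=300,
                  response_timeout_seconds=None):
    """Build a task definition as a plain dict (hypothetical helper)."""
    task_def = {
        "name": name,
        "retryCount": retry_count,          # retried 3 times on failure
        "timeoutSeconds": timeout_seconds,  # time out after 300 seconds
    }
    # Only include the response timeout when the caller provides one.
    if response_timeout_seconds is not None:
        task_def["responseTimeoutSeconds"] = response_timeout_seconds
    return task_def


verify_def = make_task_def("verify_if_idents_are_added")
# Assumption: add_idents reuses the same retry and timeout settings.
add_def = make_task_def("add_idents")

print(json.dumps([verify_def, add_def], indent=2))
```

Keeping the definition as a plain dict makes it easy to serialize with `json.dumps` and send to the metadata endpoint later.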

Similarly, create another task definition: `add_idents`.

@@ -96,10 +95,10 @@ Similarly, create another task definition: `add_idents`.
Send a `POST` request to `/metadata/taskdefs` endpoint to register these tasks. You can use Swagger, Postman, CURL or similar tools.

!!!info "Why is the Switch Task not registered?"
System Tasks that are part of control flow do not need to be registered. However, some system tasks where the retries, rate limiting and other mechanisms are required, like `HTTP` Task, are to be registered though.
System Tasks that are part of the control flow do not need to be registered. However, system tasks that need retries, rate limiting, and other mechanisms, such as `HTTP` tasks, must be registered.

!!! Important
Task and Workflow Definition names are unique. The names we use below might have already been registered. For this lab, add a prefix with your username, `{my_username}_verify_if_idents_are_added` for example. This is definitely not recommended for Production usage though.
Task and Workflow Definition names are unique. The names we use below might have already been registered. For this lab, add a prefix with your username, `{my_username}_verify_if_idents_are_added`, for example. This is definitely not recommended for Production usage, though.
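
The prefixing advice and the registration call can be sketched together in Python. This is only an illustration under stated assumptions: the base URL `http://localhost:8080/api` is a guess at a local Conductor server, and `prefixed` and `build_register_request` are hypothetical helpers. The request is built but not sent, so the sketch runs without a server.

```python
import json
import urllib.request

# Assumption: a locally running Conductor server; adjust for your environment.
BASE_URL = "http://localhost:8080/api"


def prefixed(username, task_name):
    """Prefix a definition name with a username to avoid name clashes in the lab."""
    return f"{username}_{task_name}"


def build_register_request(task_defs, base_url=BASE_URL):
    """Build (but do not send) a POST request to /metadata/taskdefs."""
    body = json.dumps(task_defs).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/metadata/taskdefs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


task_defs = [
    {"name": prefixed("my_username", name)}
    for name in ("verify_if_idents_are_added", "add_idents")
]
request = build_register_request(task_defs)

# Sending is left out on purpose; with a live server it would be:
# urllib.request.urlopen(request)
print(request.get_method(), request.full_url)
```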


**Example**
@@ -133,7 +132,7 @@ curl -X POST \

### Creating Workflow Definition

Creating Workflow definition is almost similar. We shall use the Task definitions created above. Note that same Task definitions can be used in multiple workflows, or for multiple times in same Workflow (that's where `taskReferenceName` is useful).
Creating a Workflow definition is similar. We shall use the Task definitions created above. Note that the same task definitions can be used in multiple workflows or multiple times in the same Workflow (that's where `taskReferenceName` is useful).
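
To make the role of `taskReferenceName` concrete, here is a small Python sketch in which one task definition is referenced twice in the same workflow. The workflow name and both reference names are hypothetical, and `check_unique_reference_names` is an illustrative helper rather than part of Conductor.

```python
def task_ref(name, reference_name, input_parameters=None):
    """Reference a registered task definition from inside a workflow."""
    return {
        "name": name,                         # the registered task definition
        "taskReferenceName": reference_name,  # must be unique within the workflow
        "inputParameters": input_parameters or {},
        "type": "SIMPLE",
    }


def check_unique_reference_names(workflow_def):
    """Raise ValueError if two tasks share a taskReferenceName."""
    refs = [task["taskReferenceName"] for task in workflow_def["tasks"]]
    duplicates = sorted({ref for ref in refs if refs.count(ref) > 1})
    if duplicates:
        raise ValueError(f"duplicate taskReferenceName values: {duplicates}")


workflow = {
    "name": "my_username_idents_workflow",  # hypothetical workflow name
    "version": 1,
    "schemaVersion": 2,
    "tasks": [
        # The same definition is used twice, told apart by its reference name.
        task_ref("verify_if_idents_are_added", "verify_before",
                 {"contentId": "${workflow.input.contentId}"}),
        task_ref("verify_if_idents_are_added", "verify_after",
                 {"contentId": "${workflow.input.contentId}"}),
    ],
}

check_unique_reference_names(workflow)
```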

A workflow without any tasks looks like this:
```json
@@ -169,20 +168,20 @@ Add the first task that this workflow has to execute. All the tasks must be adde

**Wiring Input/Outputs**

Notice how we were using `${workflow.input.contentId}` to pass inputs to this task. Conductor can wire inputs between workflow and tasks, and between tasks.
i.e The task `verify_if_idents_are_added` is wired to accept inputs from the workflow input using JSONPath expression `${workflow.input.param}`.
Notice how we were using `${workflow.input.contentId}` to pass inputs to this task. Conductor can wire inputs between workflow and tasks and between tasks.
That is, the task `verify_if_idents_are_added` is wired to accept inputs from the workflow input using a JSONPath expression of the form `${workflow.input.param}`.

Learn more about wiring inputs and outputs [here](/content/docs/getting-started/concepts/workflows).

Let's define `decisionCases` now.

>Note: in earlier versions of this tutorial, the "decision" task was used. This has been deprecated.

Checkout the Switch task structure [here](/content/docs/reference-docs/switch-task).
Check out the Switch task structure [here](/content/docs/reference-docs/switch-task).

A Switch task is specified by the `evaulatorType`, `expression` (the expression that defines the Switch) and `decisionCases` which lists all the branches of Switch task.
A Switch task is specified by the `evaluatorType`, the `expression` (the expression that defines the Switch), and `decisionCases`, which lists all the branches of the Switch task.

In this case, we'll use `"evaluatorType": "value-param"`, meaning that we'll just use the value inputted to make the decision. Alternatively, there is a `"evaluatorType": "JavaScript"` that can be used for more complicated evaluations.
In this case, we'll use `"evaluatorType": "value-param"`, meaning that we'll just use the value inputted to make the decision. Alternatively, an `"evaluatorType": "JavaScript"` can be used for more complicated evaluations.

Adding the switch task (without any decision cases):
```json
@@ -334,26 +333,26 @@ curl -X POST \
}'
```

Successful POST request should return a workflow Id, which you can use to find the execution in the UI.
A successful POST request should return a workflow Id, which you can use to find the execution in the UI.

### Conductor User Interface

Open the UI and navigate to the RUNNING tab, the Workflow should be in the state as below:
Open the UI and navigate to the RUNNING tab; the Workflow should be in the state below:

![img](img/bgnr_state_scheduled.png)

Feel free to explore the various functionalities that the UI exposes. To elaborate on a few:

* Workflow Task modals (Opens on clicking any of the tasks in the workflow), which includes task I/O, logs and task JSON.
* Task Details tab, which shows the sequence of task execution, status, start/end time, and link to worker details which executed the task.
* Input/Output tab shows workflow input and output.
* Workflow Task modals (opened by clicking any of the workflow tasks), which include task I/O, logs, and task JSON.
* The Task Details tab shows the sequence of task execution, status, start/end time, and a link to details of the worker that executed the task.
* The Input/Output tab shows workflow input and output.


### Poll for Worker task

Now that `verify_if_idents_are_added` task is in `SCHEDULED` state, it is the worker's turn to fetch the task, execute it and update Conductor with final status of the task.
Now that the `verify_if_idents_are_added` task is in the `SCHEDULED` state, it is the worker's turn to fetch the task, execute it and update Conductor with the final status of the task.

Ideally, the workers implementing the client interface would do this process, executing the tasks on real microservices. But, let's mock this part.
Ideally, the workers implementing the client interface would do this process, executing the tasks on real microservices. But let's mock this part.

Send a `GET` request to `/poll` endpoint with your task type.

@@ -373,7 +372,7 @@ We can respond to Conductor with any of the following states:
* Task has FAILED.
* Call back after seconds [Process the task at a later time].
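
The states listed above map onto the body of the task update sent back to Conductor. The following Python sketch is one way to assemble that body. It assumes the `workflowInstanceId`, `taskId`, `status`, `outputData`, `callbackAfterSeconds`, and `reasonForIncompletion` fields of Conductor's task result, and both the helper name `build_task_update` and the sample IDs are hypothetical.

```python
VALID_STATUSES = {"IN_PROGRESS", "COMPLETED", "FAILED"}


def build_task_update(polled_task, status, output=None,
                      callback_after_seconds=0, reason=None):
    """Build the task update body for a task returned by the poll call."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unsupported status: {status}")
    update = {
        "workflowInstanceId": polled_task["workflowInstanceId"],
        "taskId": polled_task["taskId"],
        "status": status,
        "outputData": output or {},
    }
    # "Call back after seconds": keep the task in progress and revisit it later.
    if status == "IN_PROGRESS" and callback_after_seconds > 0:
        update["callbackAfterSeconds"] = callback_after_seconds
    if status == "FAILED" and reason:
        update["reasonForIncompletion"] = reason
    return update


# Hypothetical IDs standing in for the values returned by the poll request.
polled = {"workflowInstanceId": "wf-123", "taskId": "task-456"}

later = build_task_update(polled, "IN_PROGRESS", callback_after_seconds=60)
failed = build_task_update(polled, "FAILED", reason="Ident service unavailable")
print(later)
print(failed)
```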

Considering our Ident Service has verified that the Ident's are not yet added to given Content Id, let's return the task status by sending the below `POST` request to `/tasks` endpoint, with payload:
Considering our Ident Service has verified that the Idents are not yet added to the given Content Id, let's return the task status by sending the following `POST` request to the `/tasks` endpoint with this payload:

```json
{
@@ -415,18 +414,18 @@ curl -X POST \
```

!!! Info "Check logs in UI"
You can find the logs we just sent by clicking the `verify_if_idents_are_added`, upon which a modal should open with `Logs` tab.
You can find the logs we just sent by clicking the `verify_if_idents_are_added` task, upon which a modal should open with the `Logs` tab.

### Why is System task executed, but Worker task is Scheduled.
### Why is the System task executed, but the Worker task is Scheduled?

You will notice that Workflow is in the state as below after sending the POST request:
You will notice that Workflow is in the state below after sending the POST request:

![img](img/bgnr_systask_state.png)

Conductor has executed `is_idents_added` all through it's lifecycle, without us polling, or returning the status of Task. If it is still unclear, `is_idents_added` is a System task, and System tasks are executed by Conductor Server.
Conductor has executed `is_idents_added` all through its lifecycle without us polling or returning the status of the task. If it is still unclear, `is_idents_added` is a System task, and System tasks are executed by Conductor Server.

But, `add_idents` is a SIMPLE task. So, the complete lifecyle of this task (Poll, Update) should be handled by a worker to continue with W\workflow execution. When Conductor has finished executing all the tasks in given flow, the workflow will reach Terminal state (COMPLETED, FAILED, TIMED_OUT etc.)
But `add_idents` is a SIMPLE task, so the complete lifecycle of this task (Poll, Update) must be handled by a worker for workflow execution to continue. When Conductor has finished executing all the tasks in a given flow, the workflow reaches a terminal state (COMPLETED, FAILED, TIMED_OUT, etc.).

## Next steps

You can play around this workflow by failing one of the Tasks, restarting or retrying the Workflow, or by tuning the number of retries, timeoutSeconds etc.
You can play around with this workflow by failing one of the tasks, restarting or retrying the Workflow, or tuning the number of retries, timeoutSeconds, etc.
22 changes: 11 additions & 11 deletions docs/codelab/helloworld3.md
@@ -4,32 +4,32 @@ We've made it to Part 3! Thanks for keeping at it! What we've covered so far:

[Hello World Part 1](./helloworld) We created the Hello World Workflow.

[Hello World Part 2](./helloworld2) We created V2 of Hello World (learning about versioning) and added a HTTP Task to query information about the user's IP address.
[Hello World Part 2](./helloworld2) We created V2 of Hello World (learning about versioning) and added an HTTP Task to query information about the user's IP address.

<p align="center"><iframe width="560" height="315" src="https://www.youtube.com/embed/R8iYQKaD-1M" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></p>

## Part 3

In Hello World Part 3, we'll introduce the [Fork](/content/docs/reference-docs/fork-task) and [Join](/content/docs/reference-docs/join-task) tasks to break our workflow into parallel tracks that run asynchronously, and then combine back into a single workflow.
In Hello World Part 3, we'll introduce the [Fork](/content/docs/reference-docs/fork-task) and [Join](/content/docs/reference-docs/join-task) tasks to break our workflow into parallel tracks that run asynchronously and then combine back into a single workflow.

## Where we stand

At the end of Part 2, our workflow appears as:
At the end of Part 2, our workflow appears as follows:


<p align="center"><img src="/content/img/codelab/hw2_workflowdiagram.png" alt="version 2 diagram" width="400" style={{paddingBottom: 40, paddingTop: 40}} /></p>


Now, these two tasks are very simple, and do not take long to run, but what if each of these workflows took several seconds to complete? The overall workflow processing time would take the sum of their execution times to complete.
Now, these two tasks are very simple and do not take long to run, but what if each of them took several seconds to complete? The overall workflow processing time would be the sum of their execution times.

Neither of these tasks are dependant on one another, and can run independently. In this section, we'll introduce the Fork & Join tasks that allows us to run independent tasks in parallel.
Neither of these tasks depends on the other, so they can run independently. In this section, we'll introduce the Fork and Join tasks, which allow us to run independent tasks in parallel.
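
The timing argument can be written out as a small calculation. The durations below are hypothetical, and the sketch ignores scheduling overhead on the Conductor server: run one after the other, the tasks take the sum of their durations; run in parallel, they take roughly as long as the slowest one.

```python
def sequential_seconds(durations):
    """Total time when tasks run one after another."""
    return sum(durations)


def parallel_seconds(durations):
    """Approximate total time when independent tasks run concurrently."""
    return max(durations)


# Hypothetical execution times, in seconds, for the two tasks.
durations = {"hello_world": 3.0, "get_IP": 4.0}

print(sequential_seconds(durations.values()))  # 7.0
print(parallel_seconds(durations.values()))    # 4.0
```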


## Fork

The Fork ann Join tasks run on the Conductor server, and thus do not require a special task definition (or any unique identifier).
The Fork and Join tasks run on the Conductor server and thus do not require a special task definition (or any unique identifier).

Each 'tine' of the fork runs independently and concurrently to the other 'tines'. Each parallel set of tasks is defined as an array attribute inside the Fork task.
Each 'tine' of the fork runs independently and concurrently with the other 'tines'. Each parallel set of tasks is defined as an array attribute inside the Fork task.

Since ```hello_world``` and ```get_IP``` are independent, we can place them in separate parallel forks in version 3 of our workflow.
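
The shape of the Fork and Join pair can be sketched in Python before looking at the full workflow JSON. This assumes the `FORK_JOIN` and `JOIN` task types with their `forkTasks` and `joinOn` attributes. The reference names (`fork_ref`, `join_ref`, `hello_world_ref`) and the `fork_join` helper are hypothetical, and the task entries are trimmed to the fields needed for the illustration.

```python
import json


def fork_join(branches, fork_ref="fork_ref", join_ref="join_ref"):
    """Build a Fork task plus a Join that waits on the last task of each branch."""
    if not branches or any(len(branch) == 0 for branch in branches):
        raise ValueError("every fork branch needs at least one task")
    fork = {
        "name": fork_ref,
        "taskReferenceName": fork_ref,
        "type": "FORK_JOIN",
        "forkTasks": branches,  # one inner array per parallel 'tine'
    }
    join = {
        "name": join_ref,
        "taskReferenceName": join_ref,
        "type": "JOIN",
        "joinOn": [branch[-1]["taskReferenceName"] for branch in branches],
    }
    return [fork, join]


hello_branch = [{"name": "hello_world_<uniqueId>",
                 "taskReferenceName": "hello_world_ref",
                 "type": "SIMPLE"}]
ip_branch = [{"name": "get_IP",
              "taskReferenceName": "get_IP",
              "type": "HTTP"}]

tasks = fork_join([hello_branch, ip_branch])
print(json.dumps(tasks, indent=2))
```

Deriving `joinOn` from the last task of each branch keeps the Join in step with the Fork when a branch grows beyond a single task.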

@@ -44,7 +44,7 @@ Changes made:

1. Version set to 3.
2. Fork Task added, and the existing ```hello_world_<uniqueId>``` and ```get_IP``` tasks are placed into arrays.
3. Join task is added, and the joinOn attributes set.
3. The join task is added, and the joinOn attributes are set.

``` json
{
@@ -123,17 +123,17 @@ When this version of the workflow is submitted, we have a new diagram showing th

## Running Version 3

We can now run the workflow version 3 with similar input. Since we didn't change the output, the response should be the same.
We can now run workflow version 3 with similar input. Since we didn't change the output, the response should be the same.

We'll leave running the workflow to the user to complete (but it is identical to part 2 if any issues arise).

## Next Steps

We've completed part 3 of the codelab.

In [Part 1](helloworld), we created a workflow using the Netflix Conductor in the Orkes Playground
In [Part 1](helloworld), we created a workflow using the Netflix Conductor in the Orkes Playground.

In [Part 2](helloworld2), we extended the workflow using versioning, and added a HTTP Task.
In [Part 2](helloworld2), we extended the workflow using versioning and added an HTTP Task.

In Part 3, we created parallel workflows using the FORK task.
