
Arcaflow: The Noble Workflow Engine

[Arcaflow logo: a waterfall and a river with 3 trees symbolizing the various plugins]

Arcaflow is a highly flexible and portable workflow system that helps you build pipelines of actions via plugins. Plugin steps typically perform one action well, creating or manipulating data that is returned in a machine-readable format. Data is validated according to schemas as it passes through the pipeline so that type mismatches are diagnosed clearly and early. Arcaflow runs on your laptop, a jump host, or in a CI system, requiring only the Arcaflow engine binary, a workflow definition in YAML, and a compatible container runtime.

Complete Arcaflow Documentation



The Arcaflow Engine

The Arcaflow Engine is the core execution component for workflows. It allows you to use actions provided by containerized plugins to build pipelines of work. The Arcaflow engine can be configured to run plugins using Podman, Docker, and Kubernetes.

An ever-growing catalog of official plugins is maintained within the Arcalot organization, and the plugins are available as versioned containers from Quay.io. You can also build your own containerized plugins using the Arcaflow SDK, available for Python and Golang. We encourage you to contribute your plugins to the community, and you can start by adding them to the plugins incubator repo via a pull request.

Pre-built engine binaries

Our pre-built engine binaries are available in the releases section for multiple platforms and architectures.
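
For example, on Linux you might download and unpack a release like this (a sketch; <version> and <asset> are placeholders for the actual values shown on the releases page):

# Substitute <version> and <asset> with a real release tag and archive name
curl -LO https://github.com/arcalot/arcaflow-engine/releases/download/<version>/<asset>.tar.gz
tar -xzf <asset>.tar.gz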

Building from source

Building the engine requires at least Go 1.18:

go build -o arcaflow cmd/arcaflow/main.go

This self-contained engine binary can then be used to run Arcaflow workflows.

Running a simple workflow

A set of example workflows is available to demonstrate workflow features. A basic example workflow.yaml may look like this:

version: v0.2.0  # The compatible workflow schema version
input:  # The input schema for the workflow
  root: RootObject
  objects:
    RootObject:
      id: RootObject
      properties:
        name:
          type:
            type_id: string
steps:  # The individual steps of the workflow
  example:
    plugin:
      deployment_type: image
      src: quay.io/arcalot/arcaflow-plugin-example
    input:
      name: !expr $.input.name
outputs:  # The expected output schema and data for the workflow
  success:
    message: !expr $.steps.example.outputs.success.message

A workflow has the root keys version, input, steps, and outputs, each of which is required. Output values and step inputs can be specified using the Arcaflow expression language. Input and output references create dependencies between the workflow steps, and those dependencies determine the execution order.
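
For example, adding a second step whose input references the first step's output is enough to make it run after that step. A minimal sketch reusing the same example plugin (the step name greet_again is hypothetical):

steps:
  example:
    plugin:
      deployment_type: image
      src: quay.io/arcalot/arcaflow-plugin-example
    input:
      name: !expr $.input.name
  greet_again:  # hypothetical second step
    plugin:
      deployment_type: image
      src: quay.io/arcalot/arcaflow-plugin-example
    input:
      # Referencing the example step's output creates the dependency
      name: !expr $.steps.example.outputs.success.message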

An input YAML file for this basic workflow may look like this:

name: Arca Lot

The Arcaflow engine uses a configuration to define the standard behaviors for deploying plugins within the workflow. The default configuration uses Podman to run the plugin containers and sets the log output to the info level.
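
Expressed in the same config format shown below, those defaults would look roughly like this:

deployers:
  image:
    deployer_name: podman
log:
  level: info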

If you have Podman installed locally, you can simply run the workflow like this:

arcaflow --input input.yaml

This results in the default behavior of using the built-in configuration and reading the workflow from the workflow.yaml file in the current working directory.
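
In other words, the default invocation assumes a layout like this in the working directory:

./workflow.yaml  # picked up automatically as the default workflow file
./input.yaml     # passed explicitly via --input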

If you don't have a local Podman setup, or if you want to use another deployer or any custom configuration parameters, you can create a config.yaml with your desired parameters. For example:

deployers:
  image: 
    deployer_name: docker
log:
  level: debug
logged_outputs:
  error:
    level: debug

You can load this config by passing the --config flag to Arcaflow.

arcaflow --input input.yaml --config config.yaml

The default workflow file name is workflow.yaml, but you can override this with the --workflow parameter.

Arcaflow also accepts a --context parameter that defines the base directory for all input files. Relative file paths are resolved from the context directory; absolute paths are also supported. The default context is the current working directory (.).

A few command examples...

Use the built-in configuration and run the workflow.yaml file from the /my-workflow context directory with no input:

arcaflow --context /my-workflow

Use a custom my-config.yaml configuration file and run the my-workflow.yaml workflow using the my-input.yaml input file from the current directory:

arcaflow --config my-config.yaml --workflow my-workflow.yaml --input my-input.yaml

Use a custom config.yaml configuration file and the default workflow.yaml file from the /my-workflow context directory, and an input.yaml file from the current working directory:

arcaflow --context /my-workflow --config config.yaml --input ${PWD}/input.yaml

Deployers

Image-based deployers are used to deploy plugins to container platforms. Each deployer has configuration parameters specific to its platform. These deployers are Podman, Docker, and Kubernetes.
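
For example, switching the image deployer to Kubernetes follows the same pattern as the Docker config above (a sketch; the Kubernetes deployer also accepts platform-specific connection parameters not shown here):

deployers:
  image:
    deployer_name: kubernetes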

There is also a Python deployer that allows for running Python plugins directly rather than in containers. Note that not all Python plugins may work with the Python deployer, and any plugin dependencies must be present on the target system.
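
A config selecting it might look like this (a sketch, assuming the Python deployer is configured under its own deployment type just as the image deployers are):

deployers:
  python:
    deployer_name: python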
