Add "shell script" processor #32

JeanMertz · 2019-07-12T17:35:55Z

We currently have a Shell Command processor that is easy to use for any simple shell command you want to run.

Its simplicity comes with limits in its capabilities though.

For example, you can only run a single command with zero or more arguments:

{
  "shellCommand": {
    "command": "echo",
    "arguments": ["hello", "world"]
  }
}

But you cannot pipe data from one command to the next without using two processors:

{
  "shellCommand": {
    "command": "echo",
    // can't pipe output of `echo` to `grep`
    "arguments": ["hello", "world", "|", "grep", "world"]
  }
}
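A minimal Rust sketch of why the second example cannot work, assuming the processor hands its arguments straight to the process without a shell in between (which is how `std::process::Command` behaves). The helper name `run_without_shell` is hypothetical and not part of Automaat:

```rust
use std::process::Command;

/// Runs `command` with `arguments` directly, without a shell in between.
/// Hypothetical helper for illustration, not Automaat's implementation.
fn run_without_shell(command: &str, arguments: &[&str]) -> std::io::Result<String> {
    // Every argument is passed verbatim to the process; nothing interprets `|`.
    let output = Command::new(command).args(arguments).output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

Calling `run_without_shell("echo", &["hello", "world", "|", "grep", "world"])` returns the literal text `hello world | grep world` followed by a newline, because `|` reaches `echo` as a plain argument instead of starting a pipeline.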

The simplicity of this processor makes it a valuable tool for building simple tasks, but its limits prevent (or at least make harder than needed) the construction of more powerful tasks that require shell access.

On top of that, even though having access to #23 is great for building tasks, templating support in this processor is limited, because each template's scope is restricted to a single processor configuration value string. In the first example, echo, hello and world are each their own template scope (see also 858d44d).

By introducing a new Shell Script processor, we can keep the simplicity of the Shell Command processor and add a more flexible "free form" processor that supports more extensive use of the templating support in Automaat.

The processor would be defined like this (mostly a copy/paste of the shell command processor, with a few changes):

pub struct ShellScript {
    /// The contents of the shell script to execute.
    pub script: String,

    /// The _current working directory_ in which the script is executed.
    ///
    /// This allows you to move to a child path within the [`Context`]
    /// workspace.
    ///
    /// If set to `None`, the root of the workspace is used as the default.
    ///
    /// [`Context`]: automaat_core::Context
    pub cwd: Option<String>,
}
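As a rough sketch of how such a processor could execute its script, the code below writes the script to a file in the workspace, marks it executable, and runs it so that the kernel honors the shebang. This is an assumption about one possible implementation, not the final design: the `run` method, the `.automaat-script` file name, and the requirement that `workspace` is an absolute path are all hypothetical, and the code is Unix-only.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;
use std::process::{Command, Output};

/// Mirrors the struct proposed above.
pub struct ShellScript {
    pub script: String,
    pub cwd: Option<String>,
}

impl ShellScript {
    /// Runs the script inside `workspace` (assumed to be an absolute path).
    /// Hypothetical method, shown for illustration only.
    pub fn run(&self, workspace: &Path) -> io::Result<Output> {
        // Resolve the working directory relative to the workspace root.
        let dir = match &self.cwd {
            Some(cwd) => workspace.join(cwd),
            None => workspace.to_path_buf(),
        };

        // Write the script to disk and make it executable for the owner only.
        let path = workspace.join(".automaat-script"); // hypothetical file name
        fs::write(&path, &self.script)?;
        fs::set_permissions(&path, fs::Permissions::from_mode(0o700))?;

        // Execute the file directly, so the shebang selects the interpreter.
        let output = Command::new(&path).current_dir(&dir).output();

        // Best-effort cleanup; the result of the run takes precedence.
        let _ = fs::remove_file(&path);
        output
    }
}
```

Because the whole script is handed to an interpreter, pipes such as `echo 'hello world' | grep world` work inside `script` without a second processor. The sketch does not check that `cwd` stays inside the workspace, which a real implementation would likely want to do.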

You can then define your processor like this:

{
  "shellScript": {
    "script": "#!/bin/sh \n echo 'hello world'",
    "cwd": "path/to/data"
  }
}

The downside to this is that multi-line strings in JSON templates aren't great to work with.

I don't think it makes sense to help with this in the processor itself, but there are ways around it, for example by defining your processor configuration in YAML and converting it to JSON before submitting it to Automaat through its API:

---
shellScript:
  cwd: path/to/data
  script: |
    #!/bin/sh
    echo "hello world"

$ yq read hello-world.yml --tojson | jq

{
  "shellScript": {
    "cwd": "path/to/data",
    "script": "#!/bin/sh\necho \"hello world\"\n"
  }
}
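To make the escaping in the converted output concrete, here is a small hand-rolled sketch of how a multi-line script becomes a single JSON string. The function `json_string` is hypothetical and written only for illustration; in practice a JSON library would do this for you.

```rust
/// Encodes `input` as a JSON string literal, including the surrounding quotes.
/// Hypothetical helper for illustration only.
fn json_string(input: &str) -> String {
    let mut out = String::with_capacity(input.len() + 2);
    out.push('"');
    for c in input.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters are escaped as \u00XX.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}
```

Feeding it the two-line script from the YAML example yields the same `"#!/bin/sh\necho \"hello world\"\n"` value shown in the JSON output, which is exactly the form that is tedious to write by hand.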

Some open questions:

Do we need cwd to be configurable? We have it for the shell command processor, but there you are using existing commands that might not work unless you run them in a specific location. Here you are building your own script, so the script can just cd into the appropriate folder. If it does turn out to be useful, we can always add it later as a non-breaking change.
Do we need to validate that a shebang is provided? Or do we want a separate configuration field with an explicit script type, such as sh, bash, or ruby, that sets the shebang for you? That feels more restrictive, but also requires less work to validate that the script works.
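To compare the two options in the shebang question, here is a minimal sketch of both. Everything in it is hypothetical: the `ScriptType` enum, the interpreter paths it maps to, and the `shebang_interpreter` helper are assumptions for discussion, not an agreed design.

```rust
/// Option A: an explicit script type that determines the shebang.
/// The variants and interpreter paths are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ScriptType {
    Sh,
    Bash,
    Ruby,
}

impl ScriptType {
    /// The shebang line that would be prepended to the script body.
    pub fn shebang(self) -> &'static str {
        match self {
            ScriptType::Sh => "#!/bin/sh",
            ScriptType::Bash => "#!/usr/bin/env bash",
            ScriptType::Ruby => "#!/usr/bin/env ruby",
        }
    }
}

/// Option B: keep the script free-form and validate its first line.
/// Returns the interpreter part of the shebang, or `None` if it is missing.
pub fn shebang_interpreter(script: &str) -> Option<&str> {
    let first_line = script.lines().next()?;
    let rest = first_line.strip_prefix("#!")?.trim();
    if rest.is_empty() {
        None
    } else {
        Some(rest)
    }
}
```

Option A restricts users to the listed interpreters but makes an invalid configuration unrepresentable. Option B accepts any interpreter and only rejects scripts where `shebang_interpreter` returns `None`; it does not verify that the interpreter exists on the host.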

JeanMertz modified the milestone: v1.0.0 Jul 12, 2019
