This document describes the structure of a Flywheel Gear.
A Flywheel gear is a tar file (.tar) of a container; the container must include a specific directory that contains two special files.
This tar file can be created from most common container tools (e.g., Docker).
The only requirement for the underlying container is that it must be a *nix system that provides a bash shell on the path.
To be a Flywheel gear, the container in the tar file must include a folder named /flywheel/v0.
All following references to folders are relative to this folder.
The /flywheel/v0 folder contains two specific files:
- manifest.json: Describes critical properties of how the gear computes.
- run: (Optional) Describes how to execute the algorithm in the gear. Alternately, use the manifest command key.
The contents of these files are described here.
Here's an example manifest.json that specifies a Flywheel gear which reads one dicom file as input and specifies one configuration parameter. The keys listed in this example are all required, unless marked otherwise.
For other restrictions and required fields, you can view our manifest schema.
This document is a JSON schema, which allows for automatic validation of structured documents.
Note that the // comments shown below are not JSON syntax and cannot be included in a real manifest file.
{
  // Computer-friendly name; unique for your organization
  "name": "example-gear",
  // Human-friendly name; displayed in user interface
  "label": "Example Gear",
  // A brief description of the gear's purpose; ideally 1-4 sentences
  "description": "A gear that performs a task.",
  // Human-friendly version identifier
  "version": "1.0",
  // The author of this gear or algorithm.
  "author": "Flywheel",
  // (Optional) the maintainer, which may be distinct from the algorithm author.
  // Can be the same as the author field if both roles were filled by the same individual.
  "maintainer": "Nathaniel Kofalt",
  // (Optional) Any citations you wish to add.
  "cite": "",
  // Must be an OSI-approved SPDX license string or 'Other'. Ref: https://spdx.org/licenses
  "license": "Apache-2.0",
  // Where to go to learn more about the gear. You can leave this blank.
  "url": "http://example.example",
  // Where to go for the source code, if applicable. You can leave this blank.
  "source": "http://example.example/code",
  // (Optional) Environment variables to set for the gear.
  "environment": {},
  // (Optional) Command to execute. Run inside a bash shell.
  "command": "python script.py",
  // Options that the gear can use
  "config": {
    // A name for this option to show in the user interface
    "speed": {
      "type": "integer",
      // (Optional) json-schema syntax to provide further guidance
      "minimum": 0,
      "maximum": 3,
      "description": "How fast do you want the gear to run? Choose 0-3."
    },
    "coordinates": {
      "type": "array",
      "items": {
        "type": "number"
      },
      "minItems": 3,
      "maxItems": 3,
      "description": "A set of 3D coordinates."
    }
  },
  // Inputs that the gear consumes
  "inputs": {
    // A label - describes one of the inputs. Used by the user interface and by the run script.
    "dicom": {
      // Specifies that the input is a single file. For now, it's the only type.
      "base": "file",
      // (Optional) json-schema syntax to provide further guidance
      "type": { "enum": [ "dicom" ] },
      "description": "Any dicom file."
    },
    // A contextual key-value, provided by the environment. Used for values that are generally the same for an entire project.
    // Not guaranteed to be found - the gear should decide if it can continue to run, or exit with an error.
    "matlab_license_code": {
      "base": "context"
    },
    // An API key, specific to this job, with the same access as the user that launched the gear.
    // Useful for aggregations, integrating with an external system, data analysis, or other automated tasks.
    "key": {
      "base": "api-key",
      // (Optional) request that the API key only be allowed read access.
      "read-only": true
    }
  },
  // Capabilities the gear requires. Not necessary unless you need a specific feature.
  "capabilities": [
    "networking"
  ]
}
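Because the // comments are not valid JSON, an annotated manifest like the one above has to be stripped before it can be parsed. The following is a minimal sketch, assuming comments always occupy a whole line as they do in the example. The helper name load_annotated_manifest is hypothetical and not part of any Flywheel tool.

```python
import json


def load_annotated_manifest(text):
    """Parse manifest text after dropping whole-line // comments.

    Only lines whose first non-blank characters are // are removed, so
    values such as "http://example.example" are left untouched.
    """
    kept = [line for line in text.splitlines()
            if not line.lstrip().startswith("//")]
    return json.loads("\n".join(kept))


annotated = """
{
  // Computer-friendly name; unique for your organization
  "name": "example-gear",
  // Where to go to learn more about the gear.
  "url": "http://example.example"
}
"""
manifest = load_annotated_manifest(annotated)
print(manifest["name"])
```

Matching on the start of the line, rather than on any occurrence of //, keeps URLs inside string values intact.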
Each key of inputs specifies an input to the gear.
At this point, the inputs are always files and the "base": "file" is part of the specification.
Further constraints are an advanced feature, so feel free to leave them off until you want to pursue them. When present, they are used to hint to the user which files are probably the right choice. In the example above, we add a constraint describing the type of file. File types will be matched against our file data model.
The example has named one input, called "dicom", and requests that the file's type be dicom.
Each key of config specifies a configuration option.
Like the inputs, you can add JSON schema constraints as desired. Please specify a type on each key. Please only use non-object types: string, integer, number, boolean.
The example has named one config option, called "speed", which must be an integer between zero and three, and another called "coordinates", which must be a set of three floats.
In some cases, a configuration option may not have a safe default, and it sometimes makes sense to omit it entirely. If that is the case, specify "optional": true on that config key.
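To see how these constraints behave, it can help to check a set of values locally before building a gear. The sketch below is a hand-rolled illustration, not the validation Flywheel performs. It handles only the type, minimum, maximum, and optional keys discussed here, and it assumes that a key carrying a JSON schema default may be omitted. The function name check_config and the "comment" option are hypothetical.

```python
def check_config(spec, values):
    """Return a list of problems found when checking values against spec.

    spec is the manifest's "config" object; values is what the user supplied.
    Only "type", "minimum", "maximum", "optional" and "default" are handled.
    """
    python_types = {
        "string": str,
        "integer": int,
        "number": (int, float),
        "boolean": bool,
    }
    problems = []
    for name, rules in spec.items():
        if name not in values:
            # Assumption: a key may be absent if it is optional or has a default.
            if not rules.get("optional", False) and "default" not in rules:
                problems.append(f"{name}: missing")
            continue
        value = values[name]
        expected = python_types.get(rules.get("type"))
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_misuse = isinstance(value, bool) and rules.get("type") != "boolean"
        if expected and (not isinstance(value, expected) or bool_misuse):
            problems.append(f"{name}: expected {rules['type']}")
            continue
        if "minimum" in rules and value < rules["minimum"]:
            problems.append(f"{name}: below minimum {rules['minimum']}")
        if "maximum" in rules and value > rules["maximum"]:
            problems.append(f"{name}: above maximum {rules['maximum']}")
    return problems


spec = {
    "speed": {"type": "integer", "minimum": 0, "maximum": 3},
    "comment": {"type": "string", "optional": True},
}
print(check_config(spec, {"speed": 2}))
print(check_config(spec, {"speed": 7}))
```

A real gear would more likely rely on a JSON schema validator; the point here is only that an optional key is allowed to be absent while every other key must be present and in range.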
Context inputs are values that are generally provided by the environment, rather than the human or process running the gear. These are generally values that are incidentally required rather than directly a part of the algorithm - for example, a license key.
It is up to the gear executor to decide how (and if) context is provided. In the Flywheel system, the values can be provided by setting a context.key-name value on a container's metadata. For example, you could set context.matlab_license_code: "AEX" on the project, and then any gear running in that project with a context input called matlab_license_code would receive the value.
Unlike a gear's config values, contexts are not guaranteed to exist or have a specific type or format. It is up to the gear to decide if it can continue, or exit with an error, when a context value does not match what the gear expects. In the example config file below, note that the found key can be checked to determine if a value was provided by the environment.
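The found check can be sketched in a few lines of Python. This is an illustration only: the helper name context_value and the site_token input are hypothetical, and the shape of an entry that was not found (found set to false, with no value key) is an assumption.

```python
def context_value(config, name, default=None):
    """Return the value of a context input, or default when it was not found."""
    entry = config.get("inputs", {}).get(name, {})
    if entry.get("base") == "context" and entry.get("found"):
        return entry.get("value")
    return default


config = {
    "inputs": {
        "matlab_license_code": {"base": "context", "found": True, "value": "ABC"},
        # Hypothetical input whose value the environment did not provide.
        "site_token": {"base": "context", "found": False},
    }
}
print(context_value(config, "matlab_license_code"))
print(context_value(config, "site_token", default="unset"))
```

Returning a default rather than raising leaves the decision to continue or exit with the gear, as described above.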
Because context values are not namespaced, it is suggested that you use a specific and descriptive name. The matlab_license_code example is a good, self-explanatory key that many gears could likely reuse.
It is possible to interact with the Flywheel data hierarchy using our python, matlab, and (slated for an overhaul) golang SDKs.
Generally, you will want to work out a script that you like, using your normal user API key, before turning it into a gear. To turn it into a gear, specify an api-key type input as shown in the example above, and use that value in your gear script.
The key provided will be a special key that has the same access as the running user (not necessarily the gear author), and only works while the job is running. After the job completes, the key is retired. The key has write access by default, but you can make it read-only by adding "read-only": true to the manifest as shown above.
When a gear is executed, an input folder will be created relative to the base folder. If a gear has anything previously existing in the input folder it will be removed at launch time.
In this example, the input is called "dicom", and so will be in a folder inside input called dicom.
The full path would be, for example: /flywheel/v0/input/dicom/my-data.dcm.
Inside the /flywheel/v0 folder a config.json file will be provided with any settings the user has provided, and information about provided files. For example, if your gear uses the example manifest above, you'd get a file like the following:
{
  "config": {
    "speed": 2,
    "coordinates": [1, 2, 3]
  },
  "inputs": {
    "dicom": {
      "base": "file",
      "hierarchy": {
        "type": "acquisition",
        "id": "5988d38b3b49ee001bde0853"
      },
      "location": {
        "path": "/flywheel/v0/input/dicom/example.dcm",
        "name": "example.dcm"
      },
      "object": {
        "info": {},
        "mimetype": "application/octet-stream",
        "tags": [],
        "measurements": [],
        "type": "dicom",
        "modality": null,
        "size": 2913379
      }
    },
    "matlab_license_code": {
      "base": "context",
      "found": true,
      "value": "ABC"
    }
  }
}
Each configuration key will have been checked server-side against any constraints you specified, so you can be assured that your gear will be provided valid values.
The inputs key will hold useful information about the files. For example, you can use inputs.dicom.location.path to get the full path to the provided file. Also provided will be the location of the input in the hierarchy (if applicable) and any scientific information and metadata known at the time of the job creation. This inputs key will currently only be present when running on the Flywheel system.
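Reading these values from a Python run script could look like the sketch below. The helper names load_gear_config and input_path are hypothetical, and the demonstration writes a trimmed copy of the example config.json into a temporary folder that stands in for /flywheel/v0.

```python
import json
import tempfile
from pathlib import Path


def load_gear_config(base_dir="/flywheel/v0"):
    """Load config.json from the gear's base folder."""
    with open(Path(base_dir) / "config.json") as handle:
        return json.load(handle)


def input_path(config, name):
    """Return the full path of a file input, or None if it is absent.

    The inputs key may be missing when running outside the Flywheel system.
    """
    entry = config.get("inputs", {}).get(name)
    if entry is None or entry.get("base") != "file":
        return None
    return entry["location"]["path"]


# Demonstration against a throwaway folder standing in for /flywheel/v0.
sample = {
    "config": {"speed": 2, "coordinates": [1, 2, 3]},
    "inputs": {
        "dicom": {
            "base": "file",
            "location": {
                "path": "/flywheel/v0/input/dicom/example.dcm",
                "name": "example.dcm",
            },
        }
    },
}
with tempfile.TemporaryDirectory() as base:
    (Path(base) / "config.json").write_text(json.dumps(sample))
    config = load_gear_config(base)

speed = config["config"]["speed"]
dicom = input_path(config, "dicom")
print(speed, dicom)
```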
When a gear is executed, an output folder will be created relative to the base folder. If a gear has anything previously existing in the output folder it will be removed at launch time.
The gear should place any files that it wants saved into the output folder - and only those files.
Anything in the output folder when the gear is complete will be saved as a result.
If you don’t want results saved, it’s okay to leave this folder empty.
Do not remove the output folder.
Optionally, a gear can provide metadata about the files it produces. This is communicated by creating a .metadata.json file in the output folder.
{
  "acquisition": {
    "files": [
      {
        "name": "example.nii.gz",
        "type": "nifti",
        "instrument": "mri",
        "metadata": {
          "value1": "foo",
          "value2": "bar"
        }
      }
    ]
  }
}
If you are familiar with JSON schema you can look at our metadata schema here and our related file schema here. In this example, the file example.nii.gz (which must exist in the output folder) is specified as being a nifti file from an MRI machine, with a few custom key/value pairs.
If you are curious about the typical file types, you can find a list of them here. You can also set metadata on the acquisition itself or its parent containers, though these features are less-used; see the metadata schema for more details.
As you might expect, gears cannot produce "normal" files called .metadata.json, and an improperly formed .metadata.json file might cause the gear to fail.
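One way to produce this file from a Python run script is sketched below. The helper name write_file_metadata is hypothetical. It refuses to describe a file that is not actually in the output folder, mirroring the requirement stated above, and the demonstration uses a temporary folder in place of /flywheel/v0/output.

```python
import json
import tempfile
from pathlib import Path


def write_file_metadata(output_dir, files):
    """Write .metadata.json describing files already placed in output_dir.

    files is a list of dicts shaped like the "files" entries in the example.
    Raises FileNotFoundError if a named file is not in the output folder.
    """
    output_dir = Path(output_dir)
    for entry in files:
        if not (output_dir / entry["name"]).is_file():
            raise FileNotFoundError(entry["name"])
    target = output_dir / ".metadata.json"
    # json.dumps guarantees the file is well-formed JSON.
    target.write_text(json.dumps({"acquisition": {"files": files}}, indent=2))
    return target


with tempfile.TemporaryDirectory() as out:
    (Path(out) / "example.nii.gz").write_bytes(b"")
    path = write_file_metadata(out, [{
        "name": "example.nii.gz",
        "type": "nifti",
        "instrument": "mri",
        "metadata": {"value1": "foo", "value2": "bar"},
    }])
    written = json.loads(path.read_text())
print(written["acquisition"]["files"][0]["name"])
```

Building the structure as a dictionary and serializing it once avoids hand-written JSON, which is the most likely source of an improperly formed file.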
By default, the gear is invoked by running /flywheel/v0/run. The file must be executable (chmod +x run). It can be a bash or a Python script, or any other executable. If it is a script, please make sure to include an appropriate shebang (#!) as the leading line.
You can change this by setting the command key of the manifest.
Your run script is the only entry point used for the gear and must accomplish everything the gear sets out to do. On success, exit zero. If the algorithm encounters a failure, exit non-zero. Ideally, print something out before exiting non-zero, so that your users can check the logs to see why things did not work.
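The exit-code convention can be sketched as a Python run script. This is an outline under stated assumptions: the algorithm callables are placeholders, and the final lines print the exit codes for illustration where a real run script would instead pass the result to sys.exit().

```python
#!/usr/bin/env python3
import sys


def main(algorithm):
    """Run the gear's work and translate the outcome into an exit code."""
    try:
        result = algorithm()
    except Exception as error:
        # Print the reason so users can find it in the job logs.
        print(f"Gear failed: {error}", file=sys.stderr)
        return 1
    print(f"Gear finished: {result}")
    return 0


def succeeding():
    """Placeholder for an algorithm that completes normally."""
    return "done"


def failing():
    """Placeholder for an algorithm that hits an error."""
    raise ValueError("input file is not a DICOM")


# A real run script would end with: sys.exit(main(succeeding))
print("exit code:", main(succeeding))
print("exit code:", main(failing))
```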
An important consideration is the environment where the run command is executed.
The command will be executed in the folder containing it (/flywheel/v0), and with no environment variables save the PATH:
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Any required environment and path variables should be specified within the script. This can be important if, for example, you're producing a gear from a Dockerfile, as the variables there will not transfer over. We typically specify the path and environment variables in a .bashrc file, and source that file at the beginning of the run script.
The file is also executed with no arguments. You must specify the inputs to the executables in the run script, such as the input file names or flags.
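For a run script written in Python rather than bash, the same idea applies: build the environment and the argument list explicitly before launching the underlying executable. The sketch below uses a child Python process as a stand-in for a real algorithm. The names gear_environment, run_tool, and EXAMPLE_HOME are hypothetical.

```python
import os
import subprocess
import sys


def gear_environment(extra):
    """Copy the current (nearly empty) environment and add what is needed."""
    env = dict(os.environ)
    env.update(extra)
    return env


def run_tool(command, extra_env):
    """Run command with explicit arguments and environment.

    Returns the exit code and captured standard output.
    """
    completed = subprocess.run(command, env=gear_environment(extra_env),
                               capture_output=True, text=True)
    return completed.returncode, completed.stdout


# Stand-in for a real algorithm: a child process that echoes the one
# argument and the one environment variable it was given.
child = "import os, sys; print(sys.argv[1], os.environ['EXAMPLE_HOME'])"
code, out = run_tool(
    [sys.executable, "-c", child, "/flywheel/v0/input/dicom/example.dcm"],
    {"EXAMPLE_HOME": "/opt/example"},
)
print(code, out.strip())
```

Passing the command as a list avoids shell quoting problems with file names that contain spaces.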
Capabilities allow for a gear to require certain environmental feature support.
Currently, the only capability available is networking, which requires basic outbound networking as described below.
In the future, there will likely be more added: cuda, hpc, etc.
Do not add speculative capabilities to your manifest. Executors that do not recognize or cannot provide a specified capability are forbidden from launching the job.
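The launch rule can be illustrated from the executor's side. This is a sketch of the rule as stated here, not the actual executor implementation, and the function name missing_capabilities is hypothetical.

```python
def missing_capabilities(manifest, supported):
    """Return the capabilities a manifest requires that an executor lacks."""
    required = manifest.get("capabilities", [])
    return [name for name in required if name not in supported]


manifest = {"name": "example-gear", "capabilities": ["networking"]}

# An executor may launch the job only when nothing is missing.
print(missing_capabilities(manifest, {"networking"}))
print(missing_capabilities(manifest, set()))
```

A manifest listing a capability that no executor recognizes would therefore never be launched, which is why speculative entries should be avoided.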
Some gears may require outbound networking (to contact the Flywheel API, or for some other purpose).
If your gear needs this, please add the networking capability to your manifest as shown in the example above.
For now, all gears are provided some networking, but this is not guaranteed, and may vary depending on your installation's setup. Adding the capability will future-proof your gear as it is likely this will be changed in the future.
Be sure to get in touch with us regarding your networking needs - Matlab license checks, for example.
There are no plans to allow inbound networking.
There is a final manifest key, custom, that is entirely freeform. If you have additional information you'd like to record about your gear - possibly as a result of some toolchain or wrapper - this is a great place to put it.
An example:
{
  "name": "gear-with-custom-info",
  // ...
  "custom": {
    "generator": {
      "generated-via": "antlr",
      "credit": "Terence Parr",
      "version": 4
    }
  }
}
In general, try to place your information under a single, top-level key, as in the example above.
We use some custom keys for notekeeping, or to enable features that might change in the future. In this way, we can offer functionality without the more onerous process of standardizing it and supporting it in perpetuity.
In general, avoid using the custom keys custom.flywheel or custom.gear-builder. Here's a full list:
custom
- docker-image: This is used to tell the exchange what docker image to pull and archive.
- flywheel
  - invalid: This disables a gear from running entirely (ref Queue.enqueue_job). Avoid using.
  - module: Requests a specific module to execute this gear. Currently only runsc is respected; all other values are ignored.
  - suite: This identifies a gear as part of a larger suite of tools, such as "FSL 5.0.10".
- gear-builder
  - image: The docker image to use as a base for the gear builder, if applicable.
  - container: The docker container to use as a base for the gear builder, if applicable.
Please don't hesitate to contact us with questions or comments about the spec at support@flywheel.io!