"No-transfer" pfcon: efficient single-machine/in-network ChRIS #136

Open
jennydaman opened this issue Jun 5, 2023 · 0 comments

Current Behavior

Currently, to run a plugin instance, CUBE submits a POST request to pfcon with two parts:

  1. a JSON object describing the plugin and its parameters
  2. a ZIP file containing the input data

Example of part 1:

{
    'jid': 456,
    'entrypoint': ['python3', '/usr/local/bin/simpledsapp'],
    'args': ['--saveinputmeta', '--saveoutputmeta'],
    'auid': 'jennings',
    'number_of_workers': '1',
    'cpu_limit': '1000',
    'memory_limit': '200',
    'gpu_limit': '0',
    'image': 'docker.io/fnndsc/pl-simpledsapp:latest',
    'type': 'ds'
}

When the plugin finishes:

  1. CUBE asks pfcon: "is plugin instance finished?"
  2. pfcon: "yes it's finished"
  3. CUBE: "give me a zip of the output files"
  4. pfcon: output_files.zip
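
To make the cost of this round trip concrete, here is a minimal sketch of the packing and unpacking work implied by the ZIP-based flow, using only the Python standard library. The function names are hypothetical and this is not pfcon's or CUBE's actual code; it only illustrates that every input and output file is read, compressed, and written again.

```python
import io
import zipfile
from pathlib import Path


def zip_directory(src: Path) -> bytes:
    """Pack every file under src into an in-memory ZIP (stand-in for the upload body)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Store paths relative to src so the archive can be unpacked anywhere.
                zf.write(path, path.relative_to(src).as_posix())
    return buf.getvalue()


def unzip_to_directory(data: bytes, dest: Path) -> None:
    """Unpack a ZIP payload into dest, as the receiving side would have to."""
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        zf.extractall(dest)
```

In the current flow this pair of steps happens twice per plugin instance: once for the input data and once for the output data.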

Proposed Behavior

A recent CUBE feature (FNNDSC/ChRIS_ultron_backEnd#516) enables a more efficient solution: sending ZIPs to and from pfcon is no longer necessary when CUBE and pman can mount the same directory as a volume.

Changes to CUBE

  • In CUBE, compute resources gain a new field, mode, which is one of: "zip", "filesystem"
  • When mode=zip, CUBE sends files to and receives files from pfcon exactly as it does today.
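
As a rough sketch of how the new field could be represented and checked, assuming nothing about CUBE's actual models (the class and function names below are hypothetical):

```python
from enum import Enum


class ComputeResourceMode(str, Enum):
    """Hypothetical enum for the proposed `mode` field of a compute resource."""

    ZIP = "zip"
    FILESYSTEM = "filesystem"


def needs_zip_transfer(mode: str) -> bool:
    """Return True when files must be sent to and received from pfcon as ZIPs.

    Raises ValueError for any value other than "zip" or "filesystem".
    """
    return ComputeResourceMode(mode) is ComputeResourceMode.ZIP
```

Rejecting unknown values up front keeps a misconfigured compute resource from silently falling back to one of the two behaviors.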

New Behavior: mode=filesystem

To run a plugin instance with a compute resource where mode=filesystem, CUBE submits to pfcon:

  1. a JSON object describing the plugin and its parameters (just like before)
  2. relative paths to the existing input data and to the directory where output data should be written (NEW)

Example request:

{
    'jid': 456,
    'entrypoint': ['python3', '/usr/local/bin/simpledsapp'],
    'args': ['--saveinputmeta', '--saveoutputmeta'],
    'auid': 'jennings',
    'number_of_workers': '1',
    'cpu_limit': '1000',
    'memory_limit': '200',
    'gpu_limit': '0',
    'image': 'docker.io/fnndsc/pl-simpledsapp:latest',
    'type': 'ds',
    'input_dir': 'jennings/feed_4/pl-dircopy_6/pl-simpledsapp_456/data/incoming',
    'output_dir': 'jennings/feed_4/pl-dircopy_6/pl-simpledsapp_456/data/outgoing',
}

Importantly, no files are sent via ZIP.
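
A minimal sketch of how the extra fields from the example request could be attached, assuming the job description is an ordinary dict. The function name is hypothetical, and the check that both paths are relative and stay inside the shared volume is my assumption rather than something the proposal spells out.

```python
from pathlib import PurePosixPath


def with_filesystem_paths(job: dict, input_dir: str, output_dir: str) -> dict:
    """Return a copy of job with input_dir and output_dir added.

    Both paths must be relative to the shared volume; absolute paths and
    paths containing ".." are rejected (an assumed safety check).
    """
    for name, value in (("input_dir", input_dir), ("output_dir", output_dir)):
        path = PurePosixPath(value)
        if path.is_absolute() or ".." in path.parts:
            raise ValueError(f"{name} must be a relative path inside the shared volume: {value!r}")
    # The original dict is left untouched; only the copy gains the new keys.
    return {**job, "input_dir": input_dir, "output_dir": output_dir}
```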

When a plugin instance finishes, CUBE registers its files simply by crawling the output directory using some equivalent of os.walk (or pathlib.Path.rglob('*'), FilesystemManager.ls, etc.).
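
The crawl could look roughly like the following sketch based on os.walk. Here shared_root stands for wherever the shared volume is mounted inside CUBE, which is an assumption, and list_output_files is a hypothetical name; actually registering the files with CUBE is left out.

```python
import os
from pathlib import Path


def list_output_files(shared_root: Path, output_dir: str) -> list[str]:
    """Return every file under output_dir as a sorted list of paths relative to shared_root."""
    top = shared_root / output_dir
    found = []
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            # Report paths relative to the shared volume, matching the request's convention.
            found.append((Path(dirpath) / name).relative_to(shared_root).as_posix())
    return sorted(found)
```

If the output directory does not exist, os.walk yields nothing and the function returns an empty list, so a caller may want to treat that case as an error.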

Considerations: path, unextpath parameter types

unextpath parameters do not need special handling because they are never sent to the remote.

path parameters could be supported at the level of either pfcon or pman by reading the parameter's value, which is a relative path to an existing directory in the shared volume. However, since path parameters are not used anywhere (and are possibly a candidate for deprecation), we do not need to implement this.
