Merge tag '0.4.0' into debian
0.4.0 (May 29, 2019) -- run-baby-run

The minimum required DataLad version is now 0.11.5.

New features

- The call format gained the "{img_dspath}" placeholder, which expands
  to the relative path of the dataset that contains the image.  This
  is useful for pointing to a wrapper script that is bundled in the
  same subdataset as a container.

- `containers-run` now passes the container image to `run` via its
  `extra_inputs` argument so that a run command's "{inputs}" field is
  restricted to inputs that the caller explicitly specified.

- During execution, `containers-run` now sets the environment variable
  `DATALAD_CONTAINER_NAME` to the name of the container (a combined
  usage sketch follows this list).
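
The features above combine roughly as in the sketch below (not part of
this commit). It uses the Python API in the same style as the tests in
this diff; the dataset path, container name, image URL, wrapper-script
path, input file, and the `process` command are hypothetical
placeholders.

```python
from datalad.api import Dataset

# Hypothetical demo dataset.
ds = Dataset("/tmp/demo").create()

# "{img_dspath}" expands to the relative path of the (sub)dataset that
# holds the image, so a wrapper script shipped alongside the image can
# be referenced without hard-coding its location.
ds.containers_add(
    name="demo-container",                    # hypothetical name
    url="shub://some-org/some-image",         # hypothetical image source
    call_fmt="bash {img_dspath}/code/wrapper.sh {img} {cmd}",
)

# The image is handed to `run` via `extra_inputs`, so the command's
# "{inputs}" field expands only to the explicitly listed inputs.  While
# the command executes, DATALAD_CONTAINER_NAME=demo-container is set in
# its environment.
ds.containers_run(
    "process {inputs}",                       # hypothetical command
    container_name="demo-container",
    inputs=["data/input.csv"],                # hypothetical input file
)
```

If no call format is configured, the image path is simply prepended to
the command, as the `containers_run` change further down shows.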

Fixes

- `containers-run` mishandled paths when called from a subdirectory.

- `containers-run` didn't provide an informative error message when
  `cmdexec` contained an unknown placeholder.

- `containers-add` ignores the `--update` flag when the container
  doesn't yet exist, but it confusingly still used the word "update"
  in the commit message.

* tag '0.4.0': (28 commits)
  Finalizing 0.4.0 changelog/version
  ENH: Kyle to CONTRIBUTORS
  MNT: setup.py: Bump required DataLad version for extra_inputs
  ENH: run: Move the image from inputs to extra_inputs
  RF: run: Use run_command()
  TST: Don't check content of /singularity
  CHANGELOG.md: Add entry for gh-86
  BF(TST): specify really bogus URL to containers_add
  BF: report "Update " in commit message only if an existing image was actually updated
  CHANGELOG.md: Add entry for gh-84
  ENH: set DATALAD_CONTAINER_NAME env var while running a container
  CHANGELOG.md: Prepare entries for 0.4.0
  ENH: run: Provide more informative message for cmdexec key error
  ENH: add: Log start of 'singularity build' call
  TST: Update test for dspath => parentds rename (e60b06b)
  ENH(DOC): mention {img_dspath} in the --call-fmt doc
  RF: dspath -> parentds, use resolved dataset (ds) not dataset variable
  BF: run: Preserve subdirectory call context
  Revert "RF(OPT): do not require any real container"
  RF(OPT): do not require any real container
  ...
yarikoptic committed May 29, 2019
2 parents b5bcbce + bb07df8 commit 3fc556e
Showing 10 changed files with 215 additions and 53 deletions.
29 changes: 29 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,35 @@ This is a high level and scarce summary of the changes between releases. We
would recommend to consult log of the [DataLad git
repository](http://github.com/datalad/datalad-container) for more details.

## 0.4.0 (May 29, 2019) -- run-baby-run

The minimum required DataLad version is now 0.11.5.

### New features

- The call format gained the "{img_dspath}" placeholder, which expands
to the relative path of the dataset that contains the image. This
is useful for pointing to a wrapper script that is bundled in the
same subdataset as a container.

- `containers-run` now passes the container image to `run` via its
`extra_inputs` argument so that a run command's "{inputs}" field is
restricted to inputs that the caller explicitly specified.

- During execution, `containers-run` now sets the environment variable
`DATALAD_CONTAINER_NAME` to the name of the container.

### Fixes

- `containers-run` mishandled paths when called from a subdirectory.

- `containers-run` didn't provide an informative error message when
`cmdexec` contained an unknown placeholder.

- `containers-add` ignores the `--update` flag when the container
doesn't yet exist, but it confusingly still used the word "update"
in the commit message.

## 0.3.1 (Mar 05, 2019) -- Upgrayeddd

### Fixes
1 change: 1 addition & 0 deletions CONTRIBUTORS
@@ -1,5 +1,6 @@
The following people have contributed to this project:

Benjamin Poldrack
Kyle Meyer
Michael Hanke
Yaroslav Halchenko
12 changes: 10 additions & 2 deletions datalad_container/containers_add.py
@@ -105,7 +105,9 @@ class ContainersAdd(Interface):
doc="""Command format string indicating how to execute a command in
this container, e.g. "singularity exec {img} {cmd}". Where '{img}'
is a placeholder for the path to the container image and '{cmd}' is
replaced with the desired command.""",
replaced with the desired command. Additional placeholders:
'{img_dspath}' is relative path to the dataset containing the image.
""",
metavar="FORMAT",
constraints=EnsureStr() | EnsureNone(),
),
@@ -206,8 +208,10 @@ def __call__(name, url=None, dataset=None, call_fmt=None, image=None,
# collect bits for a final and single save() call
to_save = []
imgurl = url
was_updated = False
if url:
if update and op.lexists(image):
was_updated = True
# XXX: check=False is used to avoid dropping the image. It
# should use drop=False if remove() gets such an option (see
# DataLad's gh-2673).
@@ -233,6 +237,10 @@ def __call__(name, url=None, dataset=None, call_fmt=None, image=None,
raise ValueError("No basename in path {}".format(image))
if image_dir and not op.exists(image_dir):
os.makedirs(image_dir)

lgr.info("Building Singularity image for %s "
"(this may take some time)",
url)
runner.run(["singularity", "build", image_basename, url],
cwd=image_dir or None)
elif op.exists(url):
@@ -281,7 +289,7 @@ def __call__(name, url=None, dataset=None, call_fmt=None, image=None,
for r in ds.save(
path=to_save,
message="[DATALAD] {do} containerized environment '{name}'".format(
do="Update" if update else "Configure",
do="Update" if was_updated else "Configure",
name=name)):
yield r
result["status"] = "ok"
1 change: 1 addition & 0 deletions datalad_container/containers_list.py
@@ -81,6 +81,7 @@ def __call__(dataset=None, recursive=False):
name=k,
type='file',
path=op.join(ds.path, v.pop('image')),
parentds=ds.path,
# TODO
#state='absent' if ... else 'present'
**v)
68 changes: 50 additions & 18 deletions datalad_container/containers_run.py
@@ -3,6 +3,7 @@
__docformat__ = 'restructuredtext'

import logging
import os
import os.path as op

from datalad.interface.base import Interface
@@ -12,13 +13,19 @@
from datalad.distribution.dataset import require_dataset
from datalad.interface.utils import eval_results

from datalad.interface.results import get_status_dict
from datalad.interface.run import Run
from datalad.interface.run import run_command
from datalad.interface.run import get_command_pwds
from datalad.interface.run import normalize_command
from datalad_container.find_container import find_container

lgr = logging.getLogger("datalad.containers.containers_run")

# Environment variable to be set during execution to possibly
# inform underlying shim scripts about the original name of
# the container
CONTAINER_NAME_ENVVAR = 'DATALAD_CONTAINER_NAME'

_run_params = dict(
Run._params_,
@@ -46,8 +53,15 @@ class ContainersRun(Interface):
A command is generated based on the input arguments such that the
container image itself will be recorded as an input dependency of
the command execution in the `run` record in the git history.
During execution the environment variable {name_envvar} is set to the
name of the used container.
"""

_docs_ = dict(
name_envvar=CONTAINER_NAME_ENVVAR
)

_params_ = _run_params

@staticmethod
@@ -56,12 +70,16 @@
def __call__(cmd, container_name=None, dataset=None,
inputs=None, outputs=None, message=None, expand=None,
explicit=False, sidecar=None):
from mock import patch # delayed, since takes long (~600ms for yoh)
pwd, _ = get_command_pwds(dataset)
ds = require_dataset(dataset, check_installed=True,
purpose='run a containerized command execution')

container = find_container(ds, container_name)
image_path = op.relpath(container["path"], pwd)
# container record would contain path to the (sub)dataset containing
# it. If not - take current dataset, as it must be coming from it
image_dspath = op.relpath(container.get('parentds', ds.path), pwd)

# sure we could check whether the container image is present,
# but it might live in a subdataset that isn't even installed yet
@@ -83,25 +101,39 @@ def __call__(cmd, container_name=None, dataset=None,
raise ValueError(
'cmdexe {!r} is in an old, unsupported format. '
'Convert it to a plain string.'.format(callspec))

cmd = callspec.format(img=image_path, cmd=cmd)
try:
cmd_kwargs = dict(
img=image_path,
cmd=cmd,
img_dspath=image_dspath,
)
cmd = callspec.format(**cmd_kwargs)
except KeyError as exc:
yield get_status_dict(
'run',
ds=ds,
status='error',
message=(
'Unrecognized cmdexec placeholder: %s. '
'See containers-add for information on known ones: %s',
exc,
", ".join(cmd_kwargs)))
return
else:
# just prepend and pray
cmd = container['path'] + ' ' + cmd

# with amend inputs to also include the container image
inputs = (inputs or []) + [image_path]

# fire!
for r in Run.__call__(
cmd=cmd,
dataset=ds,
inputs=inputs,
outputs=outputs,
message=message,
expand=expand,
explicit=explicit,
sidecar=sidecar,
on_failure="ignore",
return_type='generator'):
yield r
with patch.dict('os.environ',
{CONTAINER_NAME_ENVVAR: container['name']}):
# fire!
for r in run_command(
cmd=cmd,
dataset=dataset or (ds if ds.path == pwd else None),
inputs=inputs,
extra_inputs=[image_path],
outputs=outputs,
message=message,
expand=expand,
explicit=explicit,
sidecar=sidecar):
yield r
39 changes: 26 additions & 13 deletions datalad_container/tests/test_containers.py
@@ -38,8 +38,11 @@ def test_add_noop(path):
ok_clean_git(ds.path)
# config will be added, as long as there is a file, even when URL access
# fails
res = ds.containers_add('broken', url='bogus', image='dummy',
on_failure='ignore')
res = ds.containers_add(
'broken',
url='bogus-protocol://bogus-server', image='dummy',
on_failure='ignore'
)
assert_status('ok', res)
assert_result_count(res, 1, action='save', status='ok')

@@ -150,23 +153,32 @@ def test_container_update(ds_path, local_file, url):
assert_result_count(res, 1, action="save", status="ok")
ok_file_has_content(op.join(ds.path, img), "bar")

# Test commit message
# In the above case it was updating existing image so should have "Update "
get_commit_msg = lambda *args: ds.repo.format_commit('%B')
assert_in("Update ", get_commit_msg())

# If we add a new image with update=True should say Configure
res = ds.containers_add(name="foo2", update=True, url=url_bar)
assert_in("Configure ", get_commit_msg())


@with_tempfile
@with_tempfile
@with_tree(tree={'some_container.img': "doesn't matter"})
def test_container_from_subdataset(ds_path, subds_path, local_file):
def test_container_from_subdataset(ds_path, src_subds_path, local_file):

# prepare a to-be subdataset with a registered container
subds = Dataset(subds_path).create()
subds.containers_add(name="first",
src_subds = Dataset(src_subds_path).create()
src_subds.containers_add(name="first",
url=get_local_file_url(op.join(local_file,
'some_container.img'))
)
# add it as subdataset to a super ds:
ds = Dataset(ds_path).create()
ds.install("sub", source=subds_path)
subds = ds.install("sub", source=src_subds_path)
# add it again one level down to see actual recursion:
Dataset(op.join(ds.path, "sub")).install("subsub", source=subds_path)
subds.install("subsub", source=src_subds_path)

# We come up empty without recursive:
res = ds.containers_list(recursive=False)
@@ -177,12 +189,14 @@ def test_container_from_subdataset(ds_path, subds_path, local_file):
assert_result_count(res, 2)

# default location within the subdataset:
target_path = op.join(ds_path, 'sub',
target_path = op.join(subds.path,
'.datalad', 'environments', 'first', 'image')
assert_result_count(
res, 1,
name='sub/first', type='file', action='containers', status='ok',
path=target_path)
path=target_path,
parentds=subds.path
)

# not installed subdataset doesn't pose an issue:
sub2 = ds.create("sub2")
@@ -195,10 +209,9 @@ def test_container_from_subdataset(ds_path, subds_path, local_file):
# subds:
res = ds.containers_list(recursive=True)
assert_result_count(res, 2)
# default location within the subdataset:
target_path = op.join(ds_path, 'sub',
'.datalad', 'environments', 'first', 'image')
assert_result_count(
res, 1,
name='sub/first', type='file', action='containers', status='ok',
path=target_path)
path=target_path,
parentds=subds.path
)
