run wishlist: expose its "specification" to the invoked process (RO or even RW) #3422

Closed
yarikoptic opened this issue May 15, 2019 · 1 comment
Labels
cmd-run/rerun Issues about the favorite command of ReproPeople enhancement stale-issue-closed-without-resolution

Comments

@yarikoptic
Member

With datalad/datalad-container#84 we recently exposed the container name, but greedy me now wants more! Not unlike git hooks, exposing details of the run could, I think, open a wider range of possibilities for custom shims. In my case (will be back-referenced later, yet to submit) I felt a desire for

  • knowing if it was an --explicit run. (could be exposed as DATALAD_RUN_EXPLICIT)
  • in case of --explicit I felt I could benefit from knowing inputs/outputs. Probably the best approach would be to store the entire run record in some temp file which would then be exposed via e.g. a DATALAD_RUN_RECORDFILE variable

and then I even wanted the ability to alter the run record:

  • shim might decide to alter run details, e.g. specify additional output files. One way is to alter the run record file (above)
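
To make the wish concrete, here is a minimal sketch of what a shim written in Python could do if both variables existed. `DATALAD_RUN_EXPLICIT` and `DATALAD_RUN_RECORDFILE` are only the names proposed above, not something DataLad provides; the record file being JSON with the same keys as the run record in commit messages, `"0"`/empty meaning "not explicit", and the helper names `load_run_record` and `add_output` are all assumptions made for illustration.

```python
import json
import os


def load_run_record(environ=os.environ):
    """Return (explicit, record); record is None if no record file is exposed."""
    # Both variable names are the ones proposed in this issue (hypothetical).
    # Assumption: an unset, empty, or "0" value means the run was not --explicit.
    explicit = environ.get("DATALAD_RUN_EXPLICIT", "") not in ("", "0")
    path = environ.get("DATALAD_RUN_RECORDFILE")
    if not path:
        return explicit, None
    with open(path) as f:
        # Assumption: the file holds the run record as JSON.
        return explicit, json.load(f)


def add_output(record_path, new_output):
    """Append an output path to the record file (the hypothetical RW case)."""
    with open(record_path) as f:
        record = json.load(f)
    outputs = record.setdefault("outputs", [])
    if new_output not in outputs:
        outputs.append(new_output)
    # Write to a sibling temp file and rename it into place, so that whoever
    # reads the record afterwards never sees a half-written file.
    tmp_path = record_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(record, f, indent=1, sort_keys=True)
    os.replace(tmp_path, record_path)
    return record
```

A shim would call `load_run_record()` at startup and, for the RW case, something like `add_output(os.environ["DATALAD_RUN_RECORDFILE"], "my-results")` before exiting. Whether `run` would re-read the file after the command exits is exactly the open question of this issue.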

I guess the issue might get populated more later...

yarikoptic added a commit to ReproNim/containers that referenced this issue May 15, 2019
…ive sessions

**Related**

This is a prototype for functionality which might be of interest
outside of this project, e.g. related:

- regular `datalad run` to record activities in the shell.

  - [`run --interactive`](datalad/datalad#2158 (comment))
  - [`run --shell`](datalad/datalad#2275)

  so here I am "implementing" it, solely for containerized environments ATM,
  via an "over the head" communication to the shim in an environment variable

- `datalad run` for better record keeping, e.g.

  - [saving stdout/err](datalad/datalad#3385)

  so here I was not bothering to establish stdout/err capture but possibly
  could and might

- `reproman login`, or even `execute` (with or without --trace) and maybe `run`
  where we could benefit from having an environment with a unified interface
  for interactive sessions which would also establish the record of activities

- just a regular shell environment to make a clear record of commands which were run

- might eventually absorb/meld with the "opinionated .bashrc"
  proposed for the training curriculum:
  ReproNim/module-reproducible-basics#26
  which provides assistance/docs for more efficient use of cmdline
  and establishes 'infinite bash history'.

**reproshell???**

So it feels to me like a motivation for some kind of independent `reproshell`
project which would be

- usable independently and easily installable/bindable (e.g. into a container)
- parametrizable to be invoked from the shim here and/or by datalad or reproman
  so could just take care about capturing all sidecar files into specified
  locations

**Could benefit from**

- knowing more about "datalad (containers-)run" invocation

Implemented now within the `singularity_run` shim, which could have benefited
from having additional information about how exactly it was run and
also from being able to instruct datalad run "upstairs" that there is now an additional file in
[extra_outputs](datalad/datalad#3094).
Hence there is datalad/datalad#3422

- [`datalad run` being able to 'cover' multiple commits](datalad/datalad#3265)

Interactivity creates ambiguity for `rerun` semantics:

- run record ATM would say "reinvoke interactive session" which might be
  desirable on its own (e.g. to redo something manually in that original
  container)

- but for "automated reproducibility" we do have all information (bash history
  file, which is a list of commands to run) possibly recorded in another
  commit, which ATM is not associated with the "run" record

So maybe by somehow [tagging run
commits](datalad/datalad#3371) it could be possible
to disambiguate/select specific run commits/records?
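
As an illustration of the "automated reproducibility" reading, here is a rough sketch of replaying a recorded bash history file. It is not part of the shim or of `datalad rerun`; `load_history` and `replay` are hypothetical helpers, and the sketch assumes the file contains one command per line (plus optional `#<epoch>` timestamp lines, which bash writes when `HISTTIMEFORMAT` is set).

```python
import subprocess


def load_history(path):
    """Return the commands recorded in a bash history file, one per line."""
    with open(path) as f:
        # Skip blank lines and '#<epoch>' timestamp lines.
        return [line.rstrip("\n") for line in f
                if line.strip() and not line.startswith("#")]


def replay(path, cwd, dry_run=True):
    """List the recorded commands, and re-execute them unless dry_run is set."""
    commands = load_history(path)
    if not dry_run:
        # Run everything in a single bash process so that `cd` from one
        # command carries over to the next; -e stops at the first failure.
        subprocess.run(["bash", "-e", "-c", "\n".join(commands)],
                       cwd=cwd, check=True)
    return commands
```

With the default `dry_run=True` the function only reports what would be executed, which seems the safer default given that a history can contain destructive commands such as `rm`.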

<details>
<summary>**Example**</summary>

	(dev) 1 13348.....................................:Wed 15 May 2019 06:12:24 PM EDT:.
	(git-annex)hopa:~/proj/repronim/containers[enh-shell]git-annex
	$> SINGULARITY_CMD=shell datalad containers-run -n repronim-reproin
	[INFO   ] Making sure inputs are available (this may take some time)
	[INFO   ] == Command start (output follows) =====
	<ome/yoh/proj/repronim/containers$ echo "I will do something useful today"
	I will do something useful today
	singularity:repronim-reproin > yoh@hopa:/home/yoh/proj/repronim/containers$ touch my-results
	singularity:repronim-reproin > yoh@hopa:/home/yoh/proj/repronim/containers$ cd images/
	singularity:repronim-reproin > yoh@hopa:/home/yoh/proj/repronim/containers/images$ ls
	bids  README.md  repronim
	singularity:repronim-reproin > yoh@hopa:/home/yoh/proj/repronim/containers/images$ cd ../
	singularity:repronim-reproin > yoh@hopa:/home/yoh/proj/repronim/containers$ ls
	binds  images  LICENSE	my-results  README.md  scripts
	<pa:/home/yoh/proj/repronim/containers$ rm LICENSE ; echo 'nobody needs those'
	nobody needs those
	singularity:repronim-reproin > yoh@hopa:/home/yoh/proj/repronim/containers$ exit
	add(ok): .repronim/bash_histories/0.1-3-ge25c927-2019-05-15T18:12:37-04:00 (file)
	save(ok): . (dataset)
	action summary:
	  add (ok: 1)
	  save (ok: 1)
	[INFO   ] == Command exit (modification check follows) =====
	delete(ok): LICENSE (file)
	add(ok): my-results (file)
	save(ok): . (dataset)
	action summary:
	  add (ok: 1)
	  delete (ok: 1)
	  get (notneeded: 1)
	  save (ok: 1)
	SINGULARITY_CMD=shell datalad containers-run -n repronim-reproin  3.42s user 1.74s system 9% cpu 54.068 total

	$> git log --stat HEAD^^..
	commit 89fed08617418e5ddb88ae11ee2c14db699acf31 (HEAD -> enh-shell)
	Author: Yaroslav Halchenko <debian@onerussian.com>
	Date:   Wed May 15 18:13:28 2019 -0400

		[DATALAD RUNCMD] ./scripts/singularity_cmd run images/rep...

		=== Do not change lines below ===
		{
		 "chain": [],
		 "cmd": "./scripts/singularity_cmd run images/repronim/repronim-reproin--0.5.4.sing ",
		 "dsid": "b02e63c2-62c1-11e9-82b0-52540040489c",
		 "exit": 0,
		 "extra_inputs": [],
		 "inputs": [
		  "images/repronim/repronim-reproin--0.5.4.sing"
		 ],
		 "outputs": [],
		 "pwd": "."
		}
		^^^ Do not change lines above ^^^

	 LICENSE    | 201 ---------------------------------------------------------------------------------------------
	 my-results |   1 +
	 2 files changed, 1 insertion(+), 201 deletions(-)

	commit 5aa3b3383c2746f7c1d07ecdcc73852eb0a30f17
	Author: Yaroslav Halchenko <debian@onerussian.com>
	Date:   Wed May 15 18:13:28 2019 -0400

		[REPRONIM/CONTAINERS]: bash history for the interactive session

		Actual changes might (or not, depending on the invocation) get committed in the next commit

	 .repronim/bash_histories/0.1-3-ge25c927-2019-05-15T18:12:37-04:00 | 7 +++++++
	 1 file changed, 7 insertions(+)

	$> cat .repronim/bash_histories/0.1-3-ge25c927-2019-05-15T18:12:37-04:00
	echo "I will do something useful today"
	touch my-results
	cd images/
	ls
	cd ../
	ls
	rm LICENSE ; echo 'nobody needs those'

</details>
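
For tooling that wants to look at such records outside of datalad, the JSON block in the commit message above can be pulled out using the two marker lines visible in the log. This is a minimal sketch and not DataLad's own parser; `extract_run_record` is a hypothetical helper, and it assumes the markers appear exactly as shown, each on a line of its own.

```python
import json

# Marker lines as they appear in the commit message shown above.
START = "=== Do not change lines below ==="
END = "^^^ Do not change lines above ^^^"


def extract_run_record(commit_message):
    """Return the run record embedded in a commit message, or None if absent."""
    # Strip indentation, since `git log` indents the message body.
    lines = [line.strip() for line in commit_message.splitlines()]
    try:
        start = lines.index(START)
        end = lines.index(END, start + 1)
    except ValueError:
        return None
    return json.loads("\n".join(lines[start + 1:end]))
```

Given the message of a run commit (e.g. from `git log -1 --format=%B <commit>`), `extract_run_record(message)["inputs"]` would give the list of recorded inputs.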

**Additional possible features which might come here into a prototype**

- color info/error messages from the shim
- improve PS1 (probably multiline -- too much in a single line to still be
  able to edit commands)
- indicate being under [reproman --trace](ReproNim/reproman#416)
- provide 'reactive' PS1 to alert user when he/she leaves the initial directory
  (thus the one outside of original dataset), possibly resulting in outputs which
  would not be recorded
@mih mih added the cmd-run/rerun Issues about the favorite command of ReproPeople label Nov 7, 2021
@mih
Member

mih commented Nov 7, 2021

Thank you for reporting this issue!

However, at this point I am closing it at a time of ~600 open issues, in an attempt to regain control over the issue tracker. This particular issue was posted more than 6 months ago, has not received a single response, and does not describe an immediate software defect. Closing this issue does not imply that it is not relevant. If any reader objects to the closing of this issue, please feel more than free to reopen it with an update on how it is relevant for the current state of the project, and which concrete actions to address it are ongoing or planned.
