Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent cylc clean from removing workflow base dirs containing run dirs/named runs #4289

Merged
merged 11 commits into from
Aug 12, 2021

Conversation

MetRonnie
Copy link
Member

@MetRonnie MetRonnie commented Jul 2, 2021

These changes close #4163 (assuming we're happy not having an --all option and will instead wait until universal ID is implemented, allowing cylc clean foo/*)

Explanation

If you have

~/cylc-run
`-- foo
    `-- run1
        |-- flow.cylc
            etc

Up til now, doing cylc clean foo would just delete ~/cylc-run/foo, and if you had any symlink dirs or remote installations, they wouldn't get cleaned.

With this PR, it now raises an error if you try to clean a dir containing multiple (>1) runs - you have to explicitly type out the run dir. However:

  • If the dir contains 1 run dir, it will clean it (cleans foo/run1 followed by deleting foo)
  • There is now the option --force to allow possibly incomplete cleaning of the specified dir (e.g. it will just delete the foo dir)

Requirements check-list

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to both setup.py and
    conda-environment.yml.
  • Appropriate tests are included (unit).
  • Appropriate change log entry included.
  • Documentation will be handled by Document cylc clean cylc-doc#256

… a workflow

All workflow run dirs contain a log/workflow/log dir, which was getting 
erroneously identified as a workflow.
Solution was to exit from the generator if the scan dir itself is a 
workflow run dir.
@MetRonnie MetRonnie added this to the cylc-8.0b2 milestone Jul 2, 2021
@MetRonnie MetRonnie self-assigned this Jul 2, 2021
@MetRonnie

This comment has been minimized.

cylc/flow/workflow_files.py Outdated Show resolved Hide resolved
@pytest.fixture(scope='session')
def badly_messed_up_run_dir():
tmp_path = Path(TemporaryDirectory().name)
tmp_path.mkdir()
@pytest.fixture
Copy link
Member Author

@MetRonnie MetRonnie Jul 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixture only used once so I've made it function-scoped to allow monkeypatching

Copy link
Member

@wxtim wxtim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Read, all makes sense.
  • Tried it out → Working.

cylc/flow/network/scan.py Show resolved Hide resolved
cylc/flow/workflow_files.py Outdated Show resolved Hide resolved
cylc/flow/workflow_files.py Outdated Show resolved Hide resolved
f"Cannot remove running workflow.\n\n{exc}")
raise ServiceFileError(f"Cannot remove running workflow.\n\n{exc}")
# Check dir does not contain other workflows:
contained_workflows = asyncio.run(get_contained_workflows(run_dir))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ok (if it wasn't I would expect all the tests to fail), however, could cause bugs later on.

The convenience function asyncio.run function creates a new event loop and closes it, I think that can cause issues when other event loops are kicking about.

This function cannot be called when another asyncio event loop is running in the same thread.

If debug is True, the event loop will be run in debug mode.

This function always creates a new event loop and closes it at the end. It should be used as a main entry point for asyncio programs, and should ideally only be called once.

-- https://docs.python.org/3/library/asyncio-task.html

I think the alternative here would be the totally user-friendly asyncio.get_event_loop().run_until_complete(coro())

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit worried about that

@oliver-sanders
Copy link
Member

Should cylc clean ignore the runN directory or will that come with #4285 (or future work)?

$ cylc install two
$ cylc clean two
...
contains the following workflow(s):
    - two/run1
    - two/runN

@oliver-sanders oliver-sanders modified the milestones: cylc-8.0b2, cylc-8.0b3 Jul 28, 2021
@MetRonnie MetRonnie marked this pull request as draft July 28, 2021 13:27
@MetRonnie
Copy link
Member Author

MetRonnie commented Jul 28, 2021

Marking as draft while investigating the runN symlink being detected as a separate workflow.

(As discussed on Teams) cylc clean will not use the inferred run number functionality. Edit: however, if running cylc clean foo and foo only has 1 run, I expect we should clean foo/run1 and remove the foo directory too

- Increase scan depth for cylc clean when it checks that the dir doesn't contain other workflows
- Try to safen up use of asyncio
@MetRonnie MetRonnie marked this pull request as ready for review August 4, 2021 16:19
cylc/flow/pathutil.py Show resolved Hide resolved
f"{bullet}{bullet.join(contained_workflows)}"
)
if not opts.force:
raise WorkflowFilesError(f"Cannot clean - {msg}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth adding a note to the end of the message to the effect that the user can use --force if that is what they genuinely want to do...

Copy link
Member

@wxtim wxtim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Scan read changes since last time.
  • Played with result branch, run some examples.
  • Had an in-depth play with test_init_clean__multiple_runs. Assured myself that Codecov is wrong in stating some logic not tested.
  • Had an in-depth play with test_remove_empty_parents

@MetRonnie
Copy link
Member Author

Merging as 2 approvals

@MetRonnie MetRonnie merged commit 4686ba6 into cylc:master Aug 12, 2021
@MetRonnie MetRonnie deleted the cylc-clean-base-dirs branch August 12, 2021 12:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cylc clean and numbered/named run dirs
3 participants