Add a pytest-workflow generate command line function #193

Open
rhpvorderman opened this issue Mar 19, 2024 · 3 comments

Comments

@rhpvorderman
Member

Command line invocation:

pytest-workflow-generate tests/test_bla.yml my_name command --flag --another-flag settings/settings.json myworkflow.format

The first argument is the test file to generate and the second is the test name. All remaining arguments are joined together to form the command.

Resulting yaml:

- name: my_name 
  command: "command --flag --another-flag settings/settings.json myworkflow.format"
  files:
    - path: relative/to/workflow/output/dir/my_file.txt
      md5sum: "abcdef0123456789"

Etc.

This is especially useful for generating all the file paths. Tests on stdout and stderr are omitted, as these streams tend to contain timestamps and dates.
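A minimal sketch of the snapshot step described above, assuming the command has already been run and its output sits in a known directory. `pytest-workflow-generate` does not exist yet, and the helper names `md5sum` and `snapshot_files` are hypothetical; only the `path` and `md5sum` keys come from the example YAML.

```python
import hashlib
from pathlib import Path


def md5sum(path: Path, chunk_size: int = 1 << 16) -> str:
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot_files(workdir: Path) -> list[dict]:
    """Hypothetical helper: list every file under workdir with its checksum.

    Paths are made relative to workdir, matching how the example YAML
    refers to files relative to the workflow output directory.
    """
    entries = []
    for path in sorted(p for p in workdir.rglob("*") if p.is_file()):
        entries.append(
            {
                "path": path.relative_to(workdir).as_posix(),
                "md5sum": md5sum(path),
            }
        )
    return entries
```

Sorting the paths keeps the generated test file stable between runs, so regenerating it only produces a diff when the outputs really changed.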

This feature is inspired by nf-test's snapshot function.

@DavyCats, @Redmar-van-den-Berg what do you think of this?

@Redmar-van-den-Berg
Collaborator

It would be neat to have a way to generate tests, or at least the file paths that are produced. However, I'm not a big fan of testing for the checksum of the output files, since it is impossible to tell what went wrong when it changes.

How about adding a "contains" for the first line of the file? That way you can also include a test for stderr and stdout.

@rhpvorderman
Member Author

> However, I'm not a big fan of testing for the checksum of the output files, since it is impossible to tell what went wrong when it changes.

I agree. However, it is trivial to delete the md5sums afterwards if you don't need them. If you want a bit-for-bit reproducible workflow it is quite useful that this work is already done.

So, this should be a CLI option? pytest-workflow-generate --md5sum will get you all the md5sums? That sounds like an excellent idea.
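One way the proposed command line could be parsed, as a rough sketch using the standard library's `argparse`. The positional names `test_file`, `name` and `command` are assumptions made for illustration; only the program name and the `--md5sum` flag appear in this thread.

```python
import argparse
import shlex


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser for the proposed pytest-workflow-generate tool."""
    parser = argparse.ArgumentParser(prog="pytest-workflow-generate")
    parser.add_argument(
        "--md5sum",
        action="store_true",
        help="also record an md5sum for every generated file entry",
    )
    parser.add_argument("test_file", help="YAML test file to generate")
    parser.add_argument("name", help="name of the generated test")
    # REMAINDER collects everything after the test name, including flags
    # such as --flag that belong to the workflow command itself.
    parser.add_argument("command", nargs=argparse.REMAINDER)
    return parser


def command_string(args: argparse.Namespace) -> str:
    """Join the collected arguments back into one shell-quoted string."""
    return shlex.join(args.command)
```

With this layout the tool's own options have to come before the test file. Anything after the test name, including a trailing `--md5sum`, is treated as part of the workflow command.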

> How about adding a "contains" for the first line of the file? That way you can also include a test for stderr and stdout.

For a VCF file that would be ##fileformat=vcf4.4 or something, and for cutadapt: This is cutadapt 4.2. That is not very informative, and it is also prone to breaking tests when program versions are upgraded. I like the idea of automating some of the tediousness out of creating contains tests, but it is really hard to come up with a good universal criterion.
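To make the fragility concrete, here is a small sketch of the "first line as a contains test" idea. The helpers `first_line_check` and `still_passes` are hypothetical, and the upgraded version string in the usage note is invented purely to show the failure mode.

```python
from pathlib import Path


def first_line_check(path: Path) -> dict:
    """Hypothetical helper: build a file entry whose contains test is line 1."""
    with path.open("r", errors="replace") as handle:
        first_line = handle.readline().rstrip("\r\n")
    return {"path": path.name, "contains": [first_line]}


def still_passes(entry: dict, new_text: str) -> bool:
    """Check whether every recorded string still occurs in new output."""
    return all(expected in new_text for expected in entry["contains"])
```

For a log whose first line is `This is cutadapt 4.2`, the generated entry passes against the same log, but `still_passes` returns `False` as soon as the tool prints a different version number, even when nothing relevant to the workflow changed.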

@Redmar-van-den-Berg
Collaborator

Ideally, we would only generate tests that we know will pass, but of course we cannot even know that the same files will be generated when you run the command a second time. Still, I think it is sensible to assume that the paths will be the same, and those are also the most annoying to type in manually.

Additionally, we can allow the user to specify whether they want additional tests on the files, such as --md5sum, --contains, --contains-regex, etc.
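A sketch of how such opt-in checks could work when the test file is written: paths are always emitted, and extra keys only appear when the user asked for them. The function `render_test` and its `checks` parameter are hypothetical. The keys `md5sum`, `contains` and `contains_regex` follow pytest-workflow's file test syntax. All strings are double-quoted through `json.dumps`, which is a simplification rather than a full YAML emitter.

```python
import json


def render_test(name: str, command: str, files: list, checks: tuple = ()) -> str:
    """Hypothetical renderer: emit one test entry as YAML text.

    `files` holds dicts with a "path" plus any precomputed checks;
    `checks` names the optional keys the user asked to include.
    """
    lines = [
        f"- name: {json.dumps(name)}",
        f"  command: {json.dumps(command)}",
        "  files:",
    ]
    for entry in files:
        lines.append(f"    - path: {json.dumps(entry['path'])}")
        for key in checks:
            if key not in entry:
                continue  # nothing was recorded for this check
            value = entry[key]
            if isinstance(value, list):
                # contains and contains_regex take a list of strings
                lines.append(f"      {key}:")
                lines.extend(f"        - {json.dumps(item)}" for item in value)
            else:
                lines.append(f"      {key}: {json.dumps(value)}")
    return "\n".join(lines) + "\n"
```

Calling `render_test(name, command, files)` with no checks yields only the paths, which are the part assumed to be stable between runs. Passing `checks=("md5sum",)` adds checksums for users who want bit-for-bit reproducibility.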
