Fixtures: Add argument `use_subprocess` to `run_cli_command` #5846
Conversation
If you use this in the node deletion test you mentioned in #5842, does that test now fail?
Thanks a lot @sphuber for the detailed investigation and the fix! One question before I go through this:
I understand that running the cli tests in a subprocess is the safest way to catch any "oddities" that may be overlooked due to running through the cli runner. I think it was the right strategy to proceed here, and I understand it did catch some unrelated bugs as well. Now that we know how costly this operation is, I wonder: for the bugs related to changes not being persisted in the database, would there also be a "cheaper" way to detect them?
I might not understand properly what you are trying to suggest here, but I don't think this would work. When you rollback the session you are undoing all the changes. How would that help detect the problem? Just to make sure the problem is clear, if we take the three A's of testing (arrange, act, assert), when we "act" with the runner, the assert is done using the same session, and so it will pass regardless of whether the changes have been persisted (flushed and committed) to the database or not. Now you could force the test runner to flush and commit the session after it "acts" and so before the "assert" runs, but this would be hiding the very problem we faced just now where the code does not persist the changes itself properly.
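The point above can be demonstrated with a minimal sketch (plain `sqlite3`, not aiida-core code): an uncommitted change is visible through the connection that made it, which is exactly why an in-process assert passes, while a second connection, standing in for a subprocess, does not see it.

```python
import os
import sqlite3
import tempfile

# Hypothetical table standing in for an AiiDA-managed database table.
path = os.path.join(tempfile.mkdtemp(), 'test.db')

writer = sqlite3.connect(path)
writer.execute('CREATE TABLE node (label TEXT)')
writer.commit()

# "Act": insert a row but never commit it, mimicking the persistence bug.
writer.execute("INSERT INTO node VALUES ('created-by-cli')")

# "Assert" through the same connection (the CliRunner situation): the
# uncommitted change is visible, so the test passes.
assert writer.execute('SELECT count(*) FROM node').fetchone()[0] == 1

# A fresh connection (the subprocess situation) sees only committed data,
# so the missing commit is detected.
reader = sqlite3.connect(path)
assert reader.execute('SELECT count(*) FROM node').fetchone()[0] == 0
```

This is why forcing a flush and commit in the runner would hide, rather than detect, the bug: it would perform the very persistence step the production code forgot.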
@ltalirz This is now ready for review. I just have to take a final look if some of the commands that use
thanks @sphuber , this looks good to me.
nice to see the (test) bugs that were caught by this investigation!
just minor comments, no further review required
tests/conftest.py
        return self.stderr_bytes.decode('utf-8', 'replace').replace('\r\n', '\n')

    @property
    def output_lines(self) -> list[str]:
`output_lines` only returns stdout, while `output` returns stdout+stderr, which is likely to confuse users of the class. I'd suggest reusing `.output` in `.output_lines`.
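The reviewer's suggestion could look like the following sketch (class and attribute names are illustrative stand-ins for the fixture's result class, not the actual aiida-core code):

```python
class Result:
    """Hypothetical stand-in for the CLI result object under review."""

    def __init__(self, stdout_bytes: bytes, stderr_bytes: bytes):
        self.stdout_bytes = stdout_bytes
        self.stderr_bytes = stderr_bytes

    @property
    def output(self) -> str:
        # stdout + stderr, assuming this is what ``.output`` combines.
        return (self.stdout_bytes + self.stderr_bytes).decode('utf-8', 'replace').replace('\r\n', '\n')

    @property
    def output_lines(self) -> list[str]:
        # Reuse ``.output`` so the two properties cannot report different streams.
        return self.output.splitlines()


result = Result(b'line one\r\nline two\n', b'warning\n')
assert result.output_lines == ['line one', 'line two', 'warning']
```

Deriving `output_lines` from `output` keeps the two properties consistent by construction.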
:param use_subprocess: Boolean, if ``True``, runs the command in a subprocess, otherwise it is run in the same
    interpreter using :class:`click.testing.CliRunner`. The advantage of running in a subprocess is that it
    simulates exactly what a user would invoke through the CLI. The test runner provided by ``click`` invokes
    commands in a way that is not always 100% analogous to an actual CLI call and so tests may not cover the
it may be worth giving the actual example here that triggered this need, i.e. that an ORM object may be in the session but not committed to the database, causing existence checks to be ineffective
    tested, then a subprocess cannot be used, since the monkeypatch only applies to the current interpreter. In
    these cases it is necessary to set ``use_subprocess=False``.
:param suppress_warnings: Boolean, if ``True``, the ``PYTHONWARNINGS`` environment variable is set to
    ``ignore::Warning``. This is important when testing a command using ``use_subprocess = True`` and the check
Reading this, it seems to me that, when running via a subprocess, we may want to suppress warnings by default (e.g. it could be that some random new deprecation warning starts popping up that breaks tests where people forgot to set `suppress_warnings=True`).
On the other hand, you probably don't want to change the default behavior for the regular tests.
Fine to keep; I don't have a better solution.
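The suppression mechanism under discussion can be sketched with the standard library alone; the environment variable and filter value are the ones named in the docstring, everything else here is illustrative rather than the actual fixture code:

```python
import os
import subprocess
import sys

# Pass PYTHONWARNINGS=ignore::Warning into the child environment so that the
# subprocess ignores all warnings, keeping strict output assertions stable.
env = dict(os.environ, PYTHONWARNINGS='ignore::Warning')

# Stand-in for a verdi command that happens to emit a deprecation warning.
code = "import warnings; warnings.warn('old API', DeprecationWarning); print('done')"
completed = subprocess.run([sys.executable, '-c', code], env=env, capture_output=True, text=True)

assert completed.stdout.strip() == 'done'
assert 'DeprecationWarning' not in completed.stderr  # warning was suppressed
```

Without the environment variable, the warning would be printed to stderr on every subprocess invocation, which is why only subprocess-based tests with strict output checks need it.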
tests/conftest.py
command_path = cli_command_map[command]
args = command_path[:1] + ['-p', aiida_profile.name] + command_path[1:] + parameters
might make sense to move this into the `run_cli_command_subprocess` function (and add the aiida profile name as a function argument), so it becomes more parallel to `run_cli_command_runner`
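The suggested refactor could look like the following sketch; the function signature is an assumption based on the review comment, and only the argument composition from the snippet above is taken as given:

```python
def run_cli_command_subprocess(command_path, parameters, profile_name):
    """Hypothetical refactor: build the full argument list internally,
    injecting the ``-p <profile>`` option right after the top-level command,
    mirroring the structure of ``run_cli_command_runner``."""
    return command_path[:1] + ['-p', profile_name] + command_path[1:] + list(parameters)


args = run_cli_command_subprocess(['verdi', 'group', 'list'], ['--all'], 'test-profile')
assert args == ['verdi', '-p', 'test-profile', 'group', 'list', '--all']
```

Keeping the argument assembly inside the subprocess helper means the caller passes the same inputs to either pathway.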
The `verdi data` subcommands, which are pluggable through the `aiida.cmdline.data` entry point group, get their actual name from the corresponding entry point name, not the name that is declared in the code. However, the `cli_command_map` fixture gets the name from the code and not the entry point name. A mismatch in these names will cause the `run_cli_command` fixture to fail in the case of `use_subprocess=True`, therefore we correct the command names to match the entry point.
By default the `run_cli_command` fixture will run the provided command through the `CliRunner` utility provided by `click`. However, this suffers from a problem: the code pathway followed through the runner is not identical to when the command is actually called through the CLI. The reason is that in `click` commands do not have a reference to their parents, and so when a subcommand is invoked through the runner, only the code of the subcommand is executed. Any code that is part of its parents, such as parameter validation and callbacks, is skipped. In addition, the runner is executed in the same interpreter as where the test suite is being run, whereas a command invoked through the actual CLI runs in a standalone interpreter. This may hide subtle bugs. Therefore, the `use_subprocess` argument is added, which is `False` by default; when set to `True`, the command is not called through the runner but as an actual subprocess using `subprocess.run`. This guarantees that the tested pathway is identical to what an actual invocation through the command line would be. It is not enabled by default since running through a subprocess is slower and not always necessary.
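The key difference described here, that `subprocess.run` spawns a standalone interpreter while the `CliRunner` runs inside the test process, can be shown with a minimal sketch (using `python -c` as a stand-in for a `verdi` invocation; this is not the fixture code):

```python
import os
import subprocess
import sys

# Run a trivial "command" in a fresh interpreter, as use_subprocess=True does.
code = "import os; print(os.getpid())"
completed = subprocess.run([sys.executable, '-c', code], capture_output=True, text=True, check=True)

child_pid = int(completed.stdout)

# The command executed in a different process, so monkeypatches, in-memory
# state, and uncommitted database sessions of the test process do not apply.
assert child_pid != os.getpid()
```

This is exactly why subprocess-based tests catch persistence bugs that in-process runner tests cannot, and also why monkeypatch-based tests must stay on the runner pathway.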
Certain command line tests cannot be run with `use_subprocess=True`. Examples are tests that use the `monkeypatch` fixture to patch certain resources. If the command were to be executed in a subprocess, the patch would no longer apply since it is only active in the current interpreter. For those tests, the `use_subprocess` keyword of `run_cli_command` needs to be set to `False`. Also tests that use the `empty_config` or `config_with_profile_factory` fixtures need to run with `use_subprocess=False`, since these fixtures generate a temporary config that would not be used by the subprocess, which uses the config found at the `AIIDA_PATH` environment variable. Then there are tests whose assertions very strictly check the output. If the command is executed through a subprocess, any deprecation warnings that are hit will be printed and the test might fail. This is not a problem when running through the test runner because then the deprecation warning will already have been emitted when the test suite loaded and won't be fired another time when the test code is executed. For this purpose the `suppress_warnings` argument is added to the `run_cli_command` fixture. It is `False` by default, but when set to `True` the `PYTHONWARNINGS` environment variable is set to `ignore::Warning`, which should cause all warnings to be suppressed.
The addition of `use_subprocess=True` to the `run_cli_command` fixture to actually run the command as a subprocess instead of through the test runner in the same interpreter, revealed a number of small bugs in the tests themselves that would cause them to fail. They are fixed such that the tests pass regardless of the value of `use_subprocess`.
The `run_cli_command` fixture was updated a few commits ago to allow running the command as a subprocess instead of through the test runner. The reason is that the test runner does not execute the exact same pathway as when run as a subprocess, and this was hiding subtle bugs. Specifically, a bug in the `PsqlDosBackend` where changes made by a CLI command were not automatically persisted to the database was not caught by the tests, since when running in the test runner the assertions do not explicitly check that changes are persisted, just that they are present, even if just in memory. Ideally, all tests are run with `use_subprocess=True`, which was set as the default originally, but unfortunately this is quite a bit slower than running with the test runner and would make the test suite almost 30 minutes slower. Instead, the tests that failed because of the storage backend bug are now explicitly marked as using `use_subprocess=True` and the default is changed. This now guarantees that at least these tests that are known to depend on the subprocess pathway are properly run in that way. This should provide greater coverage to prevent these types of bugs in the future. An exception is made for the tests of `verdi process list` and a number of subcommands of `verdi data` (mostly the list command but also some of the import and export commands). The reason is that all of these commands failed for the same reason: the group that was created in the setup of the test was not persisted and so not found when the command was run in a subprocess. Since these tests are numerous and take multiple minutes to run, and the same behavior is already explicitly tested in the `verdi group` tests, they are run with `use_subprocess=False` to prevent the test suite from taking too much time.
Fixes #5843
Release `v2.2.0` was broken in a bad way because of a bug that was introduced in the `PsqlDosBackend` that was not caught by any of the tests. The problem is that changes made by `verdi` commands were no longer automatically committed and persisted to the database.

This went unnoticed by the tests because in the unit tests the commands are run through the `click.testing.CliRunner` utility. This suffers from a number of problems. We already knew that it only executed the (sub)command that was passed to the runner, and any code that was part of the parents would not be executed. The `run_cli_command` fixture already had to deal with this and, for example, manually added the verbosity option to each command, which normally is done automatically by the `verdi` top-level command when invoking through the command line. The `v2.2.0` release highlighted another problem, namely that the assertions of the tests are made against the state of the interpreter of the test, which in this case would contain the correct changes after the command was executed; but this of course did not require the changes to be persisted, and so the problem went unnoticed.

To avoid these problems of the tests not executing the exact pathway of a real invocation over the command line, the `run_cli_command` fixture was updated with the keyword `use_subprocess`, which when set to `True` invokes the command as an actual subprocess using the `subprocess` module. This forces the command to be tested as in a real-life situation.

Running the tests with this mode caused a number of tests to fail. A number of these were due to unrelated bugs that were hidden through the use of the `CliRunner`, but 32 tests failed due to the bug in the `PsqlDosBackend`. Had these tests been run with `use_subprocess=True` before the release, we would have caught the bug before shipping it with `v2.2.0`.

Now ideally we would always run all tests with `use_subprocess=True`, but this is quite a bit slower and would add roughly another 30 minutes to the unit test suite, which is unacceptable. Therefore, I have gone through a procedure to determine exactly those tests that fail because of the session bug and run only those with `use_subprocess=True`. This should give us a lot more confidence that we will catch similar bugs in future PRs.

The PR consists of the following commits (see commit messages for more details):
- Update the `run_cli_command` fixture to add the `use_subprocess` keyword and set `True` as the default.
- Certain tests are run with `use_subprocess=False` that use monkeypatching or other strategies to change things for the test that are only visible in the current interpreter. They would fail when using the subprocess since that uses a different interpreter where the monkeypatch is not applied. Similarly, certain tests are run with `suppress_warnings=True` since their assertions strictly check output and the printing of deprecation warnings would cause them to fail. This wasn't a problem with the test runner since in that case the warning would already have been printed on test startup and wouldn't be printed again.
- Fix tests that failed not because of the `PsqlDosBackend` bug but were related to how defaults are processed by the `CliRunner` and when running as a subprocess.
- The `use_subprocess=True` pathway tries to determine the full command line arguments from the `click.Command`, which failed for the `verdi data` commands as their actual name is not taken from the command definition but from the entry point with which they are registered.
- Change the default of `use_subprocess` to `False` but set `use_subprocess=True` for all 32 tests that failed because of the `PsqlDosBackend` session bug.
session bugNote that when these 5 commits are run on top of
v2.2.0
there are 32 tests that fail because of the session bug (and 2 that hang for the same reason). This branch, however, puts these commits on top ofmain
where the session bug has already been reverted and so should pass. The branch of the broken tests can be found here: https://github.com/sphuber/aiida-core/tree/feature/5843/run-cli-command-subprocess-broken