-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Quality improvements and best current practices #3
Quality improvements and best current practices #3
Conversation
Our RAUC install will not change the partition layout during install. Thus there is no need to check the size of the same partitions in both slots.
Using `lsblk --json` we can directly parse the tools output and must not rely on parsing shell output.
Using `findmnt --json` we can directly parse the tools output. This is considered better practice than parsing the shell output of `df`.
The previous test very bluntly only tested if the I2C-Subsystem in the kernel has been loaded at all. But this does not tell us anything about the actual devices on the bus we want to use. The new tests instead try to use the I2C devices the way userspace would. Actually none of the I2C devices are directly used by userspace: * The EEPROMs are read by the bootloader and their information are added to the devicetree's chosen_node. But for the EEPROMs we can at least do a read, so we can check if this works. * The PMIC is used inside the kernel. So we will only check the hardware has been probed. * The USB-Hub is also used inside the kernel So we will only check the hardware has been probed.
The previous test very bluntly only tested if the SPI-subsystem in the kernel has been loaded at all. But this does not tell us anything about the actual devices on the bus we want to use. The new tests instead try to use the SPI devices the way userspace would. * The Ethernet switch is managed by DSA. By successfully reading the statistics we can assume that the SPI communication to the Ethernet Switch works. * The ADC is managed by IIO. By successfully reading a non-zero ADC reading we can assume that we are reading actual values from the ADC. * The LCD is managed by drm. We only check if the correct driver has been probed.
Without the mmc the system would not boot in the first place. Additionally we have tests in `test_filesystems.py` that check if the partitions on the mmc have been probed correctly. If these succeed we can assume that mmc-subsystem was loaded correctly.
This is just another representation of the configuration EEPROMs present on i2c-0 and i2c2. Both are not used by userspace directly. So there is no need to check if these nodes in `/sys` exist.
Until now we only tested, if the endpoint provided a somewhat useful value. By also comparing the value to the output of `sensors` we can check the complete `tacd` stack.
The original test case contained a lot of code to handle different format strings. But it's unclear why that all that complexity was actually needed. This change removes all the complex format handling and reduces it to a bare minimum for this DUT.
In some places we have been using `stdout,_ rc = shell.run()` followed by `assert rc == 0`. The better way to do this is by using `stdout = shell.run_check()` since this removes a bit of boilerplate in this places.
By disconnecting the USB drive during the test we can be sure that Linux has dropped all buffers for the USB storage. And thus that this test not only tests buffers somewhere in the kernel without actually writing data to the USB drive. Since we already have the _eet_ connected to the TAC we can simply use it to disconnect and connect the USB-stick during the test.
This test makes sure the third state (On, Off, OffFloating) of the TACs power switch works electrically. This is especially important since this state is not exposed on the web interface (or rather only in the API). With this test we make sure that the load is actually engaged, when the power switch is "Off" - and that we can also switch it to "OffFloating".
If we do not overwrite the parent's __attrs_post_init__(), we do not need to call it.
d53f9d2
to
34bf2d1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall. Here are some minor suggestions:
34bf2d1
to
95c307f
Compare
Running commands in the shell in the background is merely a hack to have commands run in parallel: * You have to take care of `stdout` and `stderr` not messing up output of other programs. * You have to track the `pid` yourself. Using `systemd-run` solves these problems. This commit also introduces a context manager that wraps around `systemd-run` and makes test cases very readable.
Inferring the block device from the USB path at runtime is not needed, since the path in `/dev/disk/by-path/` will be stable for a given USB storage device in a given USB port. This way we get rid of a few extra commands. But more important: We get rid of pipes in shell commands, that may fail without failing the whole command.
The fixture `prepare_network` tears down the whole network setup of the TAC and replaces it with it's own setup. At the end of the test the process is reversed. But to make sure that the TAC is actually online we should at least wait until we have reached the same conditions as in the `network` state of the strategy. This way, if following tests rely on the TAC being online we can still fulfill that.
Using `try .. finally` we can make sure that we try to clean up after our test cases run - even if they fail.
Otherwise this line would only be a no-op.
Until now we have tested the size of the file systems in `slot 0` and `slot 1`. This seems natural, since these are two independent systems. But both slots basically contain the same system - only installed via two different methods. We currently have no reason to believe that the system installed via RAUC may have another file system layout that the system installed via fastboot. Thus we rewrite the tests to only test the file system sizes in `slot 0` and remove a good amount of complexity and testing time. If we ever encounter problems in this realm we can still write tests that focus on the actual problems fixed.
The strategy already checks, whether the system is running. Thus, it makes total sense to also collect debug information if this state could not be reached. This is a preparation for the removal of RAUC handling from the strategy. After support for RAUC has been removed from the strategy, the strategy will always assume, that we are in `system0`. Thus it does not make sense to have tests for `system1` anymore.
Until now we handling of RAUC installs and handling of state transitions between these two systems was done by the strategy. This led to a fair amount of added complexity in the strategy. For example did some states pretend to be reached (shell, network), even if actually a we were in one of the RAUC-states (`system0`, `system1`). Reducing the complexity in the strategy makes it also clear that other special cases, like reconfiguring network from `online` to `loopback` should not be part of the strategy. This commit removes handling of RAUC installation and tracking of the active slot from the strategy. As a replacement it adds pytest `fixtures` that allow test cases to manipulate which slot should be booted. The fixture `set_bootstate_in_bootloader` should be used in tests that want to boot into `system1` or change retries, `mark-good` etc. This fixture makes sure (using the `default_bootstate` fixture) that the system will always be in `system0` and has the default bootste after the test has finished. The only requirements for tests is, that they should not install a new system into `system0`, nor make `system0` unusable. While doing all these changes the actual RAUC tests have been adapted to more closely follow the best current practices for these tests.
Both states `shell` and `network` are basically the same: * `shell`: The system is up in userspace and all services have started. * `network`: Like `shell`, but we have checked that we are online. To ease the transitions between `network` and `shell` there is even code in place that acts like both states are the same. This commit now removes the additional `network` state and adds the online check to the `shell` state.
Unpacking a single element list using the following notation has the benefit, that it will raise an exception, if the list to unpack is not exactly one element long: > [stdout] = shell.run_check("some_command") This is especially useful for commands that are expected to only return a single line of output. Co-authored-by: Bastian Krause <bst@pengutronix.de>
a11baf4
to
35990f5
Compare
@Bastian-Krause From my side this is ready to merge. Feel free to have another look. |
Without this change all tests left the *eet* in what ever state happend to be the last. This does not work well with the notion that tests should leave the system in it's initial state. With this change the new `eet`-fixture will make sure that the *eet* is returned to "no connection" at tear down.
With just `0.2 s` of delay the measurement of `v1/dut/feedback/current` did not reliably succeed. It seems the test-setup simply needs a short moment until there is actually current flowing that can be measured.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, thanks for sticking with it! The test suite is now in a much better and more reliable shape.
This PR adds lots of suggestions done by @Bastian-Krause aiming at improving the overall quality of the test suite.