feat(docs): Add testing standards ADR #450

Merged · 2 commits · Jun 8, 2022

5 changes: 3 additions & 2 deletions CONTRIBUTING.md
@@ -358,8 +358,9 @@ and the E2E tests you can get to work :tada: .

1. Make sure your commits are [signed](https://docs.github.com/en/authentication/managing-commit-signature-verification/signing-commits).

-1. **All code contributions must be submitted with unit tests**. See [this package](https://github.com/weaveworks-liquidmetal/flintlock/tree/main/core/steps/microvm)
-   package for a good example of tests.
+1. **All code contributions must be submitted with unit tests**. Read [this ADR](https://github.com/weaveworks-liquidmetal/flintlock/discussions/284)
+   for our guide on what counts as appropriate testing.
+   See [this package](https://github.com/weaveworks-liquidmetal/flintlock/tree/main/core/steps/microvm) for a good example of tests.

1. For extra special bonus points, if you see any tests missing from the area you are
working on, please add them! It will be much appreciated :heart: .
263 changes: 263 additions & 0 deletions docs/adr/0284-testing-standards.md
@@ -0,0 +1,263 @@
# Testing Practices and Standards

Status: approved
Date: 2021-11-23 (approved on 2022-05-23)
Authors: @Callisto13
Deciders: @Callisto13 @jmickey @richardcase @yitsushi

# Context

The codebase is moving from POC to a production alpha release. Part of getting
production ready is increasing confidence in and quality of the code via automated
tests. As it stands, a fair amount needs to be backfilled, and as the product
grows more will need to be added.
This proposal establishes some base rules and practices which all maintainers
and contributors agree to follow when writing tests.

# Scope

The scope is quite broad, and the topic one which inevitably invites Opinions.
This doc establishes a baseline of what testing elements would constitute an
acceptable PR submission.
This ADR mostly captures what we currently do and gives space to expand,
consider and alter, rather than prescribing a new way of doing things.

Topics (including but not limited to):
- When to write tests
- Coverage
- Behaviour
- Mocking
- Refactoring
- Gomega
- Table tests
- Unit tests
- Integration tests
- End to end tests

# Decision

## When to write tests
Always. A PR which adds new or changes existing Go code must always come with
corresponding tests which cover all paths.

## Coverage

All PRs currently run `test-with-cov` which generates a coverage report.

Coverage is a useless metric (unless it is zero) and it is disturbingly easy
to get to 100% without actually testing anything at all.
Furthermore, our report is notoriously inaccurate (coverage appears to change on
pure docs PRs) although this has recently improved.

We have discussed the value of having this report at all, given that few of us read
it and it is mostly noise. We did concede that it is useful to have _something_
on PRs purely to check that tests have been added, if for no other reason than
it saves the maintainers/reviewers having to read and nudge.

While, again, the coverage report should be taken with a bucket of salt, a drop
in coverage of more than 5% should raise flags: the submitter should review
both their tests and their code to discover gaps.

## True coverage

Generated coverage is not true coverage\*: it indicates that code paths have
been hit... and that's it. Automation will not know, for example, whether
anything is verified after a method call: it will not know whether something
is tested _well_.

\* _Lol, yeah, true coverage is not a "thing", as really it is subjective and
circumstantial. Here I use it as shorthand for "coverage of what we
care about", which is: "does my stuff behave the way it should?"_

Tests should be reviewed as critically as any other code, and it is up to the
reviewer to determine whether coverage is sufficient.

Those who T/BDD will naturally get to 100% true coverage with very little
thought or effort.
Those who don't should actively try to break their code and tests after they
have written them.

### Testing behaviour, not lines

In order to get true coverage we test against behaviour rather than
implementation. In other words; we don't test that various internal pathways
have been traversed, we test that the correct set of outcomes have occurred
based on a given set of inputs.

Eg:
- If something is supposed to return an object, validate that the outcome values
are as expected given the inputs.
- If a method is supposed to take a path and write a file, check that a file
exists at the location.
- If a function is supposed to throw a catastrophic error if conditions are not
met, verify the error and message that are returned.

Testing against behaviour also has the benefit of influencing the design of the
code, making it easier to change (and test) in the future. \o/

If we notice that a given line is not covered in any of the test cases and we can see no reason
to add a case for it, we can reasonably ask if that line can be removed. Why is it there?
If it turns out that the line can't be removed, there has to be a reason why it is there,
therefore a behaviour exists which we can test.
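
As a minimal sketch of a behaviour-focused test (the `config` package, its
`WriteDefaults` function and the `"defaults"` content are all hypothetical;
the Gomega wiring used here is covered further down):

```go
package config_test // hypothetical package, purely for illustration

import (
	"os"
	"path/filepath"
	"testing"

	. "github.com/onsi/gomega"

	"example.com/project/config" // hypothetical import of the unit under test
)

// config.WriteDefaults is assumed to write a default config file at the given
// path and return an error. We assert on the observable outcome (the file and
// its contents), not on which internal helpers were called to get there.
func TestWriteDefaults_createsFileAtPath(t *testing.T) {
	g := NewWithT(t)

	path := filepath.Join(t.TempDir(), "config.yaml")

	g.Expect(config.WriteDefaults(path)).To(Succeed())

	data, err := os.ReadFile(path)
	g.Expect(err).NotTo(HaveOccurred())
	g.Expect(string(data)).To(ContainSubstring("defaults")) // assumed content
}
```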

### Separate packages

In order to facilitate testing against behaviour, and to ensure that each 'unit'
is actually that (a neatly encapsulated package of code which fulfils a
contract with no sprawling hidden interdependencies), all tests are created in
a separate `_test` package.
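
A minimal sketch of the convention (two files shown in one snippet, using a
hypothetical `foo` package and module path):

```go
// foo/foo.go
package foo

// Greet is the exported behaviour under test.
func Greet(name string) string { return "hello " + name }

// foo/foo_test.go
package foo_test // external test package: only foo's exported API is visible

import (
	"testing"

	"example.com/project/foo" // hypothetical module path
)

func TestGreet_returnsGreeting(t *testing.T) {
	if got := foo.Greet("flintlock"); got != "hello flintlock" {
		t.Fatalf("unexpected greeting: %q", got)
	}
}
```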

### Black-box vs White-box

On a similar theme to the above, we are strongly opposed to white-box testing; none
should be added, in any circumstances. If you are tempted, resist. It always causes pain in the end.

## Mocking

We currently use [gomock](https://github.com/golang/mock) to generate interface mocks
for our own internal interfaces.

BUT gomock is really crap, so we are making a conscious effort to move
over to [counterfeiter](https://github.com/maxbrunsfeld/counterfeiter).

We shouldn't mock what we don't own. Instead we should create a light shim
around the external component; we integration test against the real thing and
unit test against the mocks of our delegating wrapper.

We should be conscious of overmocking: we should use mocks sparingly when calling
other internal interfaces which are not instrumental to the thing we are testing.
If the interface has a profound effect on the test scenario, we should favour
fakes, stubs and (more sparingly still) the real thing.
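
As a sketch of the shim idea (the package, interface and output path below are
illustrative, not flintlock's actual API), we own a narrow interface, generate
a fake for it with counterfeiter, and let a concrete wrapper delegate to the
real client:

```go
// Package image is an illustrative shim around the parts of an external
// client we actually use. Unit tests fake this narrow interface; integration
// tests exercise the concrete wrapper against the real component.
package image

import "context"

//go:generate counterfeiter -o fakes/fake_service.go . Service

// Service is the thin, project-owned interface which a concrete wrapper
// implements by delegating to the 3rd party client.
type Service interface {
	Pull(ctx context.Context, ref string) error
	Unpack(ctx context.Context, ref string) error
}
```

Unit tests then take the generated fake; integration tests run against the
concrete wrapper and the real component.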

## Refactoring

Refactoring tests should be done with a lighter hand than refactoring the actual
code. While it may make sense to DRY out some tests, if it comes at a heavy
expense of readability or accessibility, then it is not worth it.

## Ginkgo/Gomega

We use the standard Go `testing` package for tests.
We also use [Gomega](https://onsi.github.io/gomega/) matchers to make assertions.

Eg:

```go
Expect(err).NotTo(HaveOccurred()) // note: NotTo is preferred over ToNot
Expect(err).To(HaveOccurred())
Expect(err).To(MatchError("o noes"))

Expect(result.Value).To(Equal(expectedValue))
Expect(outcome).To(BeTrue())

Eventually(something, "10s").Should(Succeed()) // pass the function itself so Gomega can poll it
```

See the docs linked above for more examples.

Currently we do not use [Ginkgo](https://onsi.github.io/ginkgo/).
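
Because we do not use Ginkgo's runner, the matchers are bound to the standard
`testing.T` via `NewWithT`. A minimal sketch (`doSomething` is a stand-in for
the unit under test):

```go
package foo_test

import (
	"testing"

	. "github.com/onsi/gomega"
)

// doSomething is a stand-in for the unit under test.
func doSomething() (string, error) { return "expected", nil }

func TestDoSomething_returnsExpectedValue(t *testing.T) {
	// Bind the Gomega matchers to this test's *testing.T.
	g := NewWithT(t)

	result, err := doSomething()
	g.Expect(err).NotTo(HaveOccurred())
	g.Expect(result).To(Equal("expected"))
}
```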

## Table tests

Table tests are simultaneously very useful and the devil's work.

When used incorrectly they can lead to hard-to-read, hard-to-extend, complex
and increasingly brittle tests. A good rule of thumb is: if your test has as
much logic as, or more than, your actual code, then table tests are off the
table (heh heh).

Another one is: is this table being added because it
makes sense to group a bunch of identical tests together, or because the tests
don't feel DRY enough? As noted above, we prioritise readability and stability
over refactoring in tests.

Separate test functions are preferable over complex tables.

## Tests as documentation

A bonus feature of tests is that (when done correctly) they can serve as low
level documentation on how a package/interface/unit behaves.

This makes them a good starting point for new people coming to the codebase.
Human readable tests are also just nicer and easier to maintain.

All test functions must be clearly named, eg:
```go
// bad
func TestMyFunc_1(t *testing.T) {
func TestMyFunc_2(t *testing.T) {

// good
func TestMyFunc_happyPath_returnsFooBar(t *testing.T) {
func TestMyFunc_badInputX_returnsSomeError(t *testing.T) {
```

In table tests, the cases struct should have a `"message"` or `"name"` field,
eg:
```go
// bad
...
{
	val1:     1478,
	val2:     1834,
	expected: "nooneexpectsthespanishinquisition",
},
// ¯\_(ツ)_/¯
// what is going on here? do i have to scroll and read the test code like
// some kind of sucker?
...

// good
...
{
	name:     "when 'kittens' and 'puppies' expect 'cuteness'",
	val1:     "kitten",
	val2:     "puppy",
	expected: "wowsocute",
},
...
```
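
Putting it together, a sketch of a full (if silly) table test: `Cuteness` is a
stand-in defined inline so the snippet is self-contained, and each case name
becomes a subtest name via `t.Run`:

```go
package cute_test // hypothetical

import (
	"testing"

	. "github.com/onsi/gomega"
)

// Cuteness is a stand-in for the unit under test.
func Cuteness(a, b string) string {
	if a == "kitten" && b == "puppy" {
		return "wowsocute"
	}
	return "meh"
}

func TestCuteness(t *testing.T) {
	tt := []struct {
		name     string
		val1     string
		val2     string
		expected string
	}{
		{
			name:     "when 'kittens' and 'puppies' expect 'cuteness'",
			val1:     "kitten",
			val2:     "puppy",
			expected: "wowsocute",
		},
		{
			name:     "when inputs are not cute expect 'meh'",
			val1:     "spider",
			val2:     "wasp",
			expected: "meh",
		},
	}

	for _, tc := range tt {
		tc := tc // capture the range variable for older Go versions
		t.Run(tc.name, func(t *testing.T) {
			g := NewWithT(t)
			g.Expect(Cuteness(tc.val1, tc.val2)).To(Equal(tc.expected))
		})
	}
}
```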

## Test separation

Tests are grouped together, or separated, based on purpose.
For example, each `foo.go` unit of code should be accompanied by
a corresponding `foo_test.go` unit test file.

`foo.go` may not require an integration test or an e2e test, but if any behaviour
in `foo.go` _does_ require those, then they will live in a separate location away from the
unit. Eg. our e2es currently live in `test/e2e/`.

## Unit tests

All code changes must come with a corresponding unit test addition or change.

Unit tests should aim only to test their corresponding unit. While sometimes the
lesser evil is to use the real package_Y in package_Z_test, there is nothing
more annoying than making a change in package_X or package_W and seeing
package_Z's test fail inexplicably.

## Integration tests

Integration tests should be added to test a thin layer of interactions with 3rd
party services. Ours include Firecracker, Containerd and Netlink, as well as any
IO operations.

Not all code changes will require an integration test addition, but changes
touching one or more of those 3rd party components will.
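
One common way to keep such tests out of the default `go test ./...` run is a
build tag. This is an illustrative sketch only (the package name is made up,
and the project's actual convention lives in the Makefile):

```go
//go:build integration

// This file only compiles when the "integration" tag is set, e.g.
// `go test -tags integration ./...`, keeping slow tests that talk to real
// 3rd party services out of the default unit test run.
package containerdshim_test

import "testing"

func TestImagePull_againstRealContainerd(t *testing.T) {
	t.Skip("sketch only: this is where the real 3rd party client would be exercised")
}
```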

## End to end tests

E2es mimic the top-level user experience of the product. They are happy-path
CRUD (technically CRD) only and do not check finer variations of behaviour.

Not all code changes will require an E2E addition, but one should be added when
there is a change to the end user experience.

# Consequences

* All new code is merged with equivalent new tests covering all new behaviour
* Tests are human readable, accessible and easily extended
* Happy and unhappy behaviours are covered
* Code design coincidentally improves
* Coverage should always increase (the number is irrelevant, so long as it is
not zero)