Make failing tests possible. #121

kaeluka · 2015-04-15T23:08:33Z

Until now it has been possible to test only that something succeeds
(specifically, that the output matches exactly). This limited the
tests to cases where we had determinism and where the expected output
was exactly known.

This commit extends test.sh to maintain the old functionality, but now
ALSO allows checking scripts to be used instead of the old .out files.

A checking script is an executable whose name is that of the test, but
with the extension .chk: if the test source is called foo.enc, the
checking script is called foo.chk and must be executable.

The make test command will now compile foo.enc and try to run it
(if compilation succeeds). The output of both compilation and
execution will be piped into the checking script. The checking script
can analyse the input and return successfully to signal that
everything is ok, or with a failure code to signal an error.
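
As a rough illustration of the flow described above (not the actual code in test.sh), the per-test step could look like the sketch below. The function name `run_chk_test`, the way the compile command is passed in, and the assumption that compiling foo.enc produces an executable ./foo are all hypothetical.

```bash
# Hypothetical sketch of the per-test step; NOT the real test.sh.
# Usage: run_chk_test NAME COMPILE_CMD...
#   NAME         test name without extension (expects NAME.enc and NAME.chk)
#   COMPILE_CMD  command that compiles NAME.enc into ./NAME (an assumption)
run_chk_test() {
  local name=$1
  shift
  local output

  # Capture the compiler's stdout and stderr; run the program only if
  # compilation succeeded, and append its output as well.
  if output=$("$@" "$name.enc" 2>&1); then
    output+=$'\n'$("./$name" 2>&1)
  fi

  # Pipe everything into the checking script; its exit status decides
  # whether the test passes.
  if printf '%s\n' "$output" | "./$name.chk" > /dev/null; then
    echo "PASS: $name"
  else
    echo "FAIL: $name"
    return 1
  fi
}
```

With this shape, a test that is expected to fail at compile time still passes as long as foo.chk finds what it is looking for in the compiler output.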

An example:

File fail.enc:

class Main
  def main() : void
    repeat i <- 100 {
      print x -- error should occur here
    }

File fail.chk:

#!/usr/bin/env bash
stdin=$(cat)

# Search in stdin for a line that contains "line 4", the error line.
#
# If the line is not found, grep returns with failure code,
# calling the expression after '||', making the test fail:
echo "$stdin" | grep "line 4" || exit 1

# To make doubly sure, we also look for the specific message:
echo "$stdin" | grep "Unbound variable 'x'" || exit 1


supercooldave · 2015-04-16T07:27:10Z

This is a great first step, but I think it could be made easier for the writer of the tests. I for one would rather not write a bash script every time I need to do a test.

Also, it is not clear how to specify simply that the test fails, without having to match on some error message.

This is probably fine for now. But a future version would perhaps allow the tester to write a small spec outside of bash that says something like (and I'm making this up now)

success: some output  
warning: some compiler warning

or

fail: some compiler error

One possibility is to write these as a small Ruby DSL.

Anyway, these are just ideas for now.

kaeluka · 2015-04-16T07:33:28Z

@supercooldave: to specify simply that the test fails is to look for an error message. If there's no message, the test hasn't failed. But specifying only that it fails, no matter how, is dangerous: if we break a feature that your test uses, it will fail forever for the wrong reason, no longer testing what you wanted to test.

The tests would be more readable using a small library of pre-implemented functions, then you could write:

echo "$stdin" | fails_in_line 4
echo "$stdin" | fails_with "Unbound variable 'x'"

(or similar)
This would also make the tests easier to maintain after, for instance, the output format changes.
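
A minimal sketch of what such helpers might look like, assuming they read the combined output from stdin as in the two lines above. The names `fails_in_line` and `fails_with` come from this comment; the bodies, and the assumption that errors are reported as "line N", are illustrative only.

```bash
# Hypothetical helper library; the function bodies are assumptions.
# Each helper reads the compiler/program output from stdin.

# Succeeds if the output mentions the given line number, e.g. "line 4".
# -w matches whole words, so "line 4" does not match "line 42".
fails_in_line() {
  grep -q -w "line $1"
}

# Succeeds if the output contains the given message as a literal string.
fails_with() {
  grep -q -F -- "$1"
}
```

In a checking script, each call would still need `|| exit 1` (or a `set -e` at the top) so that a failed check fails the whole test. If the output format changes, only these two functions need updating.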

supercooldave · 2015-04-16T07:38:32Z

Even better, drop the echo $stdin | part and build that into the fails_ definition.
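
One way this could work, sketched under the assumption that the helpers cache stdin on first use so that several checks can run in sequence. The `_load_input` helper and the `CHK_INPUT` variable are hypothetical names.

```bash
# Hypothetical variant: the helpers read stdin themselves.
# stdin can only be consumed once, so cache it on first use.
_load_input() {
  [ -n "${CHK_INPUT+set}" ] || CHK_INPUT=$(cat)
}

# Succeeds if the cached output mentions the given line number.
fails_in_line() {
  _load_input
  grep -q -w "line $1" <<< "$CHK_INPUT"
}

# Succeeds if the cached output contains the given literal message.
fails_with() {
  _load_input
  grep -q -F -- "$1" <<< "$CHK_INPUT"
}
```

A checking script would then reduce to lines such as `fails_in_line 4 || exit 1`. The helpers use here-strings rather than pipes because a pipeline would run `_load_input` in a subshell and lose the cached input.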

One problem with equating error messages with failure is that warning messages are also emitted.

kaeluka closed this Apr 16, 2015

kaeluka mentioned this pull request Apr 16, 2015

Make failing tests possible, improve test output. #123

Merged
