Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposals for the test system #39

Open
cpockrandt opened this issue Jun 13, 2018 · 3 comments
Open

Proposals for the test system #39

cpockrandt opened this issue Jun 13, 2018 · 3 comments

Comments

@cpockrandt
Copy link
Collaborator

cpockrandt commented Jun 13, 2018

I would propose a few changes to the test system:

  1. Building tests creates new files in the repository.

    Since building a *.cpp.cmake file results in copying the file to *.cpp and inserting the types, it adds these files to the repo which doesn't seem to be a good style. Also changes to bits.hpp, uint128.hpp and structure_tree.hpp whenever including the library, make it confusing when working on library and committing code (but this is a separate issue). This could be fixed easily by only having one test suite (see my next point) and moving the types from *.typedef into *.cpp which would make *.cpp.cmake files obsolete.
    Here is a short article on separating build directories from the source folder.

  2. There are currently two test suites (test-sdsl and test-full-sdsl)

    I don't see a good reason for having a sparse test set. Changes in the library will in most cases be covered by a single typed test case (for debugging and local testing) and when a pull request is opened, the full test set should be run by CI anyway. I would propose to merge all types (and input files) into one test suite, include ccache for caching builds on TravisCI (which would reduce the build time further) and if necessary, ask TravisCI for an extension of the max. build time (last time we checked they did it for free for open source projects). We would then also include the SDSL in our Jenkins CI (for nightly builds and pull requests) which does not have a time limit.

  3. Building and execution of tests is currently done in an interleaved fashion

    When changes in the code somewhere results in a compiler error, it probably should be fixed first before one might start debugging runtime errors elsewhere.

  4. One test execution failing prevents building (and executing the rest of the test suite)

    Even if the tests wouldn't be interleaved anymore, it would be great if all tests were built and executed (even if some of them failed) to have a summary in the end which test cases are broken (in terms of compilation and runtime erorrs).

What are your views on this? I can prepare a pull request for this if you're interested/agree with these points. :-)

@mpetri
Copy link

mpetri commented Jun 16, 2018

I more than happy to looking at remodeling the tests. I agree there is lots of stuff not quite right. Some points

  1. The two test suits are there because TravisCI didn't support larger builds when we set this up a long time ago (or at least I couldn't see it). I think the full suit can take 2+ hours to run.

  2. We also build all the benchmarks and tutorial files as "compile" checks.

  3. I know a bunch of tools (e.g. firefox https://arewefastyet.com/ ) has these runtime benchmarks to make sure speed does not deteriorate when making changes. I always wanted to implement something like that. Maybe instead of speed looking at construction peak memory usage and file size as a sanity check?

  4. Would it be best to incorporate this in the cmake ctest framework?

  5. Some of the files needed for testing are currently hosted at KIT so I'm not sure how long they are still active. We might need to find a new host for those.

@cpockrandt
Copy link
Collaborator Author

cpockrandt commented Jun 25, 2018

  1. Currently every test is built twice (with and without in-memory). I think it should be sufficient to have one single test for in-memory functions. Without building and executing the in-memory tests, the entire test suite (which is the union of the small and large test suite) takes about 50 minutes on Travis (and we have a 2 hour time limit), so it should be sufficient for adding examples and tutorials to the tests, as well as an in-memory test.

    EDIT: 50 minutes for all tests (with and without in-memory)

  2. I don't understand the difference between tutorials and examples. Both folders each contain binaries that simply build a predefined vector and do some simple stuff like rank-queries, as well as more complex binaries with a command-line interface for building or searching indices. Maybe these two folders can be merged? Or is there a distinction? I'd also clean it up a bit and rename all those expl-*.cpp files if there are no objections.

  3. Benchmark tests would be nice, I agree (but I'd postpone it for now since this will already become quite a big pull request)

  4. Yes, I've done that. The test suite can also run in parallel now (there were sometimes clashes writing to files when executing tests in parallel).

  5. I guess we could host them on our servers if necessary. There are currently three files (faust, zarathustra and moby (int)). I think either faust or zarathustra would be sufficient for byte alphabets. faust and moby.int would be less than 500 kb in total (compressed). We could also store them in the repository even though this wouldn't be good style.

I'd also remove dependencies to libdivsufsort. We're currently adopting the code of libdivsufsort to a header-only-friendly format so that the code can be part of the sdsl without any pre-compiling, linking, etc. (which should be fine since it is published under the MIT license). Then we also wouldn't need any install scripts. Installing the header files could be done by cmake itself.

@mpetri
Copy link

mpetri commented Jul 6, 2018

  1. The tutorial is part of a slide deck linked in the README

  2. header only libdivsufsort would be nice. the best parallel SA construction algorithm is also divsufsort based so once the SA construction algorithm interface is clear those should be easily interchangeable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants