
docs: infra #183
Closed (wants to merge 1 commit)

Conversation

MCJOHN974

Here I wrote a description of the testing infrastructure we want to have. Right now it reads more like a proclamation of goals and a set of tasks (so maybe it is worth making it an issue rather than a document file?). As tasks are completed, the document will contain fewer tasks and more descriptions of how the infrastructure actually works.

MCJOHN974 requested a review from aborg-dev on January 8, 2024 at 13:20
MCJOHN974 (Author) commented Jan 8, 2024

Also, I don't think it is a good idea to box ourselves in with a detailed technical description before writing the code; it is better to describe the result we want to achieve and add the technical details along the way.

@MCJOHN974 (Author)

If nobody has any notes or suggestions about this doc, I will create an issue duplicating the todo points from here, suggest merging this doc, and later replace the todo points with a description of the actual infrastructure.

MCJOHN974 requested a review from mooori on January 8, 2024 at 14:54
@aborg-dev

> If nobody has any notes or suggestions about this doc, I will create an issue duplicating the todo points from here, suggest merging this doc, and later replace the todo points with a description of the actual infrastructure.

Sorry for the slow review on my end, just got to this. Will review the content in full detail tomorrow morning.

> Here I wrote a description of the testing infrastructure we want to have. Right now it reads more like a proclamation of goals and a set of tasks (so maybe it is worth making it an issue rather than a document file?). As tasks are completed, the document will contain fewer tasks and more descriptions of how the infrastructure actually works.

Yes, we can start with a more flexible document (e.g. hackmd or a Google Doc) to facilitate the discussion and then gradually convert it to a GitHub document as we implement the components of the solution.

> Also, I don't think it is a good idea to box ourselves in with a detailed technical description before writing the code; it is better to describe the result we want to achieve and add the technical details along the way.

The intention is not to constrain the implementation, but rather to go through a mental exercise to identify what we want to build and how we plan to do it. This exercise is useful because it lets us identify potential challenges early on without actually having to build anything in code (which would be much more expensive). It also yields an initial action plan that we can use to estimate how much time it will take to build the solution and to identify pieces that we can build in parallel. You are right that we will likely deviate from this plan as we discover unknowns, and that is fine; we would still likely save time in the end by having selected the most promising plan.
Design documents are a common practice in the industry, e.g. see this article [1] for examples.

[1] https://www.industrialempathy.com/posts/design-docs-at-google/

@aborg-dev left a comment


I've added a few comments that mainly aim to start the design discussion - I think it would be useful to iterate on them a bit before trying to come up with a fully revised version of this document, as there is a common theme between them.
As I've mentioned earlier, feel free to use Hackmd/Google docs for this stage if that works better.


> Firstly, we aim to implement the following types of tests and benchmarks:
> - Wasm spectest: Programs generated from Wasm spectest.
> - Benchmarks: Programs we manually write in Wasm or Rust and compile to Wasm.


We can keep benchmarks out of the scope of this proposal and delegate it to #182.

> - Wasm spectest: Programs generated from Wasm spectest.
> - Benchmarks: Programs we manually write in Wasm or Rust and compile to Wasm.
>
> Both types of programs will undergo the same pipeline:


It's true that so far, these two types of programs have been handled similarly, but going forward, this is likely to change to give a better experience with both of these tools.
For example, tests must be easy and fast to run and must present aggregated reports across many tests. Benchmarks, on the other hand, can take a long time to run, will be compiled with different optimization flags, and will likely become parameterized by the size of the input (e.g. N-th Fibonacci number with N = 1..10^6).

So in this document, I would focus specifically on our goals for the test suite and leave benchmarks out of scope.
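For illustration only, here is a minimal sketch of what such a parameterized benchmark source could look like, assuming benchmarks are written in Rust and compiled to Wasm as described above; the exported `fib` entry point and the idea of passing `N` from the harness are assumptions, not anything this PR defines:

```rust
// Hypothetical benchmark source: compiled to Wasm (and then to zkasm), with N
// supplied by the benchmark harness (e.g. N = 1..10^6) rather than hardcoded.
#[no_mangle]
pub extern "C" fn fib(n: u64) -> u64 {
    // Iterative form keeps the generated Wasm small and the runtime linear in N.
    let (mut a, mut b) = (0u64, 1u64);
    for _ in 0..n {
        let next = a.wrapping_add(b);
        a = b;
        b = next;
    }
    a
}
```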

> Benchmarks also serve as global tests.
>
> ## Test Creation and Generation
> We use a snapshot-based pattern for generating Wasm from spectest wast files. Benchmarks are added manually.


I think we should re-consider this design decision and explore what our other options are:

  • [Current] Generate WAT files from WAST files and run them through WAT -> ZKASM pipeline
  • Generate ZKASM files directly from WAST files and don't commit WAT files to Git
  • Don't commit ZKASM files to Git and instead pass generated ZKASM code directly to JS test runner

We need to understand the pros and cons of each approach and see which one is the most appropriate for our goals.

@MCJOHN974 (Author)


I created an issue where I suggest discussing whether we should get rid of snapshot testing: #188
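To ground the snapshot discussion, here is a minimal sketch of the check behind the [Current] option, using only the Rust standard library; the `UPDATE_SNAPSHOTS` variable and the `generate_wat_from_wast` helper are hypothetical placeholders rather than the project's actual code:

```rust
use std::{env, fs, path::Path};

/// Placeholder for the real WAST -> WAT generation step; the real version
/// would return a (file name, WAT text) pair for every module in the spectest.
fn generate_wat_from_wast(_wast: &str) -> Vec<(String, String)> {
    Vec::new()
}

/// Snapshot pattern: in CI, fail if committed files differ from what the
/// generator produces; locally, regenerate them when UPDATE_SNAPSHOTS is set.
fn check_or_update_snapshots(wast_path: &Path, out_dir: &Path) -> Result<(), String> {
    let wast = fs::read_to_string(wast_path).map_err(|e| e.to_string())?;
    let update = env::var("UPDATE_SNAPSHOTS").is_ok();
    for (name, generated) in generate_wat_from_wast(&wast) {
        let snapshot = out_dir.join(&name);
        if update {
            fs::write(&snapshot, &generated).map_err(|e| e.to_string())?;
        } else {
            let committed = fs::read_to_string(&snapshot).unwrap_or_default();
            if committed != generated {
                return Err(format!("snapshot {} is out of date", snapshot.display()));
            }
        }
    }
    Ok(())
}
```

The same shape applies to the other two options; what changes is whether the generated artifact (WAT or ZKASM) is committed and diffed, or handed straight to the test runner.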

> Our goal is to simplify adding benchmarks (simply add a `.wat` file to the benchmarks directory, and it automatically goes through the pipeline) and to automatically pull all updates from Wasm spectests.
>
> ## Wasm Compiling
> We aim to have a single test that compiles all `.wat` files to zkasm and writes results to a CSV file in a snapshot-based pattern. This script should be capable of:


What do you mean by "single test" here? A single Rust test case? Or a single shell script to run tests?
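For concreteness, one possible reading of "single test" is a single Rust `#[test]` that walks a test directory, compiles every `.wat` file, and records a pass/fail CSV to be snapshot-checked; everything below (the directory path, the `compile_wat_to_zkasm` helper, and the CSV columns) is a hypothetical sketch, not the existing code:

```rust
use std::{fs, path::PathBuf};

/// Placeholder for the actual WAT -> zkasm compilation entry point.
fn compile_wat_to_zkasm(_wat: &str) -> Result<String, String> {
    Err("illustrative placeholder".to_string())
}

#[test]
fn compile_all_wat_files() {
    let dir = PathBuf::from("tests/wat"); // assumed location of the .wat test programs
    let mut csv = String::from("test,status\n");
    for entry in fs::read_dir(&dir).expect("test directory exists") {
        let path = entry.expect("readable directory entry").path();
        if path.extension().and_then(|e| e.to_str()) != Some("wat") {
            continue;
        }
        let wat = fs::read_to_string(&path).expect("readable .wat file");
        let status = match compile_wat_to_zkasm(&wat) {
            Ok(_zkasm) => "pass",
            Err(_) => "compile_error",
        };
        csv.push_str(&format!("{},{}\n", path.display(), status));
    }
    // Compare `csv` against the committed CSV snapshot here (or rewrite the
    // snapshot when an UPDATE_SNAPSHOTS-style flag is set).
}
```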

> - Compiling all files and checking if the results are the same as the committed files (used in CI).
>
> ## Running Compiled zkasm
> We intend to implement a single command to run all zkasm and update the state table in a snapshot-based pattern. This script should be capable of:


Again, let's focus on tests here. We want a command that runs all tests, but not necessarily the (slow) benchmarks.
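As a rough sketch of what a "command that runs all tests" could look like, here is a loop that feeds every generated `.zkasm` file to an external runner; the directory layout and the `node run_zkasm.js` invocation are invented placeholders for whichever JS runner the project actually uses, and benchmarks are assumed to live elsewhere so they are not picked up:

```rust
use std::{fs, path::PathBuf, process::Command};

fn run_all_zkasm_tests() -> Result<(), String> {
    // Assumed directory containing the compiled test programs (not benchmarks).
    let dir = PathBuf::from("tests/zkasm");
    for entry in fs::read_dir(&dir).map_err(|e| e.to_string())? {
        let path = entry.map_err(|e| e.to_string())?.path();
        if path.extension().and_then(|e| e.to_str()) != Some("zkasm") {
            continue;
        }
        // Placeholder invocation of the external JS test runner.
        let status = Command::new("node")
            .arg("run_zkasm.js")
            .arg(&path)
            .status()
            .map_err(|e| e.to_string())?;
        if !status.success() {
            return Err(format!("{} failed", path.display()));
        }
    }
    Ok(())
}

fn main() {
    if let Err(e) = run_all_zkasm_tests() {
        eprintln!("error: {e}");
        std::process::exit(1);
    }
}
```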

> ## Accepting PRs
> Our CI checks that tests and benchmarks are correct, that committed zkasm files correspond to files generated by the compiler, and that state files contain actual data.
>
> We plan to implement a tool that automatically checks for any degradation in state files, such as performance slowdowns or failing tests that previously passed.


Let's ignore the performance aspect for now and focus on correctness.

Right now, the review of changes to CSV snapshots is a manual process done by human reviewers. We want to move to a world where this part of the review is done automatically. Can we try to summarize here the rules that such a bot would use to better understand which information we need to track?

This bot should also handle a few exceptional situations:

  • Adding a new spec test or manual WAT test (with possibly broken tests in it)
  • Dealing with temporarily broken tests (e.g. the case we had during i32 migration or when we add a new spec test that has some sub-tests passing, but that subset changes as we implement the necessary features)
  • Removing a manual WAT test
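To make the bot rules concrete, here is a minimal sketch of the core regression check, assuming a two-column `test,status` CSV state file; the format, the column names, and the status values are assumptions for illustration only:

```rust
use std::collections::HashMap;

/// Parse a `test,status` CSV (with a header line) into a map.
fn parse_state(csv: &str) -> HashMap<&str, &str> {
    csv.lines()
        .skip(1) // skip the header row
        .filter_map(|line| {
            let (test, status) = line.split_once(',')?;
            Some((test.trim(), status.trim()))
        })
        .collect()
}

/// Core rule: flag every test that passed on the base branch but no longer
/// passes in the PR. New and removed tests are reported separately so the
/// exceptional situations above can be confirmed by a human.
fn find_regressions(base_csv: &str, pr_csv: &str) -> Vec<String> {
    let base = parse_state(base_csv);
    let pr = parse_state(pr_csv);
    let mut findings = Vec::new();
    for (test, old_status) in &base {
        match pr.get(test) {
            Some(new_status) if *old_status == "pass" && *new_status != "pass" => {
                findings.push(format!("regression: {test} was pass, now {new_status}"));
            }
            None => findings.push(format!("removed test: {test} (needs manual confirmation)")),
            _ => {}
        }
    }
    for test in pr.keys().filter(|t| !base.contains_key(*t)) {
        findings.push(format!("new test: {test} (may legitimately start as failing)"));
    }
    findings
}
```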

@MCJOHN974 (Author)

Moved the doc to hackmd (please ping me if you can't view/edit it).
A short summary of the changes I made:

  • removed benchmarks from the scope
  • added a more detailed description of the PR bot
  • raised a question about handwritten tests
  • raised a question about snapshot tests (#188, "Should we use snapshots in tests"; not covered in the doc until a decision is made)


aborg-dev commented Jan 9, 2024

> Moved the doc to hackmd (please ping me if you can't view/edit it).

Good stuff! I've added a few comments to hackmd document.

@MCJOHN974 (Author)

Delegated to #215

MCJOHN974 closed this on Feb 13, 2024