docs: infra #183
Conversation
And I don't think it is a good idea to box ourselves in by writing a detailed technical description before writing the code; it is better to describe the result we want to achieve and add technical details along the way.
If nobody has any notes or suggestions about this doc, I will create issues duplicating the TODO points from here, suggest merging this doc, and later replace the TODO points with descriptions of the actual infrastructure.
Sorry for the slow review on my end, just got to this. Will review the content in full detail tomorrow morning.
Yes, we can start with a more flexible document (e.g. hackmd or Google Doc) to facilitate a discussion and then gradually convert it to a GitHub document as we implement the components of the solution.
The intention is not to constrain the implementation, but rather to go through a mental exercise to identify what we want to build and how we plan to do it. This exercise is useful as it allows us to identify potential challenges early on without actually having to build anything in code (which would be much more expensive). It also yields an initial action plan that we can use to estimate how much time it will take to build the solution and to identify any pieces that we can build in parallel. You are right that we will likely deviate from this plan after discovering something unknown, and that is fine; we would still likely save time in the end by selecting the most promising plan. [1] https://www.industrialempathy.com/posts/design-docs-at-google/
I've added a few comments that mainly aim to start the design discussion - I think it would be useful to iterate on them a bit before trying to come up with a fully revised version of this document, as there is a common theme between them.
As I've mentioned earlier, feel free to use Hackmd/Google docs for this stage if that works better.
Firstly, we aim to implement the following types of tests and benchmarks:
- Wasm spectest: Programs generated from Wasm spectest.
- Benchmarks: Programs we manually write in Wasm or Rust and compile to Wasm.
We can keep benchmarks out of the scope of this proposal and delegate them to #182.
Both types of programs will undergo the same pipeline:
It's true that so far, these two types of programs have been handled similarly, but going forward, this is likely to change to give a better experience with both of these tools.
For example, tests must be easy and fast to run and must present aggregated reports across many tests. Benchmarks, on the other hand, can take a long time to run, will be compiled with different optimization flags, and will likely become parameterized by the size of the input (e.g. N-th Fibonacci number with N = 1..10^6).
So in this document, I would focus specifically on our goals for the test suite and leave benchmarks out of scope.
Benchmarks also serve as global tests.
## Test Creation and Generation
We use a snapshot-based pattern for generating Wasm from spectest wast files. Benchmarks are added manually.
I think we should re-consider this design decision and explore what our other options are:
- [Current] Generate WAT files from WAST files and run them through WAT -> ZKASM pipeline
- Generate ZKASM files directly from WAST files and don't commit WAT files to Git
- Don't commit ZKASM files to Git and instead pass generated ZKASM code directly to JS test runner
We need to understand the pros and cons of each approach and see which one is the most appropriate for our goals.
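To make the trade-offs concrete, here is a minimal sketch of the snapshot check in the current (first) option, where generated ZKASM is compared against files committed to Git. The `wast_to_wat` and `wat_to_zkasm` helpers are hypothetical placeholders for the real conversion steps, not actual APIs from this repository:

```rust
use std::fs;
use std::path::Path;

// Hypothetical placeholders for the real conversion steps.
fn wast_to_wat(_wast: &str) -> Result<Vec<String>, String> {
    unimplemented!("convert a WAST script into one or more WAT modules")
}
fn wat_to_zkasm(_wat: &str) -> Result<String, String> {
    unimplemented!("compile a WAT module to ZKASM")
}

/// Regenerates ZKASM from a WAST file and checks it against the committed snapshot.
fn check_snapshot(wast_path: &Path, snapshot_path: &Path) -> Result<(), String> {
    let wast = fs::read_to_string(wast_path).map_err(|e| e.to_string())?;
    let generated = wast_to_wat(&wast)?
        .iter()
        .map(|wat| wat_to_zkasm(wat))
        .collect::<Result<Vec<_>, _>>()?
        .join("\n");
    let committed = fs::read_to_string(snapshot_path).map_err(|e| e.to_string())?;
    if generated != committed {
        return Err(format!("snapshot out of date: {}", snapshot_path.display()));
    }
    Ok(())
}
```

In the second option the intermediate WAT files would never be committed, and in the third option the `generated` string would be handed directly to the JS test runner instead of being compared against a committed file.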
I created an issue where I suggest discussing whether we should get rid of snapshot testing: #188
Our goal is to simplify adding benchmarks (simply add a `.wat` file to the benchmarks directory, and it automatically goes through the pipeline) and to automatically pull all updates from Wasm spectests.
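For illustration, the "drop a `.wat` file into the directory" workflow could be implemented by discovering inputs with a directory scan instead of an explicit list. A small sketch, where the directory name is an assumption:

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Collects every `.wat` file in the given directory (e.g. an assumed
/// `benchmarks/` folder), so that a newly added file is picked up by the
/// pipeline without touching any test code.
fn discover_wat_files(dir: &str) -> io::Result<Vec<PathBuf>> {
    let mut files: Vec<PathBuf> = fs::read_dir(dir)?
        .filter_map(|entry| entry.ok().map(|e| e.path()))
        .filter(|path| path.extension().map_or(false, |ext| ext == "wat"))
        .collect();
    files.sort(); // deterministic order keeps snapshot diffs stable
    Ok(files)
}
```

Each discovered path would then be fed into the compile step described in the next section.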
## Wasm Compiling
We aim to have a single test that compiles all `.wat` files to zkasm and writes results to a CSV file in a snapshot-based pattern. This script should be capable of:
What do you mean by "single test" here? A single Rust test case? Or a single shell script to run tests?
- Compiling all files and checking if the results are the same as the committed files (used in CI).
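As one hypothetical shape for the "results to a CSV file" part, the test could record one row per input file and let CI diff the rendered table against the committed snapshot; the column names below are assumptions, not the repository's actual format:

```rust
use std::fmt::Write as _;

/// One row of the assumed status table.
struct Row {
    file: String,
    status: String,  // e.g. "ok" or "compilation failed"
    details: String, // error message, or empty on success
}

/// Renders the rows as CSV text; CI can compare this against the committed file.
/// (Escaping of commas and quotes is omitted for brevity.)
fn to_csv(rows: &[Row]) -> String {
    let mut out = String::from("file,status,details\n");
    for row in rows {
        let _ = writeln!(out, "{},{},{}", row.file, row.status, row.details);
    }
    out
}
```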
## Running Compiled zkasm
We intend to implement a single command to run all zkasm and update the state table in a snapshot-based pattern. This script should be capable of:
Again, let's focus on tests here. We want a command that runs all tests, but not necessarily the (slow) benchmarks.
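To sketch what "a single command that runs all tests" could look like, here is a hedged example of a thin wrapper that invokes the zkasm executor once per compiled file and aggregates the results. The `run-zkasm` command is a stand-in for whatever executor entry point (e.g. the JS prover) we actually use, not a real CLI:

```rust
use std::path::Path;
use std::process::Command;

/// Runs one compiled zkasm file through the (assumed) executor command
/// and reports whether it exited successfully.
fn run_zkasm_test(path: &Path) -> bool {
    Command::new("run-zkasm") // placeholder for the real executor invocation
        .arg(path)
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}

/// Runs every test and returns (passed, total), so CI gets a single summary
/// without having to run the slow benchmarks.
fn run_all(paths: &[&Path]) -> (usize, usize) {
    let mut passed = 0;
    for path in paths {
        if run_zkasm_test(path) {
            passed += 1;
        } else {
            eprintln!("FAILED: {}", path.display());
        }
    }
    (passed, paths.len())
}
```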
## Accepting PRs
Our CI checks that tests and benchmarks are correct, that committed zkasm files correspond to files generated by the compiler, and that state files contain up-to-date data.
We plan to implement a tool that automatically checks for any degradation in state files, such as performance slowdowns or failing tests that previously passed.
Let's ignore the performance aspect for now and focus on correctness.
Right now, the review of changes to CSV snapshots is a manual process done by human reviewers. We want to move to a world where this part of the review is done automatically. Can we try to summarize here the rules that such a bot would use, to better understand which information we need to track? (A rough sketch follows the list below.)
This bot should also handle a few exceptional situations:
- Adding a new spec test or manual WAT test (with possibly broken tests in it)
- Dealing with temporarily broken tests (e.g. the case we had during i32 migration or when we add a new spec test that has some sub-tests passing, but that subset changes as we implement the necessary features)
- Removing a manual WAT test
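As a starting point for that summary, here is a rough sketch of the comparison rules such a bot might apply, assuming the state file parses into a map from test name to status; the status values and the exact policy are assumptions to be refined in review:

```rust
use std::collections::HashMap;

/// Assumed per-test status as recorded in the state CSV.
#[derive(Clone, Copy, PartialEq)]
enum Status {
    Pass,
    Fail,
}

/// Compares the committed state with the newly generated one. Sketched rules:
/// - a previously passing test that now fails is a regression and should block the PR;
/// - a brand-new test may start out failing (e.g. a freshly added spec test);
/// - a removed test is only reported, so a reviewer can confirm the removal was intentional.
fn review(old: &HashMap<String, Status>, new: &HashMap<String, Status>) -> Vec<String> {
    let mut findings = Vec::new();
    for (name, new_status) in new {
        match old.get(name) {
            Some(Status::Pass) if *new_status == Status::Fail => {
                findings.push(format!("regression: {name} was passing and now fails"));
            }
            None => findings.push(format!("new test: {name} (initial failures allowed)")),
            _ => {}
        }
    }
    for name in old.keys() {
        if !new.contains_key(name) {
            findings.push(format!("removed test: {name} (needs reviewer confirmation)"));
        }
    }
    findings
}
```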
Moved the doc to hackmd (please ping me if you can't view/edit).
Good stuff! I've added a few comments to the hackmd document.
Delegated to #215
Here I wrote a description of the testing infrastructure we want to have. Right now it reads more like a proclamation of goals and a set of tasks (so maybe it is worth making it an issue rather than a document file?). As the tasks are completed, the document will contain fewer tasks and more descriptions of how the infrastructure works.