R4R: Multi-seed parallel simulation #2313
While we're at it, can we also get make test_gaia_sim_fast to use a different seed on each run? That way the randomized testing would cover a wider surface. (The procedure on finding a bug would be to test on develop as well.)
Maybe. I like having deterministic failures; they make CI easier to work with. Definitely in favor of running lots of random seeds before any release.
It seems some of the Amino serialization code is not goroutine-safe, or we're using singleton codecs which aren't goroutine-safe. Presently running multiple
Not sure why we need this in Go; running multiple simulations in parallel within bash works fine on my machine.
Immediately, probably not, but I could see it being advantageous in the future: running multiple instances of different Gaia apps during version upgrades, or maybe multiple "Apps" at once for a very fast IBC relay program.
I think we can punt figuring out how to safely test multiple chains together to post-launch. For now, I think we should go with parallel calls within CI / bash. I tested on my system as well, and your analysis seems right: it's probably a singleton map instance somewhere.
Codecov Report
@@          Coverage Diff          @@
##          develop    #2313   +/- ##
=====================================
  Coverage   61.53%   61.53%
=====================================
  Files         122      122
  Lines        7472     7472
=====================================
  Hits         4598     4598
  Misses       2554     2554
  Partials      320      320
OK, I'm not entirely sure where the issue is so it might not be an easy fix - I still think we should find out though and make an informed decision. Switched to a bash script for now.
This should be run as the "24-hour simulation" before cutting a release, possibly with more seeds / a higher number of blocks - and on a large multi-core machine.
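For readers following along, here is a minimal sketch of the "parallel calls within bash" approach discussed above. It is not the script from this PR: run_sim and run_parallel are hypothetical names, and run_sim uses only a subset of the flags shown in the reviewed line of scripts/multisim.sh. Each seed runs as a background job, and the caller waits on every job so that one failing seed does not hide the others.

```shell
# Hypothetical wrapper around one simulation run (assumption: a trimmed-down
# version of the command under review; swap in the full command as needed).
run_sim() {
  go test ./cmd/gaia/app -run TestFullGaiaSimulation \
    -SimulationEnabled=true -SimulationSeed="$1" -v -timeout 24h
}

# run_parallel OUTDIR SEED...
# Starts one background job per seed, writes each job's output to
# OUTDIR/seed-<seed>.stdout, and returns 1 if any seed failed.
run_parallel() {
  outdir="$1"; shift
  jobs_list=""
  for seed in "$@"; do
    run_sim "$seed" > "$outdir/seed-$seed.stdout" 2>&1 &
    jobs_list="$jobs_list $!:$seed"   # remember the pid:seed pair
  done
  failed=0
  for job in $jobs_list; do
    pid="${job%%:*}"
    seed="${job#*:}"
    # wait returns the exit status of that specific background job
    if ! wait "$pid"; then
      echo "Seed $seed failed"
      failed=1
    fi
  done
  return "$failed"
}
```

A call such as run_parallel "$(mktemp -d)" 1 2 3 5 8 would start five runs at once; the seed values here are purely illustrative.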
utACK, just left a minor comment. Side note: are we not afraid we'll miss some potential bug discoveries by not using random seeds, or will the fast sim use random seeds?
scripts/multisim.sh
echo "Running full Gaia simulation with seed $seed. This may take awhile!"
file="$tmpdir/gaia-simulation-seed-$seed-date-$(date -Iseconds -u).stdout"
echo "Writing stdout to $file..."
go test ./cmd/gaia/app -run TestFullGaiaSimulation -SimulationEnabled=true -SimulationNumBlocks=1000 -SimulationVerbose=true -SimulationCommit=true -SimulationSeed=$seed -v -timeout 24h > $file
Can we break this command across multiple lines with \ to make it easier to read?
Yes, done.
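As an illustration of the requested change, the command might be split with trailing backslashes roughly as follows. This is a sketch, not the text that was actually committed: the run_full_sim wrapper is a hypothetical name added so the snippet is self-contained, while the flags are copied from the single-line version under review.

```shell
# run_full_sim SEED FILE
# Hypothetical wrapper: same go test invocation as the reviewed line,
# one flag per line, with stdout redirected to FILE.
run_full_sim() {
  seed="$1"
  file="$2"
  go test ./cmd/gaia/app \
    -run TestFullGaiaSimulation \
    -SimulationEnabled=true \
    -SimulationNumBlocks=1000 \
    -SimulationVerbose=true \
    -SimulationCommit=true \
    -SimulationSeed="$seed" \
    -v -timeout 24h > "$file"
}
```

Quoting "$seed" and "$file" is an addition here; it only matters if the temporary directory path contains spaces.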
We should have a separate script which sequentially runs simulations with new random seeds and stops if it finds a failing seed - let's do that separately though - #2409.
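A rough sketch of what such a script could look like, under stated assumptions: run_sim, random_seed and find_failing_seed are hypothetical names, the seed is drawn as a 32-bit unsigned integer from /dev/urandom, and the run count is capped so the loop terminates. It does not reflect whatever was eventually implemented for #2409. Printing the seed before each run keeps a failure reproducible.

```shell
# Hypothetical wrapper around one simulation run (assumption: subset of the
# flags from the command reviewed in this PR).
run_sim() {
  go test ./cmd/gaia/app -run TestFullGaiaSimulation \
    -SimulationEnabled=true -SimulationSeed="$1" -v -timeout 24h
}

# Print a random unsigned 32-bit integer read from /dev/urandom.
random_seed() {
  od -An -N4 -tu4 /dev/urandom | tr -d ' \n'
}

# find_failing_seed OUTDIR MAX_RUNS
# Runs simulations one after another with fresh seeds. Stops and returns 1
# at the first failing seed; returns 0 if all MAX_RUNS runs pass.
find_failing_seed() {
  outdir="$1"
  max_runs="$2"
  n=0
  while [ "$n" -lt "$max_runs" ]; do
    seed="$(random_seed)"
    echo "Run $((n + 1)): seed $seed"
    if ! run_sim "$seed" > "$outdir/sim-seed-$seed.stdout" 2>&1; then
      echo "Failing seed: $seed"
      return 1
    fi
    n=$((n + 1))
  done
  return 0
}
```

The failing seed can then be passed back via -SimulationSeed to reproduce the failure deterministically, which keeps the property valued earlier in the thread.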
Targeted PR against correct branch (see CONTRIBUTING.md)
Linked to github-issue with discussion and accepted design OR link to spec that describes this work.
Wrote tests
Updated relevant documentation (docs/)
Added entries in PENDING.md with issue #
rereviewed Files changed in the github PR explorer
Ref #1924
For Admin Use: