System tests to cover chain upgraded #1480

Closed
2 of 3 tasks
alpe opened this issue Jul 4, 2023 · 8 comments

Comments

@alpe
Contributor

alpe commented Jul 4, 2023

Our current system tests and integration tests do not cover the migration of legacy state to the current version. This is a blind spot that was covered by community testnets and their feedback.
Unfortunately, that has become a bottleneck, so we need to automate this and own it in the project.

The system tests can now easily set up a multi-node environment that can be used for this.

Preconditions:

  • published artifacts of the previous version(s)
  • a contract for smoke tests that has data persisted, supports a smart query and can trigger a message
  • a rich state dump that seeds the chain (ideally coming from some production environment)

The chain upgrade test would then:

  • start a multi node chain with an older version and the seed state from the dump
  • submit an upgrade proposal to the current version and vote yes (the voting period must be reduced, or an instant proposal type used in SDK 50)
  • halt the chain when upgrade height is reached
  • replace the binary used with the current version
  • do some smoke tests to ensure system is still healthy
@Reecepbcups

@alpe I'm happy to take this on if you all are okay with using Strangelove's interchain test?

https://github.com/strangelove-ventures/interchaintest

@alpe
Contributor Author

alpe commented Jul 18, 2023

@Reecepbcups thanks a lot for volunteering! I would be happy to have more people looking into this!

We had some issues with running docker based tests on CircleCI before. Therefore I would like to avoid this and the complexity that comes with it. We can use our system test framework instead. It should be straightforward, as it works only with a compiled binary, which can now easily be replaced. There is no need for multiple chains or relayers for this kind of test.

First step would be publishing a wasmd linux binary on every release. Then we need a "good" state dump for wasmd with a contract that we know and that we can query from a test.
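
The idea of replacing the compiled binary under a node could be sketched roughly as below. This is only an illustration and not the system test framework's actual code: the `Node` type and its methods are hypothetical names, it assumes a Unix-like host, and it assumes the node process exits on SIGTERM.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// Node is a hypothetical wrapper around a single node process.
// It is not part of the wasmd system test framework.
type Node struct {
	Binary string   // path to the binary, e.g. a legacy or current wasmd build
	Args   []string // CLI arguments passed on every start
	cmd    *exec.Cmd
}

// Running reports whether a process has been started and not yet stopped.
func (n *Node) Running() bool { return n.cmd != nil }

// Start launches the configured binary as a child process.
func (n *Node) Start() error {
	if n.cmd != nil {
		return errors.New("node is already running")
	}
	cmd := exec.Command(n.Binary, n.Args...)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	if err := cmd.Start(); err != nil {
		return fmt.Errorf("start %s: %w", n.Binary, err)
	}
	n.cmd = cmd
	return nil
}

// Stop asks the process to terminate and waits for it to exit.
// A process that has already exited on its own is tolerated.
func (n *Node) Stop() error {
	if n.cmd == nil {
		return nil
	}
	err := n.cmd.Process.Signal(syscall.SIGTERM)
	if err != nil && !errors.Is(err, os.ErrProcessDone) {
		return fmt.Errorf("signal %s: %w", n.Binary, err)
	}
	_ = n.cmd.Wait() // a non-zero exit status is expected after a signal
	n.cmd = nil
	return nil
}

// RestartWith stops the node, swaps the binary and starts it again
// with the same arguments (and therefore the same home directory).
func (n *Node) RestartWith(binary string) error {
	if err := n.Stop(); err != nil {
		return err
	}
	n.Binary = binary
	return n.Start()
}
```

Because the arguments stay the same across `RestartWith`, the new binary would pick up the state written by the old one, which is the point of an upgrade test.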

@Anmol1696

I think there can be a couple of ways to achieve this.

  1. Remote docker containers. We can use the CI to trigger spinning up the infra, or multi-node docker containers in a remote k8s cluster, and then run tests against the setup to perform the chain upgrades or multi-node tests. This is something Starship is capable of and built for.
  2. Bring in the same paradigms of docker based upgrade testing directly as a system test as you mention. The idea here would be, we use something like cosmovisor to run the genesis chains and have the upgrade binaries in upgrade dirs, and have the system controller run and trigger upgrades.
  3. Just do it ourselves directly in the system test runner without cosmovisor. We could just call the upgrade keepers if we have access to App itself.

If you wouldn't mind, I can try out both approaches too. It would be really nice to be able to run system tests against an actual e2e testing framework, both InterchainTest and Starship.

@alpe
Contributor Author

alpe commented Jul 18, 2023

Thanks @Anmol1696 for your thoughts and proposals!
I would prefer if we can focus on option 3) to have a slim setup and expand on this later. No need for cosmovisor as we can fully control the nodes from Go. There may be some updates required to the system.go but it should not be too complicated to stop a chain on some height and restart it with a new binary.
The system tests do not have access to the upgrade keeper. Everything has to go through the system as a black box. Therefore some modifications to the gov setup in genesis would be required for short voting periods.
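
A minimal sketch of such a genesis modification, using only the Go standard library, could look like this. The helper name `SetGenesisString` is made up for illustration, and the key path to the voting period is an assumption that differs between SDK versions, so it is passed in by the caller rather than hard coded.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// SetGenesisString is a hypothetical helper that sets a string value at a
// nested key path in a genesis JSON document and returns the new document.
// All keys except the last one must already exist and be JSON objects.
func SetGenesisString(genesis []byte, value string, path ...string) ([]byte, error) {
	if len(path) == 0 {
		return nil, fmt.Errorf("empty key path")
	}
	dec := json.NewDecoder(bytes.NewReader(genesis))
	dec.UseNumber() // keep numeric literals as written instead of float64
	var doc map[string]any
	if err := dec.Decode(&doc); err != nil {
		return nil, fmt.Errorf("parse genesis: %w", err)
	}
	cur := doc
	for _, key := range path[:len(path)-1] {
		next, ok := cur[key].(map[string]any)
		if !ok {
			return nil, fmt.Errorf("genesis key %q not found or not an object", key)
		}
		cur = next
	}
	cur[path[len(path)-1]] = value
	return json.Marshal(doc)
}
```

A call such as `SetGenesisString(raw, "10s", "app_state", "gov", "params", "voting_period")` would shorten the voting period, assuming the genesis of the legacy binary stores it under that path. Note that `json.Marshal` writes object keys in sorted order, so the output is equivalent JSON but not byte-identical to the input.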

@Anmol1696

Yup that would make sense. Could we define the scope of the system test though?
You need multi-node, upgradeable system tests. I will be working on something very similar for Starship as well, so maybe I can have a look at this. What is the priority and timeline for this?

I think you are right about flaky docker setups. Even with Starship local testing, docker and k8s both seem to be overkill. But I think we should be careful to limit the scope, so that other e2e frameworks can take over from the limits of this system.

@alpe
Contributor Author

alpe commented Jul 18, 2023

I would like to cover the chain upgrade use case fully by a Go system test before making this more extendable. In the best case the server side can be easily swapped in the future, but we have a concrete problem now and a goal to reduce time to market for wasmd and the SDK.

I had outlined the basic idea in the description already but some concrete steps for the system test in Go would be:

  1. Preparation - fetch:
  • legacy wasmd binary version n (from GH release/ circleCI archive)
  • state dump genesis for wasmd (some repo/ circleCI archive/ URL/ ???) that works with this binary n
  2. Modify genesis for system under test
  • set up valset and validator accounts
  • set up a minimal voting period for gov
  3. Launch multi node chain with wasmd binary version n and genesis
  4. Deploy test contract
  5. for i:=1; i <= current version; i++
    5a) block x := current + buffer
    5b) submit gov proposal and votes for chain upgrade n+i at block x
    5c) stop chain at block x + 1
    5d) restart chain with new binary n + i
  6. Smoke tests
  • ensure smart queries do still work
  • ensure example contract can be executed
  • ???
The process focuses on the happy path only. A multi-node setup is preferred to help catch non-deterministic behaviour on the migration.
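
The upgrade loop (steps 5a to 5d) could be sketched in Go roughly as follows. The `Chain` interface and the `DryRunChain` fake are hypothetical and exist only for this illustration; a real implementation would drive the nodes and the CLI as a black box.

```go
package main

import "fmt"

// Chain is a hypothetical black-box handle on the multi-node system under test.
type Chain interface {
	Height() (int64, error)
	SubmitUpgradeAndVote(version string, height int64) error
	StopAt(height int64) error
	Restart(binary string) error
}

// RunUpgrades applies the given versions in order, one chain upgrade each.
func RunUpgrades(c Chain, versions []string, buffer int64) error {
	for _, v := range versions {
		current, err := c.Height()
		if err != nil {
			return fmt.Errorf("query height: %w", err)
		}
		x := current + buffer // 5a) leave room for the voting period
		if err := c.SubmitUpgradeAndVote(v, x); err != nil { // 5b)
			return fmt.Errorf("propose upgrade %s: %w", v, err)
		}
		if err := c.StopAt(x + 1); err != nil { // 5c)
			return fmt.Errorf("stop chain for %s: %w", v, err)
		}
		if err := c.Restart(v); err != nil { // 5d)
			return fmt.Errorf("restart with %s: %w", v, err)
		}
	}
	return nil
}

// DryRunChain is an in-memory fake that only records what would happen.
type DryRunChain struct {
	CurrentHeight int64
	Log           []string
}

func (d *DryRunChain) Height() (int64, error) { return d.CurrentHeight, nil }

func (d *DryRunChain) SubmitUpgradeAndVote(version string, height int64) error {
	d.Log = append(d.Log, fmt.Sprintf("propose %s at %d", version, height))
	return nil
}

func (d *DryRunChain) StopAt(height int64) error {
	d.CurrentHeight = height // the fake jumps straight to the stop height
	d.Log = append(d.Log, fmt.Sprintf("stop at %d", height))
	return nil
}

func (d *DryRunChain) Restart(binary string) error {
	d.Log = append(d.Log, "restart with "+binary)
	return nil
}
```

Keeping the loop behind an interface would let the ordering of the steps be checked with the fake before any real binaries or state dumps are available.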

@Reecepbcups

Reecepbcups commented Jul 18, 2023

yea this is exactly what we do for Juno with ictest already (SDK 45 -> v47, 4 vals + 4 full nodes).

Only would have to add a genesisDump() method at the end and add contract executes & wasmd cli queries

already auto builds latest every commit to a docker repo

if you guys change your mind (from System go) just let me know :)

https://github.com/CosmosContracts/juno/blob/main/interchaintest/chain_upgrade_test.go

@alpe alpe modified the milestone: v0.51 Oct 18, 2023
@alpe
Contributor Author

alpe commented Oct 18, 2023

Testing a chain upgrade from a rich state dump is still open. The problem here is staying vendor neutral and storing the dump. Chains should have their own tooling in place to verify their migrations.
Nevertheless, we can revisit this when we have another wasmd testnet.

@alpe alpe closed this as completed Oct 18, 2023

3 participants