Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support new Lotus API + calibration network #529

Merged
merged 25 commits into from
Aug 11, 2020
Merged

Conversation

jsign
Copy link
Contributor

@jsign jsign commented Jul 20, 2020

This PR targets the ntwk-calibration Lotus branch, which is a newer codebase with breaking APIs, this means it can't coexist supporting testnet nor current nerpa. ntwk-calibration code will be the next nerpa network.

The command make up now prints a message saying that current Testnet isn't supported since is still running a Lotus version with old broken APIs. Whenever is updated we can re-enable that. The message suggests running make nerpa-up so to help the user connect to some network.


Apart from changes in this repo, there were many changes in others. In particular, upgrading textileio/lotus-devnet and textileio/lotus-build. In the process of upgrading lotus-devnet which is a pillar of the CI, we've discovered multiple bugs which ended up being resolved: e.g: here, here, and here.

Considering that lotus@master is still targeting testnet, I made textileio/lotus-devnet have multiple branches per network. So, the newest one which Powergate is using now is lotus-devnet@ntwk-calibration which is the localnet running the latest code of lotus. Whenever master has the latest code, we might abandon that branch.


While updating Powergate to the latest Lotus APIs, I took advantage and did other changes:

  • Avoid panicking scenarios in test favoring test failing.
  • Gas limits were unset in sending txn since now new logic about gas-limits exists in Lotus. This means that max-gas and gas-price will be estimated by the Lotus node using stat data.
  • New Lotus code for localnet seems heavier. I split the integration tests in multiple packages to we can cache passing results better to rerun if some flaky-test fails. The CI is a bit painful sometimes. I also created a helper artifact that retries a test N times before failing, this reduced a lot flakyness and need to rerun.
  • As a side note, the localnet with bigsectors is very unstable because some extra changes are needed in sector-storage. For more info see this and the mentioned PR there. I guess eventually that would be totally fixed. Not much to do now.
  • The network has defined now a minimum deal duration which is 180 days. This is now a constant for Powergate that is used to check deal duration validations.
  • I changed the chain speed for localnet to be faster since now deals takes more epoch to finish. Now with one block per 500ms, they take less than 2 min (with big sectors).

@jsign jsign added this to the Sprint 41 milestone Jul 20, 2020
@jsign jsign self-assigned this Jul 20, 2020
@jsign jsign changed the title Support Butterfly network Support next nerpa network Jul 22, 2020
@jsign jsign force-pushed the jsign/networkbutter branch 3 times, most recently from d93c590 to 1077b6b Compare July 23, 2020 13:36
@jsign jsign force-pushed the jsign/networkbutter branch 2 times, most recently from 20c333b to c72b696 Compare July 30, 2020 17:24
@jsign jsign modified the milestones: Sprint 41, Sprint 44 Jul 30, 2020
@jsign jsign modified the milestones: Sprint 44, Sprint 42 Jul 31, 2020
@jsign jsign force-pushed the jsign/networkbutter branch 5 times, most recently from cb64afc to 76252c6 Compare August 6, 2020 17:36
@jsign jsign force-pushed the jsign/networkbutter branch 2 times, most recently from 7a12351 to 76430af Compare August 8, 2020 20:57
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
@jsign jsign changed the title Support next nerpa network Support new Lotus API + calibration network Aug 10, 2020
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
@jsign jsign modified the milestones: Sprint 42, Sprint 43 Aug 11, 2020
Comment on lines -130 to +131
DealMinDuration: 1000,
DealMinDuration: util.MinDealDuration,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now min deal duration isn't arbitrary for Filecoin, so it's a constant.

Comment on lines +103 to +105
if minDuration < util.MinDealDuration {
return nil, fmt.Errorf("duration %d should be greater or equal to %d", minDuration, util.MinDealDuration)
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Early check that the duration is >= the minimum defined by spec. If it isn't, fail fast.

@@ -116,7 +119,8 @@ func (m *Module) Store(ctx context.Context, waddr string, dataCid cid.Cid, piece
}
params := &api.StartDealParams{
Data: &storagemarket.DataRef{
Root: dataCid,
TransferType: storagemarket.TTGraphsync,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Necessary to have deals working in new Lotus version.

require.NoError(t, err)
require.Len(t, final, nm)
t.Parallel()
tests.RunFlaky(t, func(t *tests.FlakyT) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests.RunFlaky is a helper that runs a tests and consider that fails if fails 5 times in a row.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome

@@ -0,0 +1,279 @@
package config
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integration tests got partitioned in their own packages. This to allow if some test fail, only invalidate a subset of them in the test cache. This was more relevant before I did the tests.RunFlaky thing, but I kept this change since provides also better organization; the integration test file was very big.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -0,0 +1,65 @@
package tests
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The FlakyT helper I created to provide some retry at the integration test level.
Might be useful in other projects.

// LaunchDevnetDocker launches the devnet docker image.
func LaunchDevnetDocker(t *testing.T, numMiners int, ipfsMaddr string, mountVolumes bool) *dockertest.Resource {
func LaunchDevnetDocker(t TestingTWithCleanup, numMiners int, ipfsMaddr string, mountVolumes bool) *dockertest.Resource {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This *testing.T to TestinTWithCleanup replacement happened in some places.
*testing.T is a concrete type, and we want LaunchDevnetDocker to work with *testing.T but also work with our "FlakyT" runner. If things would still use *testing.T and call Fail() that would mess up retries.

TestingTWithCleanup has the same interface as *testing.T so the user shouldn't be worried about anything extra. Just do whatever usually would do, and FlakyT will do whatever it needs.

if err != nil {
panic(fmt.Sprintf("couldn't create ipfs-pool: %s", err))
}
require.NoError(t, err)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestingTWithCleanup is compatible with require, so can be used as "nothing happened".

Comment on lines -91 to +93
From: m.masterAddr,
To: addr,
Value: types.BigInt{Int: m.iAmount},
GasLimit: 1000,
GasPrice: types.NewInt(0),
From: m.masterAddr,
To: addr,
Value: types.BigInt{Int: m.iAmount},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot happened here while tracking different releases of Lotus code.
To make the story short: this way the Lotus node will decide the most reasonable Gas price considering recent history of gas prices.

@@ -1,19 +1,10 @@
COMPAT_MSG="The current version of Powergate support newer versions than Testnet network. You might be interested in using the Calibration network (make calibration-up)"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some message that gets printed if make up is executed. Will eventually go away when Testnet resets to latest code.

@jsign jsign marked this pull request as ready for review August 11, 2020 13:00
@jsign jsign requested a review from asutula August 11, 2020 13:00
Copy link
Member

@asutula asutula left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of changes, but it all makes sense. Probably the crazier changes were in the other repos we depend on here. Nice work.

require.NoError(t, err)
require.Len(t, final, nm)
t.Parallel()
tests.RunFlaky(t, func(t *tests.FlakyT) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome

@@ -0,0 +1,279 @@
package config
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@jsign
Copy link
Contributor Author

jsign commented Aug 11, 2020

Lots of changes, but it all makes sense. Probably the crazier changes were in the other repos we depend on here. Nice work.

You can be totally sure about that 😅

@jsign jsign merged commit 5ee1531 into master Aug 11, 2020
@jsign jsign deleted the jsign/networkbutter branch August 11, 2020 16:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants