Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bonds for Fortified DSMR #1835

Merged
merged 16 commits into from
Dec 23, 2024
Merged

Bonds for Fortified DSMR #1835

merged 16 commits into from
Dec 23, 2024

Conversation

joshua-kim
Copy link
Contributor

@joshua-kim joshua-kim commented Dec 16, 2024

Implements account bonding for DSMR. HyperSDK will implement the Bonder interface which will keep track of account bond balances.

  1. Txs are bonded when a node builds a tx into a chunk
  2. Txs are un-bonded when the node either accepts a tx that it was tracking, or when the tx expiry passes.

This PR includes a few changes to the dsmr package to implement this, namely:

  1. Test code common across dsmr and fdsmr are moved to dsmrtest
  2. Accept now returns the outputted block with chunk details, because these details (i.e txs) are needed to unbond txs in fdsmr.
  3. Because Accept returns the outputted block, we no longer need BuildChunk to return chunks/chunk certs to the caller, which was previously exposed for testing (bad practice).

@joshua-kim joshua-kim force-pushed the bond branch 9 times, most recently from ae3d7a3 to a41b0a8 Compare December 18, 2024 05:44
@joshua-kim joshua-kim marked this pull request as ready for review December 18, 2024 05:44
Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
@@ -118,9 +126,9 @@ func (n *Node[T]) BuildChunk(
txs []T,
expiry int64,
beneficiary codec.Address,
) (Chunk[T], ChunkCertificate, error) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was so gross previously (and I'm to blame...), we previously exposed the chunk/chunk certs that were created into the caller. Realistically this is just leaking implementation details because hypersdk would never care about the chunks that were built - since the mempool/consensus only cares about blocks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw - I think it's fine to return these values or not export this function (could just expose Start(ctx, ...) that continuously builds chunks), but also fine to return the value if it makes writing tests easier

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The caller should only rely on some way to make sure chunks are continuously produced (assuming it's a validator node) until shutdown

Comment on lines -983 to -992

for _, chunk := range wantChunks {
found := false
for _, chunkCert := range blk.ChunkCerts {
if chunkCert.ChunkID == chunk.id {
found = true
}
}
r.True(found)
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't check the chunk ids (and just chunk # of chunks now) because the caller shouldn't care what the chunk id is (as it just provides txs to build into a chunk).

Height: parent.Height + 1,
Timestamp: parent.Timestamp + 1,
},
ChunkCerts: parent.ChunkCerts,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going over this - I think this is either weird/or a bug that we allow duplicate chunk certs to pass verification. For now i'm doing this because it passes tests and it's convenient - but probably something to fix as a follow-up.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bug fixed in #1810. We should not allow duplicates either in a block or across the ancestry.

@@ -118,9 +126,9 @@ func (n *Node[T]) BuildChunk(
txs []T,
expiry int64,
beneficiary codec.Address,
) (Chunk[T], ChunkCertificate, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw - I think it's fine to return these values or not export this function (could just expose Start(ctx, ...) that continuously builds chunks), but also fine to return the value if it makes writing tests easier

@@ -118,9 +126,9 @@ func (n *Node[T]) BuildChunk(
txs []T,
expiry int64,
beneficiary codec.Address,
) (Chunk[T], ChunkCertificate, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The caller should only rely on some way to make sure chunks are continuously produced (assuming it's a validator node) until shutdown

x/dsmr/node.go Outdated
Comment on lines 50 to 55
type Interface[T Tx] interface {
BuildChunk(ctx context.Context, txs []T, expiry int64, beneficiary codec.Address) error
BuildBlock(parent Block, timestamp int64) (Block, error)
Verify(ctx context.Context, parent Block, block Block) error
Accept(ctx context.Context, block Block) (ExecutedBlock[T], error)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale for defining the interface this way? Asking because we've previously discussed removing defining interfaces over a single implementation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this last night and originally felt this made sense so that we could keep wrapping dsmr with more features, but in hindsight maybe it makes more sense for this to live in the fdsmr package and say that dsmr implements it (i.e dsmr is fortifiable).

Another idea is to make fdsmr rely on the concrete type of dsmr since we only have one implementation but this makes testing look weirder.

x/fdsmr/node.go Show resolved Hide resolved
x/fdsmr/node.go Outdated
Comment on lines 43 to 46
if len(txs) == 0 {
return n.DSMR.BuildChunk(ctx, txs, expiry, beneficiary)
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove this check since it will be handled the same way (slice of size 0 and skip the loop) regardless of if it's here?

Copy link
Contributor Author

@joshua-kim joshua-kim Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a very subtle difference... this is to handle this test case

{
name: "nil txs",
bonder: testBonder{},
},
{
name: "empty txs",
bonder: testBonder{},
txs: []dsmrtest.Tx{},
wantTxs: []dsmrtest.Tx{},
},
. We currently guarantee that nil txs are wired into the underlying dsmr as nil as well (instead of a zero-length non-nil slice). This is probably overkill but I thought it made sense to be consistent w/ the caller value.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to treat these the same and require the callee handles them the same as well instead of making sure to guarantee it here

x/fdsmr/node.go Outdated Show resolved Hide resolved
x/fdsmr/node.go Outdated
Comment on lines 115 to 118
if !b.pending.Contains(tx.GetID()) {
// this tx was either not assigned to us or was already un-bonded
return
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When do we expect to hit this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might not understand something but this is because we call Unbond on every tx in the block and only call the underlying bonder if it was a tx that we knew about. I think this is needed because we're not guaranteed that every tx in a block was handled by us.

Height: parent.Height + 1,
Timestamp: parent.Timestamp + 1,
},
ChunkCerts: parent.ChunkCerts,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bug fixed in #1810. We should not allow duplicates either in a block or across the ancestry.

x/fdsmr/node_test.go Outdated Show resolved Hide resolved
Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
aaronbuchwald
aaronbuchwald previously approved these changes Dec 20, 2024
x/fdsmr/node.go Outdated
Comment on lines 43 to 46
if len(txs) == 0 {
return n.DSMR.BuildChunk(ctx, txs, expiry, beneficiary)
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to treat these the same and require the callee handles them the same as well instead of making sure to guarantee it here

x/fdsmr/node.go Outdated Show resolved Hide resolved
x/fdsmr/node.go Outdated Show resolved Hide resolved
Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
@joshua-kim joshua-kim enabled auto-merge (squash) December 23, 2024 17:51
Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
x/fdsmr/node.go Outdated
}
}

n.pending.SetMin(block.Timestamp)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SetMin returns the elements that were removed, could we use that instead of manually peeking/popping the elements to check if we need to call Unbond ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DIdn't know this return value existed, pretty nice!

Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
bonded = append(bonded, tx)
}

return n.DSMR.BuildChunk(ctx, bonded, expiry, beneficiary)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(not necessary for this PR) - is BuildChunk returning an error considered a fatal error? If not, we need to unbond any transactions that were bonded before building the chunk failed

@aaronbuchwald aaronbuchwald merged commit c371a78 into main Dec 23, 2024
17 checks passed
@aaronbuchwald aaronbuchwald deleted the bond branch December 23, 2024 22:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants