perf: mpool: Implement transactionLk to ensure consistent state access in MessagePool #10865

snissn · 2023-05-12T21:08:02Z

Title: Refine MessagePool Locking Mechanism to Optimize RPC Calls Performance

Description:
This commit introduces significant refinements to the locking mechanism in the MessagePool. The primary objective is to reduce the duration that locks impede RPC calls from accessing data, thereby enhancing the overall performance of various MessagePool operations.

Key Changes include:

1. Introduction of a new lock, transactionLk. This lock plays a crucial role in improving concurrency control in the MessagePool. It works in conjunction with the existing lock (previously known as curTsLk) to ensure proper synchronization when accessing both pending messages and current tipset related data.
2. The existing lock, curTsLk, has been renamed to stateLk, to better reflect its purpose. This lock's usage has been refined to optimize its performance.
3. Locking and unlocking procedures have been updated in several functions, including but not limited to CheckMessages, CheckPendingMessages, CheckReplaceMessages, checkMessages, New, TryForEachPendingMessage, addTs, addLoaded, addSkipChecks, GetNonce, Pending, PendingFor, HeadChange, Clear, pruneExcessMessages, and republishPendingMessages.

The integration of these changes aims to improve the efficiency of the MessagePool, particularly in terms of handling RPC calls, and to boost the overall performance of the system.
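
As a rough illustration of the two-lock arrangement described above, here is a minimal sketch. The `pool` type, its fields other than the two lock names, and the `add`/`nonce` methods are hypothetical stand-ins, not the actual MessagePool code. It assumes that writers take `transactionLk` first and then `stateLk`, and that read-only accessors can get by with a read lock on `stateLk`; which real methods fall into which group is not claimed here.

```go
package main

import (
	"fmt"
	"sync"
)

// pool is a hypothetical, stripped-down stand-in for MessagePool.
type pool struct {
	// transactionLk serializes whole read-modify-write operations.
	transactionLk sync.Mutex
	// stateLk guards pending (and, in the real pool, the current tipset).
	stateLk sync.RWMutex
	pending map[string]uint64 // sender -> next nonce (illustrative only)
}

func newPool() *pool {
	return &pool{pending: make(map[string]uint64)}
}

// add is a writer: it takes transactionLk first, then stateLk.
// Keeping the same order everywhere avoids lock-order deadlocks.
func (p *pool) add(from string) uint64 {
	p.transactionLk.Lock()
	defer p.transactionLk.Unlock()

	p.stateLk.Lock()
	defer p.stateLk.Unlock()

	n := p.pending[from]
	p.pending[from] = n + 1
	return n
}

// nonce is a read-only accessor: it only needs a read lock on stateLk,
// so it does not queue behind a writer that merely holds transactionLk.
func (p *pool) nonce(from string) uint64 {
	p.stateLk.RLock()
	defer p.stateLk.RUnlock()
	return p.pending[from]
}

func main() {
	p := newPool()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			p.add("f01")
		}()
	}
	wg.Wait()
	fmt.Println(p.nonce("f01")) // prints 100
}
```

The point of the split is that a long-running writer can hold `transactionLk` for its whole operation while holding `stateLk` only for the short stretches where it actually touches shared state.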

Related Issues

squashed from #10833

chain/messagepool/check.go

chain/messagepool/messagepool.go

fridrik01 · 2023-05-15T17:29:55Z

chain/messagepool/messagepool.go

+	//
+	// mp.stateLk.Lock()
+	// defer mp.stateLk.Unlock()
+	stateLk sync.RWMutex


It would simplify the code a lot IMO if we could merge the transactionLk/stateLk into one lock or lock object, is that something that you considered?

@snissn What was the resolution to this?

I think this was commented on before @fridrik01 got further into reading the PR, and there was a subsequent comment from him clarifying. It is necessary to have two locks for this PR.

chain/messagepool/messagepool.go

Co-authored-by: Friðrik Ásmundsson <fridrik01@gmail.com>

arajasek

Few more questions after earlier review in #10833

arajasek · 2023-05-29T16:05:32Z

chain/messagepool/messagepool.go

@@ -449,8 +470,8 @@ func New(ctx context.Context, api Provider, ds dtypes.MetadataDS, us stmgr.Upgra
 }

 func (mp *MessagePool) ForEachPendingMessage(f func(cid.Cid) error) error {
-	mp.lk.Lock()
-	defer mp.lk.Unlock()
+	mp.transactionLk.Lock()


I think it's sufficient to Lock mp.stateLk here, since we just need pending to not change under our feet?

The transaction lock must be held to safely make changes to state, so any change to the state has to be wrapped in it. Hope that makes sense! Also, the protector is the caller of this function, and I am fairly sure it wants write access rather than read access, so I used a write lock here.

arajasek · 2023-05-29T16:14:05Z

chain/messagepool/messagepool.go

+	defer mp.transactionLk.Unlock()
+
+	mp.stateLk.RLock()
+	err := mp.checkMessage(ctx, m)


I think we can refactor checkMessage to not need to reference curTs at all -- I can push up a commit that does that, and I think in that case we can change this to only take the stateLk after having called checkMessage.

That sounds great to me! Have you gotten to that yet? Should we set up a new ticket for that?
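
To make the suggestion above concrete, here is a minimal sketch of what "only take the stateLk after having called checkMessage" could look like once the check no longer reads the current tipset. Everything here is hypothetical: the `message` type, the `maxSize` limit, and the rules inside `checkMessage` are placeholders, and `transactionLk` is left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// message is a hypothetical stand-in for types.SignedMessage.
type message struct {
	from string
	size int
}

const maxSize = 64 // illustrative limit, not a real protocol constant

// checkMessage is stateless: it inspects only the message itself,
// so it can run without holding any pool lock.
func checkMessage(m message) error {
	if m.from == "" {
		return errors.New("missing sender")
	}
	if m.size > maxSize {
		return errors.New("message too large")
	}
	return nil
}

type pool struct {
	stateLk sync.RWMutex
	pending []message
}

// add validates first and takes stateLk only for the actual mutation.
func (p *pool) add(m message) error {
	if err := checkMessage(m); err != nil {
		return err
	}

	p.stateLk.Lock()
	defer p.stateLk.Unlock()
	p.pending = append(p.pending, m)
	return nil
}

func main() {
	p := &pool{}
	fmt.Println(p.add(message{from: "f01", size: 10}))
	fmt.Println(p.add(message{from: "f01", size: 1000}))
}
```

Rejected messages never touch `stateLk` in this shape, which is the benefit the suggestion is after.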

arajasek · 2023-05-29T16:18:45Z

chain/messagepool/messagepool.go

-func (mp *MessagePool) addTs(ctx context.Context, m *types.SignedMessage, curTs *types.TipSet, local, untrusted bool) (bool, error) {
+func (mp *MessagePool) addTs(ctx context.Context, m *types.SignedMessage, local, untrusted bool) (bool, error) {
+	//ensures that we have a consistent view of the state
+	mp.transactionLk.Lock()


It's not immediately clear to me why we need the transactionLk here, and in some of the subsequent methods, since we aren't modifying curTs. I can totally believe it's necessary, but can you please talk me through it?

Hey! That's a good question! When the tipset changes, the message pool is pruned to remove any messages that were included in the new blocks:

// caller must hold transactionLk and stateLk
func (mp *MessagePool) headChange(ctx context.Context, revert []*types.TipSet, apply []*types.TipSet) error {
	repubTrigger := false
	rmsgs := make(map[address.Address]map[uint64]*types.SignedMessage)
	add := func(m *types.SignedMessage) {
		s, ok := rmsgs[m.Message.From]
		if !ok {
			s = make(map[uint64]*types.SignedMessage)
			rmsgs[m.Message.From] = s
		}
		s[m.Message.Nonce] = m
	}
	rm := func(from address.Address, nonce uint64) {
		s, ok := rmsgs[from]
		if !ok {
			mp.remove(ctx, from, nonce, true)
			return
		}

		if _, ok := s[nonce]; ok {
			delete(s, nonce)
			return
		}

		mp.remove(ctx, from, nonce, true)
	}

	...

	for _, ts := range apply {
		mp.curTs = ts

		for _, b := range ts.Blocks() {
			bmsgs, smsgs, err := mp.api.MessagesForBlock(ctx, b)
			if err != nil {
				xerr := xerrors.Errorf("failed to get messages for apply block %s(height %d) (msgroot = %s): %w", b.Cid(), b.Height, b.Messages, err)
				log.Errorf("error retrieving messages for block: %s", xerr)
				merr = multierror.Append(merr, xerr)
				continue
			}

			for _, msg := range smsgs {
				rm(msg.Message.From, msg.Message.Nonce)
				maybeRepub(msg.Cid())
			}

			for _, msg := range bmsgs {
				rm(msg.From, msg.Nonce)
				maybeRepub(msg.Cid())
			}
		}
	}

We want to wrap changes to the tipset and/or the messages in both the transaction lock and the state lock. A tipset change also modifies the message pool, so any change to the message pool has to be coordinated with tipset changes.
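
The reasoning above can be sketched with a toy pool. This is an assumption-laden illustration, not the PR's code: `height` stands in for `curTs`, `minHeight` is an invented validity rule, and the `add` method is hypothetical. The idea is that `add` validates against the head and then inserts in two separate `stateLk` sections, and `transactionLk` is what stops `headChange` from running between those two sections.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// msg is a hypothetical stand-in for a signed message.
type msg struct {
	from  string
	nonce uint64
}

type pool struct {
	transactionLk sync.Mutex
	stateLk       sync.RWMutex
	height        int // stand-in for curTs
	pending       map[msg]struct{}
}

// headChange moves the head and prunes included messages.
// It takes transactionLk first, then stateLk.
func (p *pool) headChange(newHeight int, included []msg) {
	p.transactionLk.Lock()
	defer p.transactionLk.Unlock()

	p.stateLk.Lock()
	defer p.stateLk.Unlock()

	p.height = newHeight
	for _, m := range included {
		delete(p.pending, m)
	}
}

// add checks the message against the head, then inserts it.
// stateLk is released between the two steps so readers can get in,
// while transactionLk keeps headChange out until add has finished.
func (p *pool) add(m msg, minHeight int) error {
	p.transactionLk.Lock()
	defer p.transactionLk.Unlock()

	p.stateLk.RLock()
	ok := p.height >= minHeight
	p.stateLk.RUnlock()
	if !ok {
		return errors.New("message not valid at current head")
	}

	p.stateLk.Lock()
	p.pending[m] = struct{}{}
	p.stateLk.Unlock()
	return nil
}

func main() {
	p := &pool{pending: make(map[msg]struct{})}
	m := msg{from: "f01", nonce: 0}
	fmt.Println(p.add(m, 0))
	p.headChange(1, []msg{m})
	fmt.Println(len(p.pending))
}
```

Without `transactionLk`, a head change could land after the validity check but before the insert, leaving a message in the pool that was checked against a stale head.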

chain/messagepool/messagepool.go

arajasek · 2023-05-29T16:24:53Z

chain/messagepool/pruning.go

-	mp.curTsLk.Lock()
-	ts := mp.curTs
-	mp.curTsLk.Unlock()
+	mp.transactionLk.Lock()


Also not clear to me that we need transactionLk here, but unsure

Same as above: we are changing the message pool here, so we take both the transaction and state locks, as we do whenever either the tipset or the messages change.

Stebalien

While this does simplify things, it also significantly increases the granularity of some of the locks. Are we sure it improves performance?

Stebalien · 2023-06-20T18:59:45Z

chain/messagepool/messagepool.go

+		mp.transactionLk.Lock()
+		defer mp.transactionLk.Unlock()
+
+		mp.stateLk.Lock()
+		defer mp.stateLk.Unlock()
+


We will block sync entirely waiting on both of these locks. Is that fine?

Stebalien · 2023-06-20T19:00:16Z

chain/messagepool/check.go

+	mp.stateLk.RLock()
+	defer mp.stateLk.RUnlock()


This locks a much wider range. Are we sure that's a good idea?
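
For readers weighing this concern, the trade-off can be sketched as follows. Both methods and the `pool` type are hypothetical; `head` stands in for the current tipset. `checkWide` mirrors the deferred `RUnlock` shape, while `checkNarrow` copies the value under the lock and releases it before doing slow work. The narrow form only gives a snapshot, so it is appropriate only if the slow work does not need the head to stay fixed.

```go
package main

import (
	"fmt"
	"sync"
)

type pool struct {
	stateLk sync.RWMutex
	head    int // stand-in for the current tipset
}

func (p *pool) setHead(h int) {
	p.stateLk.Lock()
	defer p.stateLk.Unlock()
	p.head = h
}

// checkWide holds the read lock for the whole call, including slow work.
// Writers wait until slow returns.
func (p *pool) checkWide(slow func(head int) bool) bool {
	p.stateLk.RLock()
	defer p.stateLk.RUnlock()
	return slow(p.head)
}

// checkNarrow snapshots head under the lock and releases it before the
// slow work, so writers are blocked only for the duration of the copy.
func (p *pool) checkNarrow(slow func(head int) bool) bool {
	p.stateLk.RLock()
	head := p.head
	p.stateLk.RUnlock()
	return slow(head)
}

func main() {
	p := &pool{}
	// A writer can make progress while checkNarrow's slow work runs.
	// Doing the same inside checkWide would deadlock, because the
	// write lock cannot be taken while the read lock is still held.
	ok := p.checkNarrow(func(h int) bool {
		p.setHead(h + 1)
		return true
	})
	fmt.Println(ok, p.head)
}
```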

snissn requested a review from a team as a code owner May 12, 2023 21:08

snissn mentioned this pull request May 12, 2023

perf: mpool: Implement transactionLk to ensure consistent state access in MessagePool #10833

Closed

7 tasks

fridrik01 reviewed May 15, 2023

snissn and others added 3 commits May 24, 2023 11:07

Update chain/messagepool/messagepool.go

26c315c

Co-authored-by: Friðrik Ásmundsson <fridrik01@gmail.com>

transactionLk can be a Mutex and lk is no longer used

b0f3311

Messagepool: Use latest network version when determining msg validity

50274ff

arajasek reviewed May 29, 2023

snissn added 5 commits June 7, 2023 10:57

add comment caller must hold lock

1b15634

Merge branch 'mikers/mpool/txLock3' of github.com:filecoin-project/lotus into mikers/mpool/txLock3

4437e75

fmt

319a3a7

Merge branch 'master' into mikers/mpool/txLock3

6ee0862

remove mp.lk call, must have regressed from merging master

2ddb26d

Stebalien reviewed Jun 20, 2023

rjan90 mentioned this pull request Sep 5, 2023

Monitor performance of MpoolSelect #11233

Closed
