Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Address resubmit bug #6886

Merged
merged 5 commits into from
Jul 19, 2019
Merged

Address resubmit bug #6886

merged 5 commits into from
Jul 19, 2019

Conversation

danfinlay
Copy link
Contributor

Added some notes, and one possible approach at a fix, for discussion.

@metamaskbot
Copy link
Collaborator

Builds ready [10268fc]: chrome, firefox, edge, opera

@frankiebee frankiebee added the DO-NOT-MERGE Pull requests that should not be merged label Jul 19, 2019
@metamaskbot
Copy link
Collaborator

Builds ready [ce1784c]: chrome, firefox, edge, opera

@metamaskbot
Copy link
Collaborator

Builds ready [c7f0405]: chrome, firefox, edge, opera

@metamaskbot
Copy link
Collaborator

Builds ready [5c66c72]: chrome, firefox, edge, opera

@metamaskbot
Copy link
Collaborator

Builds ready [21aac10]: chrome, firefox, edge, opera

@danfinlay
Copy link
Contributor Author

danfinlay commented Jul 19, 2019

A user submitted a transaction from their MetaMask, and found it resulted in 20 nearly-identical transactions (apart from the nonce).

In their state logs, those 20 transactions were one, but in that one transaction’s history we could see:

  1. Its initialization
  2. A series of 20 empty arrays.
  3. A series of nonce increments, signatures, and submission attempts.

So it seemed clear that MetaMask had in fact submitted these transactions.

Upon reviewing the transaction controller, Frankie identified the approveTransaction function as a point of interest.

This function performs three critical and related operations:

  • Increment nonce
  • Sign
  • Publish

She also noted that this function is called by the pending-tx-tracker in one place:

if (!('rawTx' in txMeta)) return this.approveTransaction(txMeta.id)

This resubmitTx function is called upon each new block, and because of our “network silent mode”, we call this function repeatedly in a loop when reconnecting to the internet and “catching up” to the current block.

This theory explains the mysterious chain of 23 empty arrays, because the approveTransaction function begins by setting the transaction status to approved, an operation that shows up as an empty array in the history when the tx status is already approved.

When we review the approveTransaction function in this context, it becomes clear that it is filled with asynchronous operations, like getting a nonce, and getting a signature, and publishing, which should not be done while it is being done with another copy of the same transaction.

One question that came up, is why can the pending-tx-tracker sign transactions? It should only possess signed transactions, and should only need network access. The answer was found here:


this.pendingTxTracker = new PendingTransactionTracker({
provider: this.provider,
nonceTracker: this.nonceTracker,
publishTransaction: (rawTx) => this.query.sendRawTransaction(rawTx),
getPendingTransactions: () => {
const pending = this.txStateManager.getPendingTransactions()
const approved = this.txStateManager.getApprovedTransactions()
return [...pending, ...approved]
},
approveTransaction: this.approveTransaction.bind(this),
getCompletedTransactions: this.txStateManager.getConfirmedTransactions.bind(this.txStateManager),
})

It turns out we began passing it approved transactions to mitigate an issue where a system interruption could leave “approved” transactions hanging, never getting signed.

I proposed the simplest solution I could, which was a very simple lock around the approveTransaction function, which ensures it cannot process on a transaction that it is already processing on. I believe this is a simple fix that will alleviate the danger at hand.

There is still room for a better cleanup of this code. The un-sticking of transactions that need to be signed could be broken into its own function, which becomes the part that requires locking, while the pendingTxTracker would no longer need access to a signing method.

That said, this is a scary bug that could result in much worse loss than it did, and so I propose we merge this quick fix for now, and can submit a cleaner fix later.

@danfinlay
Copy link
Contributor Author

In retrospect, it makes sense that this bug was rare, because it required some kind of network interruption in the moment between user approval and transaction signing. It then required the user to regain network access, and MetaMask to rapidly attempt to catch up with blocks all before the transaction's initial nonce could be calculated.

@danfinlay danfinlay marked this pull request as ready for review July 19, 2019 01:48
@danfinlay danfinlay requested a review from frankiebee as a code owner July 19, 2019 01:48
@danfinlay danfinlay removed the DO-NOT-MERGE Pull requests that should not be merged label Jul 19, 2019
Copy link
Contributor

@frankiebee frankiebee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is good as a hot fix

@frankiebee
Copy link
Contributor

would still like one other person to review if possible

@danfinlay danfinlay mentioned this pull request Jul 19, 2019
Copy link
Member

@Gudahtt Gudahtt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent writeup! Seems like a pretty reasonable solution for the time being.

@danfinlay danfinlay merged commit aea54d1 into develop Jul 19, 2019
@danfinlay danfinlay deleted the Fix841034 branch July 19, 2019 03:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants