fix: stop bot in case of tx errors #9421

Merged
merged 12 commits into from
Oct 30, 2024

Conversation

alexghr
Contributor

@alexghr alexghr commented Oct 25, 2024

This PR contains a couple of fixes for the network occasionally stopping when run for a long period of time:

  • It restarts the bot when the chain gets pruned. This is necessary because the PXE does not react to reorgs: when one happens, the bot's PXE database is deleted and the bot starts again.
  • It handles reorgs in the p2p-client so that it is able to consume the new chain after a prune.
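
For readers skimming the description, here is a minimal sketch of how a prune could be detected so that a restart can be triggered. It is not the code from this PR: `PruneWatcher`, `observe` and the `onPrune` callback are hypothetical names, and the sketch assumes that a prune shows up as the chain tip moving backwards between two observations.

```typescript
// Hypothetical helper, not part of the PR: flags a prune when the observed
// chain tip is lower than the one seen previously.
class PruneWatcher {
  private lastSeen = 0;
  private readonly onPrune: (from: number, to: number) => void;

  // onPrune is an assumed hook, e.g. "delete the PXE database and restart the bot".
  constructor(onPrune: (from: number, to: number) => void) {
    this.onPrune = onPrune;
  }

  // Feed the latest block number; returns true if a prune was detected.
  observe(blockNumber: number): boolean {
    const pruned = blockNumber < this.lastSeen;
    if (pruned) {
      this.onPrune(this.lastSeen, blockNumber);
    }
    this.lastSeen = blockNumber;
    return pruned;
  }
}

// Usage: record each prune so the caller can restart afterwards.
const prunes: Array<[number, number]> = [];
const watcher = new PruneWatcher((from, to) => prunes.push([from, to]));
watcher.observe(5);
watcher.observe(8);
watcher.observe(3); // tip moved backwards: treated as a prune
```

In a real bot the callback would be asynchronous and the block number would come from the node; both are left out to keep the sketch short.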

@alexghr alexghr force-pushed the ag/restart-bot branch 2 times, most recently from 0a5d17d to cbbce08 on October 25, 2024 23:31
@alexghr alexghr added the e2e-all CI: Enables this CI job. label Oct 26, 2024
@@ -103,8 +105,8 @@ describe('In-Memory P2P Client', () => {
await client.start();
expect(client.isReady()).toEqual(true);

await client.stop();
expect(client.isReady()).toEqual(false);
// await client.stop();
Collaborator

Should this be commented out?

Contributor Author

Oh, I commented it out for debugging. Will restore.

yarn-project/p2p/src/client/p2p_client.ts (outdated)
@alexghr alexghr force-pushed the ag/restart-bot branch 2 times, most recently from ef7e044 to fe99e28 on October 28, 2024 13:51
}

if (!this.healthy && this.config.stopWhenUnhealthy) {
this.log.error(`Stopping bot due to errors`);
Collaborator

We need a fatal log level :-P

@@ -10,6 +10,8 @@ export class BotRunner {
private pxe?: PXE;
private node: AztecNode;
private runningPromise: RunningPromise;
private consecutiveErrors = 0;
private healthy = true;
Collaborator

Nit: I think we can do without the healthy property and just recompute it on the fly based on consecutiveErrors and config

Collaborator

Do we want to tackle this @alexghr?

Collaborator

To be fair it's a nitpick, fine for me if it goes untackled
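
For reference, the nit in this thread (dropping the stored `healthy` flag and deriving it on demand) could look roughly like the sketch below. `consecutiveErrors` and `stopWhenUnhealthy` appear in the diff; the `HealthTracker` class and the `maxConsecutiveErrors` option are assumptions made for illustration and may not match the real bot config.

```typescript
interface BotHealthConfig {
  // Assumed option: errors tolerated in a row; 0 or less disables the check.
  maxConsecutiveErrors: number;
  // Taken from the diff: stop the bot once it is unhealthy.
  stopWhenUnhealthy: boolean;
}

class HealthTracker {
  private consecutiveErrors = 0;
  private readonly config: BotHealthConfig;

  constructor(config: BotHealthConfig) {
    this.config = config;
  }

  recordSuccess(): void {
    this.consecutiveErrors = 0;
  }

  recordError(): void {
    this.consecutiveErrors += 1;
  }

  // Derived from the counter and config on every read, so there is no
  // separate flag that can drift out of sync.
  get healthy(): boolean {
    const max = this.config.maxConsecutiveErrors;
    return max <= 0 || this.consecutiveErrors < max;
  }

  get shouldStop(): boolean {
    return !this.healthy && this.config.stopWhenUnhealthy;
  }
}
```

The trade-off is small either way, which matches the thread's conclusion that this is only a nitpick.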

yarn-project/p2p/src/client/p2p_client.ts
Comment on lines 627 to 660
// everything left in the mined set was built against a block on the proven chain so its still valid
// move back to pending set
await this.txPool.markMinedAsPending(this.txPool.getMinedTxHashes());
Collaborator

Why are we moving all mined txs to pending? We should only move the ones that were mined in a block after latestBlock, right?

Contributor Author

My logic is: at the point of a chain prune, the set of mined transactions is made up of two subsets: the invalid transactions (i.e. those that were built against a block that no longer exists) and everything else.

The "everything else" set contains all transactions that are still good (i.e. built from a finalized block) but were mined in a pruned block, so it should be safe to re-add them to pending, as every other transaction was either removed or finalized.

WDYT?

Collaborator

This logic seems reasonable, I think. Doesn't it, @spalladino?

Collaborator

@spalladino spalladino Oct 29, 2024

I think it works only if keepProvenTxsFor is set to zero. Otherwise, the set of mined txs will include some txs that are on the proven chain, which should not get pruned, and so should not be moved to pending. But to be fair, we only included keepProvenTxsFor to account for scenarios where a prover may want to prove something already proven (i.e. provernet v1), so it is no longer relevant.

Collaborator

Note this may change when we add support for L1 reorgs. We'll want to clear txs from the p2p pool once they are finalized (in the sense of them being in a block that is proven and L1 finalized) and not just proven. Then this line would have to change to something like markMinedAsPending(txPool.getMinedButNotProvenTxHashes()).

Collaborator

Ok, sounds like we can go ahead and merge and deal with that condition later.

Contributor Author

Changed this in the last commit to track which block mined a tx. This should also cover the case where we hold onto txs for longer than the proven chain (this.keepProvenTxsFor).
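
To make the outcome of this thread concrete, here is a small in-memory sketch of a pool that records the block each tx was mined in and, on a prune, moves only the txs mined after the surviving block back to pending. `SketchTxPool` and its method names are hypothetical and do not mirror the real `txPool` interface. The sketch also leaves out the other half of the real handling, namely dropping txs that were built against a pruned block.

```typescript
type TxHash = string;

// Hypothetical in-memory pool, for illustration only.
class SketchTxPool {
  private pending = new Set<TxHash>();
  // tx hash -> number of the block that mined it
  private mined = new Map<TxHash, number>();

  addPending(hash: TxHash): void {
    this.pending.add(hash);
  }

  markAsMined(hashes: TxHash[], blockNumber: number): void {
    for (const hash of hashes) {
      this.pending.delete(hash);
      this.mined.set(hash, blockNumber);
    }
  }

  // After a prune back to latestBlock, txs mined in later blocks return to
  // pending; txs mined at or before latestBlock stay mined.
  handlePrune(latestBlock: number): TxHash[] {
    const moved: TxHash[] = [];
    this.mined.forEach((blockNumber, hash) => {
      if (blockNumber > latestBlock) {
        moved.push(hash);
      }
    });
    for (const hash of moved) {
      this.mined.delete(hash);
      this.pending.add(hash);
    }
    return moved;
  }

  getPendingTxHashes(): TxHash[] {
    return Array.from(this.pending);
  }

  getMinedTxHashes(): TxHash[] {
    return Array.from(this.mined.keys());
  }
}
```

Because the mined block is stored per tx, txs kept around for `keepProvenTxsFor` blocks on the proven chain are left untouched, which is the case raised earlier in the thread.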

@spalladino
Collaborator

Also, once merged, we should tackle #9316

@alexghr alexghr enabled auto-merge (squash) October 30, 2024 13:21
@alexghr alexghr merged commit 6650641 into master Oct 30, 2024
98 checks passed
@alexghr alexghr deleted the ag/restart-bot branch October 30, 2024 14:57
rahul-kothari pushed a commit that referenced this pull request Nov 1, 2024
🤖 I have created a release *beep* *boop*
---


<details><summary>aztec-package: 0.62.0</summary>

##
[0.62.0](aztec-package-v0.61.0...aztec-package-v0.62.0)
(2024-11-01)


### Features

* Token private mint optimization
([#9606](#9606))
([e8fadc7](e8fadc7))


### Bug Fixes

* **k8s:** Boot node long sync
([#9610](#9610))
([1b85840](1b85840))
* Multi-node metrics working
([#9486](#9486))
([fd974e1](fd974e1))
* Stop bot in case of tx errors
([#9421](#9421))
([6650641](6650641))


### Miscellaneous

* Replacing unshield naming with transfer_to_public
([#9608](#9608))
([247e9eb](247e9eb))
* Token partial notes refactor pt. 1
([#9490](#9490))
([3d631f5](3d631f5))
</details>

<details><summary>barretenberg.js: 0.62.0</summary>

##
[0.62.0](barretenberg.js-v0.61.0...barretenberg.js-v0.62.0)
(2024-11-01)


### Features

* Faster square roots
([#2694](#2694))
([722ec5c](722ec5c))
</details>

<details><summary>aztec-packages: 0.62.0</summary>

##
[0.62.0](aztec-packages-v0.61.0...aztec-packages-v0.62.0)
(2024-11-01)


### ⚠ BREAKING CHANGES

* **avm:** use 32 bit locations
([#9596](#9596))
* Unique L1 to L2 messages
([#9492](#9492))

### Features

* Add increment secret oracles
([#9573](#9573))
([97a4c0c](97a4c0c))
* **avm:** Use 32 bit locations
([#9596](#9596))
([5f38696](5f38696))
* Barebones addressbook for tagging
([#9572](#9572))
([6526069](6526069))
* Biggroup_goblin handles points at infinity + 1.8x reduction in ECCVM
size
([#9366](#9366))
([9211d8a](9211d8a))
* Faster square roots
([#2694](#2694))
([722ec5c](722ec5c))
* Fixed private log size
([#9585](#9585))
([755c70a](755c70a))
* Removing register recipient in e2e tests as it is unnecessary now !
([#9499](#9499))
([9f52cbb](9f52cbb))
* Reorg test
([#9607](#9607))
([54488b3](54488b3))
* Simulate validateEpochProofQuoteHeader in the future
([#9641](#9641))
([284c8f8](284c8f8))
* Spartan proving
([#9584](#9584))
([392114a](392114a))
* Sync tagged logs
([#9595](#9595))
([0cc4a48](0cc4a48))
* Token private mint optimization
([#9606](#9606))
([e8fadc7](e8fadc7))
* Unique L1 to L2 messages
([#9492](#9492))
([4e5ae95](4e5ae95)),
closes
[#9450](#9450)


### Bug Fixes

* E2e event logs test
([#9621](#9621))
([737c573](737c573))
* E2e labels
([#9609](#9609))
([ed1deb9](ed1deb9))
* Ensuring translator range constraint polynomials are zeroes outside of
minicircuit
([#9251](#9251))
([04dd2c4](04dd2c4))
* EventMetadata class implementation for serialisation
([#9574](#9574))
([bdff73a](bdff73a))
* Force bb-sanitizers true
([#9614](#9614))
([39cda86](39cda86))
* **k8s:** Boot node long sync
([#9610](#9610))
([1b85840](1b85840))
* Multi-node metrics working
([#9486](#9486))
([fd974e1](fd974e1))
* Remove all register recipient functionality in ts
([#9548](#9548))
([2f7127b](2f7127b))
* Remove unnecessary ivpk references in ts
([#9463](#9463))
([0c5121f](0c5121f))
* Resolution of bugs from bigfield audits
([#9547](#9547))
([feace70](feace70))
* Stop bot in case of tx errors
([#9421](#9421))
([6650641](6650641))
* Typing of artifacts
([#9581](#9581))
([c71645f](c71645f))


### Miscellaneous

* Add guides to get_e2e_jobs.sh
([#9624](#9624))
([8891ead](8891ead))
* Add sender to encode and encrypt
([#9562](#9562))
([8ce6834](8ce6834))
* Add signed int deserialization to decoder
([#9557](#9557))
([0435d00](0435d00))
* Bb sanitizers on master
([#9564](#9564))
([747bff1](747bff1))
* Cleaning up token test utils
([#9633](#9633))
([325bdb0](325bdb0))
* Disable breaking e2e_event_logs test
([#9602](#9602))
([cf2ca2e](cf2ca2e))
* Dont generate vks for simulated circuits
([#9625](#9625))
([366eff3](366eff3))
* Fixing broken sample-dapp tests
([#9597](#9597))
([5e52900](5e52900))
* Nuking `Token::privately_mint_private_note(...)`
([#9616](#9616))
([bf53f5e](bf53f5e))
* Pass on docker_fast.sh
([#9615](#9615))
([1c53459](1c53459))
* Remove outgoing tagging field in logs
([#9502](#9502))
([c473380](c473380))
* Replace relative paths to noir-protocol-circuits
([288099b](288099b))
* Replacing unshield naming with transfer_to_public
([#9608](#9608))
([247e9eb](247e9eb))
* Token partial notes refactor pt. 1
([#9490](#9490))
([3d631f5](3d631f5))
</details>

<details><summary>barretenberg: 0.62.0</summary>

##
[0.62.0](barretenberg-v0.61.0...barretenberg-v0.62.0)
(2024-11-01)


### ⚠ BREAKING CHANGES

* **avm:** use 32 bit locations
([#9596](#9596))

### Features

* **avm:** Use 32 bit locations
([#9596](#9596))
([5f38696](5f38696))
* Biggroup_goblin handles points at infinity + 1.8x reduction in ECCVM
size
([#9366](#9366))
([9211d8a](9211d8a))
* Faster square roots
([#2694](#2694))
([722ec5c](722ec5c))
* Spartan proving
([#9584](#9584))
([392114a](392114a))


### Bug Fixes

* Ensuring translator range constraint polynomials are zeroes outside of
minicircuit
([#9251](#9251))
([04dd2c4](04dd2c4))
* Resolution of bugs from bigfield audits
([#9547](#9547))
([feace70](feace70))


### Miscellaneous

* Bb sanitizers on master
([#9564](#9564))
([747bff1](747bff1))
* Pass on docker_fast.sh
([#9615](#9615))
([1c53459](1c53459))
</details>

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
AztecBot added a commit to AztecProtocol/barretenberg that referenced this pull request Nov 2, 2024
Labels
e2e-all CI: Enables this CI job.
3 participants