Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[4/n][gql-performance] Indexer handles partitioning per epoch's worth of tx_sequence_number #18224

Merged

Conversation

wlmyng
Copy link
Contributor

@wlmyng wlmyng commented Jun 12, 2024

Description

To partition an epoch per tx_sequence_number, the last tx_sequence_number of an epoch is the network_total_transactions from the last checkpoint in the current epoch. The first tx_sequence_number is then the network_total_transactions from the last checkpoint in the previous epoch + 1, or the network_total_transactions of the current epoch - epoch_total_transactions of the current epoch.

Since we have network_total_transactions from the checkpoint to commit, and epoch_total_transactions since we fetch current epoch from db, we can derive the range without additional db fetches.

However, this approach uncovered a bug of sorts: when we read the network total txn of epoch X - 2 from the db on the indexing side, we select max(network_total_transactions) from checkpoints where epoch = {epoch}. However, we may not have actually committed epoch X-2's data on the committer side. This is unlikely to happen in prod because it requires the lag between indexing and committing to be >= one epoch. Since an epoch today consists of ~80k checkpoints, it is quite improbably for this to occur. However, in many of our tests it's possible to have as little as one checkpoint and/ or transaction between epochs, which brought this race condition to light.

To resolve, we can place the responsibility on persist_epoch. When we get to a point to persist epoch X - 1, epoch X - 2's data must have been indexed to db.

Test plan

Existing tests


Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • Protocol:
  • Nodes (Validators and Full nodes):
  • Indexer:
  • JSON-RPC:
  • GraphQL:
  • CLI:
  • Rust SDK:

Copy link

vercel bot commented Jun 12, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
sui-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jun 18, 2024 7:05pm
3 Ignored Deployments
Name Status Preview Comments Updated (UTC)
multisig-toolkit ⬜️ Ignored (Inspect) Visit Preview Jun 18, 2024 7:05pm
sui-kiosk ⬜️ Ignored (Inspect) Visit Preview Jun 18, 2024 7:05pm
sui-typescript-docs ⬜️ Ignored (Inspect) Visit Preview Jun 18, 2024 7:05pm

@wlmyng wlmyng force-pushed the wlmyng/indexer/scan-limit-tx-approach branch from 33a826e to c373fad Compare June 12, 2024 20:45
@wlmyng wlmyng changed the base branch from main to wlmyng/indexer/scan-limit-tx-approach-update-indexing June 13, 2024 20:02
@wlmyng wlmyng force-pushed the wlmyng/indexer/scan-limit-tx-approach branch from c0a711c to 8666105 Compare June 13, 2024 20:03
@wlmyng wlmyng force-pushed the wlmyng/indexer/scan-limit-tx-approach-update-indexing branch 2 times, most recently from fdb9a42 to 537c908 Compare June 13, 2024 21:10
@wlmyng wlmyng force-pushed the wlmyng/indexer/scan-limit-tx-approach branch from 89172bd to 734e467 Compare June 13, 2024 21:17
@wlmyng wlmyng changed the base branch from wlmyng/indexer/scan-limit-tx-approach-update-indexing to wlmyng/indexer/scan-limit-tx-approach-add-cp-tx-table June 13, 2024 21:17
@wlmyng wlmyng changed the title Wlmyng/indexer/scan limit tx approach [4/n][gql-performance] Indexer handle different partitioning strategies Jun 13, 2024
@wlmyng wlmyng marked this pull request as ready for review June 13, 2024 21:35
@wlmyng wlmyng force-pushed the wlmyng/indexer/scan-limit-tx-approach branch from fe0f0ba to cd3b53c Compare June 13, 2024 21:55
@wlmyng wlmyng changed the title [4/n][gql-performance] Indexer handle different partitioning strategies [4/n][gql-performance] Indexer handles different partitioning strategies Jun 13, 2024
@wlmyng wlmyng requested a review from gegaowp June 13, 2024 22:29
Copy link
Contributor

@gegaowp gegaowp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good

@wlmyng wlmyng force-pushed the wlmyng/indexer/scan-limit-tx-approach-add-cp-tx-table branch 2 times, most recently from c6644a0 to 2c93f86 Compare June 14, 2024 18:20
@wlmyng wlmyng force-pushed the wlmyng/indexer/scan-limit-tx-approach branch from cd3b53c to ca5f573 Compare June 14, 2024 18:24
@wlmyng wlmyng force-pushed the wlmyng/indexer/scan-limit-tx-approach-add-cp-tx-table branch from 291bdd8 to 0de380b Compare June 17, 2024 23:04
@wlmyng wlmyng force-pushed the wlmyng/indexer/scan-limit-tx-approach branch from 1bb3397 to 0950b69 Compare June 17, 2024 23:10
@wlmyng wlmyng force-pushed the wlmyng/indexer/scan-limit-tx-approach branch from 0950b69 to ad3c136 Compare June 17, 2024 23:20
wlmyng added a commit that referenced this pull request Jun 18, 2024
…iles for pg-backed indexer per transactions benchmark findings (#18231)

## Description 

Apply findings from latest transactions benchmarking. Instead of keeping
indexes on `transactions`, we offload all filtering work to the lookup
tables. Additionally, we add `tx_calls_pkg` and `tx_calls_mod` and
dedicate `tx_calls_fun` to solely handle filters on `package`, `package`
and `module`, and `package`, `module`, and `func` respectively. This is
because it's possible to invoke the same package, mod, and even func in
a transaction.

Finally, the new setup drops `cp_sequence_number` from the lookup tables
and adds `sender` to them to support `sender` + specialized filter. The
Rust structs and code to index data to the lookup tables are also
updated.

The full stack of changes are listed below, but each can stand
independently, and will be individually merged into
`idx-breaking-change-park`
1. #18224 - handling tables
partitioned on epoch's worth of tx_sequence_number
2. #18244 - storing a mapping of
checkpoint -> tx_sequence_number range

## Test plan 

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol: 
- [ ] Nodes (Validators and Full nodes): 
- [ ] Indexer: 
- [ ] JSON-RPC: 
- [ ] GraphQL: 
- [ ] CLI: 
- [ ] Rust SDK:
@wlmyng wlmyng force-pushed the wlmyng/indexer/scan-limit-tx-approach branch from ad3c136 to 885d8e8 Compare June 18, 2024 00:18
@wlmyng wlmyng changed the base branch from wlmyng/indexer/scan-limit-tx-approach-add-cp-tx-table to idx-breaking-change-park June 18, 2024 00:18
by exposing an additional parameter network_total_transactions, we can derive the tx_sequence_number range of an epoch without additional fetches. however, this can be done more cleanly if we can decouple watermarking from checkpoint persisting
@wlmyng wlmyng force-pushed the wlmyng/indexer/scan-limit-tx-approach branch from 885d8e8 to b8b0fc0 Compare June 18, 2024 00:22
@wlmyng wlmyng changed the title [4/n][gql-performance] Indexer handles different partitioning strategies [4/n][gql-performance] Indexer handles partitioning per epoch's worth of tx_sequence_number Jun 18, 2024
…- 2 from the db on the indexing side as we advance epoch from epoch X - 1 to X, we may not have committed this data of X - 2 on the committer side. This is unlikely to happen in prod because it requires the lag between indexing and committing to be >= one epoch for the issue to manifest. And we never caught this issue before in testing because we never had to use this network total txn data before.
… into the transaction block instead of making a separate roundtrip
@wlmyng wlmyng force-pushed the wlmyng/indexer/scan-limit-tx-approach branch from 833eb0d to b371df2 Compare June 18, 2024 19:04
@wlmyng wlmyng merged commit aef39fa into idx-breaking-change-park Jun 18, 2024
44 of 46 checks passed
@wlmyng wlmyng deleted the wlmyng/indexer/scan-limit-tx-approach branch June 18, 2024 19:43
emmazzz pushed a commit that referenced this pull request Jun 20, 2024
…iles for pg-backed indexer per transactions benchmark findings (#18231)

Apply findings from latest transactions benchmarking. Instead of keeping
indexes on `transactions`, we offload all filtering work to the lookup
tables. Additionally, we add `tx_calls_pkg` and `tx_calls_mod` and
dedicate `tx_calls_fun` to solely handle filters on `package`, `package`
and `module`, and `package`, `module`, and `func` respectively. This is
because it's possible to invoke the same package, mod, and even func in
a transaction.

Finally, the new setup drops `cp_sequence_number` from the lookup tables
and adds `sender` to them to support `sender` + specialized filter. The
Rust structs and code to index data to the lookup tables are also
updated.

The full stack of changes are listed below, but each can stand
independently, and will be individually merged into
`idx-breaking-change-park`
1. #18224 - handling tables
partitioned on epoch's worth of tx_sequence_number
2. #18244 - storing a mapping of
checkpoint -> tx_sequence_number range

---

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [ ] Indexer:
- [ ] JSON-RPC:
- [ ] GraphQL:
- [ ] CLI:
- [ ] Rust SDK:
emmazzz pushed a commit that referenced this pull request Jun 20, 2024
… of tx_sequence_number (#18224)

## Description 

To partition an epoch per `tx_sequence_number`, the last
`tx_sequence_number` of an epoch is the `network_total_transactions`
from the last checkpoint in the current epoch. The first
`tx_sequence_number` is then the `network_total_transactions` from the
last checkpoint in the previous epoch + 1, or the
`network_total_transactions` of the current epoch -
`epoch_total_transactions` of the current epoch.

Since we have `network_total_transactions` from the checkpoint to
commit, and `epoch_total_transactions` since we fetch current epoch from
db, we can derive the range without additional db fetches.

However, this approach uncovered a bug of sorts: when we read the
network total txn of epoch X - 2 from the db on the indexing side, we
`select max(network_total_transactions) from checkpoints where epoch =
{epoch}`. However, we may not have actually committed epoch X-2's data
on the committer side. This is unlikely to happen in prod because it
requires the lag between indexing and committing to be >= one epoch.
Since an epoch today consists of ~80k checkpoints, it is quite
improbably for this to occur. However, in many of our tests it's
possible to have as little as one checkpoint and/ or transaction between
epochs, which brought this race condition to light.

To resolve, we can place the responsibility on `persist_epoch`. When we
get to a point to persist epoch X - 1, epoch X - 2's data must have been
indexed to db.

## Test plan 

Existing tests

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol: 
- [ ] Nodes (Validators and Full nodes): 
- [ ] Indexer: 
- [ ] JSON-RPC: 
- [ ] GraphQL: 
- [ ] CLI: 
- [ ] Rust SDK:
emmazzz pushed a commit that referenced this pull request Jun 20, 2024
…iles for pg-backed indexer per transactions benchmark findings (#18231)

Apply findings from latest transactions benchmarking. Instead of keeping
indexes on `transactions`, we offload all filtering work to the lookup
tables. Additionally, we add `tx_calls_pkg` and `tx_calls_mod` and
dedicate `tx_calls_fun` to solely handle filters on `package`, `package`
and `module`, and `package`, `module`, and `func` respectively. This is
because it's possible to invoke the same package, mod, and even func in
a transaction.

Finally, the new setup drops `cp_sequence_number` from the lookup tables
and adds `sender` to them to support `sender` + specialized filter. The
Rust structs and code to index data to the lookup tables are also
updated.

The full stack of changes are listed below, but each can stand
independently, and will be individually merged into
`idx-breaking-change-park`
1. #18224 - handling tables
partitioned on epoch's worth of tx_sequence_number
2. #18244 - storing a mapping of
checkpoint -> tx_sequence_number range

---

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [ ] Indexer:
- [ ] JSON-RPC:
- [ ] GraphQL:
- [ ] CLI:
- [ ] Rust SDK:
emmazzz pushed a commit that referenced this pull request Jun 22, 2024
…iles for pg-backed indexer per transactions benchmark findings (#18231)

Apply findings from latest transactions benchmarking. Instead of keeping
indexes on `transactions`, we offload all filtering work to the lookup
tables. Additionally, we add `tx_calls_pkg` and `tx_calls_mod` and
dedicate `tx_calls_fun` to solely handle filters on `package`, `package`
and `module`, and `package`, `module`, and `func` respectively. This is
because it's possible to invoke the same package, mod, and even func in
a transaction.

Finally, the new setup drops `cp_sequence_number` from the lookup tables
and adds `sender` to them to support `sender` + specialized filter. The
Rust structs and code to index data to the lookup tables are also
updated.

The full stack of changes are listed below, but each can stand
independently, and will be individually merged into
`idx-breaking-change-park`
1. #18224 - handling tables
partitioned on epoch's worth of tx_sequence_number
2. #18244 - storing a mapping of
checkpoint -> tx_sequence_number range

---

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [ ] Indexer:
- [ ] JSON-RPC:
- [ ] GraphQL:
- [ ] CLI:
- [ ] Rust SDK:
emmazzz pushed a commit that referenced this pull request Aug 3, 2024
…iles for pg-backed indexer per transactions benchmark findings (#18231)

Apply findings from latest transactions benchmarking. Instead of keeping
indexes on `transactions`, we offload all filtering work to the lookup
tables. Additionally, we add `tx_calls_pkg` and `tx_calls_mod` and
dedicate `tx_calls_fun` to solely handle filters on `package`, `package`
and `module`, and `package`, `module`, and `func` respectively. This is
because it's possible to invoke the same package, mod, and even func in
a transaction.

Finally, the new setup drops `cp_sequence_number` from the lookup tables
and adds `sender` to them to support `sender` + specialized filter. The
Rust structs and code to index data to the lookup tables are also
updated.

The full stack of changes are listed below, but each can stand
independently, and will be individually merged into
`idx-breaking-change-park`
1. #18224 - handling tables
partitioned on epoch's worth of tx_sequence_number
2. #18244 - storing a mapping of
checkpoint -> tx_sequence_number range

---

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [ ] Indexer:
- [ ] JSON-RPC:
- [ ] GraphQL:
- [ ] CLI:
- [ ] Rust SDK:
emmazzz pushed a commit that referenced this pull request Aug 3, 2024
… of tx_sequence_number (#18224)

To partition an epoch per `tx_sequence_number`, the last
`tx_sequence_number` of an epoch is the `network_total_transactions`
from the last checkpoint in the current epoch. The first
`tx_sequence_number` is then the `network_total_transactions` from the
last checkpoint in the previous epoch + 1, or the
`network_total_transactions` of the current epoch -
`epoch_total_transactions` of the current epoch.

Since we have `network_total_transactions` from the checkpoint to
commit, and `epoch_total_transactions` since we fetch current epoch from
db, we can derive the range without additional db fetches.

However, this approach uncovered a bug of sorts: when we read the
network total txn of epoch X - 2 from the db on the indexing side, we
`select max(network_total_transactions) from checkpoints where epoch =
{epoch}`. However, we may not have actually committed epoch X-2's data
on the committer side. This is unlikely to happen in prod because it
requires the lag between indexing and committing to be >= one epoch.
Since an epoch today consists of ~80k checkpoints, it is quite
improbably for this to occur. However, in many of our tests it's
possible to have as little as one checkpoint and/ or transaction between
epochs, which brought this race condition to light.

To resolve, we can place the responsibility on `persist_epoch`. When we
get to a point to persist epoch X - 1, epoch X - 2's data must have been
indexed to db.

Existing tests

---

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [ ] Indexer:
- [ ] JSON-RPC:
- [ ] GraphQL:
- [ ] CLI:
- [ ] Rust SDK:
wlmyng added a commit that referenced this pull request Aug 15, 2024
…iles for pg-backed indexer per transactions benchmark findings (#18231)

Apply findings from latest transactions benchmarking. Instead of keeping
indexes on `transactions`, we offload all filtering work to the lookup
tables. Additionally, we add `tx_calls_pkg` and `tx_calls_mod` and
dedicate `tx_calls_fun` to solely handle filters on `package`, `package`
and `module`, and `package`, `module`, and `func` respectively. This is
because it's possible to invoke the same package, mod, and even func in
a transaction.

Finally, the new setup drops `cp_sequence_number` from the lookup tables
and adds `sender` to them to support `sender` + specialized filter. The
Rust structs and code to index data to the lookup tables are also
updated.

The full stack of changes are listed below, but each can stand
independently, and will be individually merged into
`idx-breaking-change-park`
1. #18224 - handling tables
partitioned on epoch's worth of tx_sequence_number
2. #18244 - storing a mapping of
checkpoint -> tx_sequence_number range

---

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [ ] Indexer:
- [ ] JSON-RPC:
- [ ] GraphQL:
- [ ] CLI:
- [ ] Rust SDK:
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants