Skip to content

Commit

Permalink
docs(graphql): explain scan limits rationale
Browse files Browse the repository at this point in the history
  • Loading branch information
amnn committed Sep 13, 2024
1 parent d16c114 commit c6f728b
Showing 1 changed file with 49 additions and 0 deletions.
49 changes: 49 additions & 0 deletions crates/sui-graphql-rpc/src/types/transaction_block/tx_lookups.rs
Original file line number Diff line number Diff line change
@@ -1,6 +1,55 @@
// Copyright (c) Mysten Labs, Inc.
// SPDX-License-Identifier: Apache-2.0

//! # Transaction Filter Lookup Tables
//!
//! ## Schemas
//!
//! Tables backing Transaction filters in GraphQL all follow the same rough shape:
//!
//! 1. They each get their own table, mapping the filter value to the transaction sequence number.
//!
//! 2. They also include a `sender` column, and a secondary index over the sender, filter values
//! and the transaction sequence number.
//!
//! 3. They also include a secondary index over the transaction sequence number.
//!
//! This pattern allows us to offer a simple rule for users: If you are filtering on a single
//! value, you can do so without worrying. If you want to additionally filter by the sender, that
//! is also possible, but if you want to combine any other set of filters, you need to use a "scan
//! limit".
//!
//! ## Query construction
//!
//! Queries that filter transactions work in two phases: Identify the transaction sequence numbers
//! to fetch, and then fetch their contents. Filtering all happens in the first phase:
//!
//! - Firstly filters are broken down into individual queries targeting the appropriate lookup
//! table. Each constituent query is expected to return a sorted run of transaction sequence
//! numbers.
//!
//! - If a `sender` filter is included, then it is incorporated into each constituent query,
//! leveraging their secondary indices (2), otherwise each constituent query filters only based on
//! its filter value using the primary index (1).
//!
//! - The fact that both the primary and secondary indices contain the transaction sequence number
//! help to ensure that the output from an index scan is already sorted, which avoids a
//! potentially expensive materialize and sort operation.
//!
//! - If there are multiple constituent queries, they are intersected using inner joins. Postgres
//! can occasionally pick a poor query plan for this merge, so we require that filters resulting in
//! such merges also use a "scan limit" (see below).
//!
//! ## Scan limits
//!
//! The scan limit restricts the number of transactions considered as candidates for the results.
//! It is analogous to the page size limit, which restricts the number of results returned to the
//! user, but it operates at the top of the funnel rather than the top.
//!
//! When postgres picks a poor query plan, it can end up performing a sequential scan over all
//! candidate transactions. By limiting the size of the candidate set, we bound the work done in
//! the worse case (whereas otherwise, the worst case would grow with the history of the chain).

use super::{Cursor, TransactionBlockFilter};
use crate::{
data::{pg::bytea_literal, Conn, DbConnection},
Expand Down

0 comments on commit c6f728b

Please sign in to comment.