Automatically refresh materialized views #4461

Merged (2 commits) — Mar 16, 2023
7 changes: 5 additions & 2 deletions NEWS.md
@@ -2,6 +2,9 @@

## Unreleased

- The materialized views in the `info` schema (`table_sizes`, `subgraph_sizes`, and `chain_sizes`), which provide information about the size of various database objects, are now automatically refreshed every 6 hours. [#4461](https://github.com/graphprotocol/graph-node/pull/4461)


## v0.30.0

### Database locale change
@@ -87,15 +90,15 @@ Dependency upgrades:

- `Qmccst5mbV5a6vT6VvJMLPKMAA1VRgT6NGbxkLL8eDRsE7`
- `Qmd9nZKCH8UZU1pBzk7G8ECJr3jX3a2vAf3vowuTwFvrQg`

Here's an example [manifest](https://ipfs.io/ipfs/Qmd9nZKCH8UZU1pBzk7G8ECJr3jX3a2vAf3vowuTwFvrQg). Looking at its data sources named `ERC721` and `CryptoKitties`, both listen to the `Transfer(...)` event. In a block with only one occurrence of this event, `graph-node` would duplicate it and call `handleTransfer` twice. This is now fixed, and the handler is called only once per event/call that happened on chain.

If you're indexing one of the impacted subgraphs, first upgrade `graph-node`, then rewind the affected subgraphs to the smallest `startBlock` in their manifests. The `graphman rewind` CLI command can be used for this.

See [#4055](https://github.com/graphprotocol/graph-node/pull/4055) for more information.

* This release fixes another determinism bug that affects a handful of subgraphs. The bug affects all subgraphs which have an `apiVersion` **older than** 0.0.5 using call handlers. While call handlers prior to 0.0.5 should be triggered by both failed and successful transactions, in some cases failed transactions would not trigger the handlers. This resulted in nondeterministic behavior. With this version of `graph-node`, call handlers with an `apiVersion` older than 0.0.5 will always be triggered by both successful and failed transactions. Behavior for `apiVersion` 0.0.5 onward is not affected.

The affected subgraphs are:

- `QmNY7gDNXHECV8SXoEY7hbfg4BX1aDMxTBDiFuG4huaSGA`
29 changes: 29 additions & 0 deletions store/postgres/src/deployment_store.rs
@@ -1658,6 +1658,35 @@ impl DeploymentStore {
});
}

pub(crate) async fn refresh_materialized_views(&self, logger: &Logger) {
async fn run(store: &DeploymentStore) -> Result<(), StoreError> {
// We hardcode our materialized views, but could also use
// pg_matviews to list all of them, though that might inadvertently
// refresh materialized views that operators created themselves
const VIEWS: [&str; 3] = [
"info.table_sizes",
"info.subgraph_sizes",
"info.chain_sizes",
];
store
.with_conn(|conn, cancel| {
for view in VIEWS {
let query = format!("refresh materialized view {}", view);
diesel::sql_query(&query).execute(conn)?;
cancel.check_cancel()?;
}
Ok(())
})
.await
}

run(self).await.unwrap_or_else(|e| {
warn!(logger, "Refreshing materialized views failed. We will try again in a few hours";
"error" => e.to_string(),
"shard" => self.pool.shard.as_str())
});
}

pub(crate) async fn health(
&self,
site: &Site,
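The refresh job above boils down to issuing one `refresh materialized view` statement per hardcoded view, with a cancellation check between statements. A minimal sketch of the statement generation, detached from diesel and the connection pool (the `VIEWS` list mirrors the PR; everything else is illustrative):

```rust
// Sketch only: mirrors the hardcoded view list from `refresh_materialized_views`,
// but builds the SQL statements instead of executing them through diesel.
const VIEWS: [&str; 3] = [
    "info.table_sizes",
    "info.subgraph_sizes",
    "info.chain_sizes",
];

fn refresh_statements() -> Vec<String> {
    VIEWS
        .iter()
        .map(|view| format!("refresh materialized view {}", view))
        .collect()
}

fn main() {
    for stmt in refresh_statements() {
        // In the real job, each statement runs via `diesel::sql_query` and is
        // followed by a `cancel.check_cancel()?` before the next view.
        println!("{}", stmt);
    }
}
```

Hardcoding the list, rather than reading `pg_matviews`, is a deliberate choice noted in the PR: it avoids refreshing materialized views that operators created themselves.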
39 changes: 34 additions & 5 deletions store/postgres/src/jobs.rs
@@ -20,26 +20,34 @@ pub fn register(
primary_pool: ConnectionPool,
registry: Arc<dyn MetricsRegistryTrait>,
) {
+    const ONE_MINUTE: Duration = Duration::from_secs(60);
+    const ONE_HOUR: Duration = Duration::from_secs(60 * 60);
+
     runner.register(
         Arc::new(VacuumDeploymentsJob::new(store.subgraph_store())),
-        Duration::from_secs(60),
+        ONE_MINUTE,
     );

     runner.register(
         Arc::new(NotificationQueueUsage::new(primary_pool, registry)),
-        Duration::from_secs(60),
+        ONE_MINUTE,
     );

     runner.register(
         Arc::new(MirrorPrimary::new(store.subgraph_store())),
-        Duration::from_secs(15 * 60),
+        15 * ONE_MINUTE,
     );

     // Remove unused deployments every 2 hours
     runner.register(
         Arc::new(UnusedJob::new(store.subgraph_store())),
-        Duration::from_secs(2 * 60 * 60),
-    )
+        2 * ONE_HOUR,
+    );
+
+    runner.register(
+        Arc::new(RefreshMaterializedView::new(store.subgraph_store())),
+        6 * ONE_HOUR,
+    );
}

/// A job that vacuums `subgraphs.subgraph_deployment`. With a large number
@@ -149,6 +157,27 @@ impl Job for MirrorPrimary {
}
}

struct RefreshMaterializedView {
store: Arc<SubgraphStore>,
}

impl RefreshMaterializedView {
fn new(store: Arc<SubgraphStore>) -> Self {
Self { store }
}
}

#[async_trait]
impl Job for RefreshMaterializedView {
fn name(&self) -> &str {
"Refresh materialized views"
}

async fn run(&self, logger: &Logger) {
self.store.refresh_materialized_views(logger).await;
}
}

struct UnusedJob {
store: Arc<SubgraphStore>,
}
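The interval arithmetic in `register` works because the standard library implements multiplication of a `Duration` by an integer, so the new constants compose into readable products. A standalone sketch (the constant names come from the PR):

```rust
use std::time::Duration;

const ONE_MINUTE: Duration = Duration::from_secs(60);
const ONE_HOUR: Duration = Duration::from_secs(60 * 60);

fn main() {
    // `u32 * Duration` is provided by the standard library, so job intervals
    // can be written as products of the base constants instead of raw seconds.
    let mirror = 15 * ONE_MINUTE;
    let unused = 2 * ONE_HOUR;
    let refresh = 6 * ONE_HOUR;
    println!("{} {} {}", mirror.as_secs(), unused.as_secs(), refresh.as_secs());
}
```

Expressing `6 * ONE_HOUR` rather than `Duration::from_secs(6 * 60 * 60)` keeps the schedule self-documenting, which is the point of the small refactor in this hunk.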
9 changes: 9 additions & 0 deletions store/postgres/src/subgraph_store.rs
@@ -1029,6 +1029,15 @@ impl SubgraphStoreInner {
.await;
}

pub async fn refresh_materialized_views(&self, logger: &Logger) {
join_all(
self.stores
.values()
.map(|store| store.refresh_materialized_views(logger)),
)
.await;
}

pub fn analyze(
&self,
deployment: &DeploymentLocator,
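`refresh_materialized_views` on `SubgraphStoreInner` fans out to every shard's `DeploymentStore` and awaits all refreshes together via `join_all`. A rough stand-in for that fan-out, using threads in place of futures (the shard names and the simulated "refresh" are hypothetical, not from the PR):

```rust
use std::thread;

// Thread-based stand-in for the `join_all` fan-out across shards: each
// "store" here is just a shard name, and the refresh work is simulated.
fn refresh_all(shards: Vec<&'static str>) -> Vec<String> {
    let handles: Vec<_> = shards
        .into_iter()
        .map(|shard| thread::spawn(move || format!("refreshed views on shard {}", shard)))
        .collect();
    // Wait for every shard to finish, analogous to `join_all(...).await`.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    for line in refresh_all(vec!["primary", "shard_a"]) {
        println!("{}", line);
    }
}
```

As in the real code, each shard does its own refresh and a slow shard only delays completion, not the other shards' work.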