Chunk size patch #3624

Conversation

Ryabina-io

Closes #2330.
I added +2 to the chunk size calculation denominator, as @tilacog advised, and our subgraph continued indexing.
I don't know the internals of table.columns.len(); does it count full-text search columns, for example? But with this PR I want to push this issue forward somehow.
I also opened this PR against the 0.26.0 release branch because I only tested the change on top of 0.26.0, and I don't know the right process.
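For context, here is a minimal sketch of the change being described, assuming the chunk size divides Postgres's 65535-bind-parameter limit by an estimate of one binding per column plus one for the block range (the names and the base formula are inferred, not the actual graph-node code):

// Postgres limits a single statement to 65535 bind parameters.
const POSTGRES_MAX_PARAMETERS: usize = 65535;

// Assumed current estimate: one binding per column, plus one for the
// block range.
fn chunk_size(num_columns: usize) -> usize {
    POSTGRES_MAX_PARAMETERS / (num_columns + 1)
}

// With this patch, the denominator gains 2, as headroom for any
// bindings the per-column count misses.
fn chunk_size_patched(num_columns: usize) -> usize {
    POSTGRES_MAX_PARAMETERS / (num_columns + 1 + 2)
}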

@lutter
Collaborator

lutter commented Jun 15, 2022

table.columns does include fulltext columns (see here), but we never insert the vid; that is determined by the database.

Are we sure the issue is caused by the InsertQuery? Looking at Layout.update, it runs two queries: ClampRangeQuery and InsertQuery. ClampRangeQuery here always uses 2 bind variables, so it is not the source of the error. InsertQuery here uses at most table.columns.len() + 1 bind variables per entity, which makes me think the math there is correct.

Since this does fail in practice, there must be some mistake in that calculation, and I'd really like to know what it is. Is there any chance you can run a failing subgraph locally and have graph-node print the details of the calculation here, in particular the number of entities, number of entity keys, number of columns, and the chunk size?

@Ryabina-io
Author

@lutter I added logging in https://github.com/Ryabina-io/graph-node/blob/chunk_size_debug/store/postgres/src/relational.rs#L807
Logs around the subgraph crash:

Jun 23 14:46:47.355 INFO Applying 13789 entity operation(s), block_hash: 0xb9db6d9c41efce3b52d0515e0ed7b1d9d14e6378e8514241b51f44680df5eef4, block_number: 13907188, sgd: 1, subgraph_id: QmchHuvNLcLRGt4C3pLArRXtkECPPCaZqoycKvzpYagsNU, component: SubgraphInstanceManager
Inserting entities entity_type: DelegatorRewardHistoryEntity, number_of_entities: 4683,  number_of_columns: 7, chunk_size: 8191
Inserting entities entity_type: DelegationPoolHistoryEntity, number_of_entities: 2,  number_of_columns: 7, chunk_size: 8191
Inserting entities entity_type: RewardCutHistoryEntity, number_of_entities: 2,  number_of_columns: 9, chunk_size: 6553
Updating entities: GraphAccount, number_of_entities: 2, number_of_entity_keys: 2,  number_of_columns: 23, chunk_size: 2730
Updating entities: DelegatedStake, number_of_entities: 4683, number_of_entity_keys: 4683,  number_of_columns: 18, chunk_size: 3449
Updating entities: Indexer, number_of_entities: 2, number_of_entity_keys: 2,  number_of_columns: 43, chunk_size: 1489
Updating entities: Allocation, number_of_entities: 2, number_of_entity_keys: 2,  number_of_columns: 37, chunk_size: 1724
Updating entities: Poi$, number_of_entities: 1, number_of_entity_keys: 1,  number_of_columns: 2, chunk_size: 21845
Updating entities: SubgraphDeployment, number_of_entities: 2, number_of_entity_keys: 2,  number_of_columns: 28, chunk_size: 2259
Updating entities: Epoch, number_of_entities: 1, number_of_entity_keys: 1,  number_of_columns: 13, chunk_size: 4681
Updating entities: Delegator, number_of_entities: 4405, number_of_entity_keys: 4405,  number_of_columns: 14, chunk_size: 4369
Jun 23 14:46:54.474 ERRO Subgraph failed with non-deterministic error: Error while processing block stream for a subgraph: store error: number of parameters must be between 0 and 65535	, retry_delay_s: 1800, attempt: 213, sgd: 1, subgraph_id: QmchHuvNLcLRGt4C3pLArRXtkECPPCaZqoycKvzpYagsNU, component: SubgraphInstanceManager
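For what it's worth, every logged chunk size lines up exactly with 65535 / (number_of_columns + 1), and a full chunk sits only a few parameters below the limit. A quick sanity check (a sketch; the formula is inferred from the logs above, not taken from the code):

// Each logged chunk size equals 65535 / (columns + 1).
fn chunk_size(columns: usize) -> usize {
    65535 / (columns + 1)
}

fn main() {
    assert_eq!(chunk_size(7), 8191);  // DelegatorRewardHistoryEntity
    assert_eq!(chunk_size(9), 6553);  // RewardCutHistoryEntity
    assert_eq!(chunk_size(18), 3449); // DelegatedStake
    assert_eq!(chunk_size(43), 1489); // Indexer
    // A full DelegatedStake chunk binds 3449 * (18 + 1) = 65531
    // parameters, only 4 short of the 65535 limit, so a handful of
    // extra bindings anywhere in a chunk is enough to overflow it.
    assert_eq!(3449 * (18 + 1), 65531);
}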

@tilacog
Contributor

tilacog commented Jun 23, 2022

It seems that impl QueryFragment for InsertQuery is counting the bindings correctly: one for each field plus one for the block range.

But InsertQuery delegates walk_ast to its columns via QueryValue, and I'm suspicious that this for loop in the ColumnType::TSVector branch might be producing extra bindings that we aren't accounting for when calculating the chunk size:

ColumnType::TSVector(config) => {
    if sql_values.is_empty() {
        out.push_sql("''::tsvector");
    } else {
        out.push_sql("(");
        for (i, value) in sql_values.iter().enumerate() {
            if i > 0 {
                out.push_sql(") || ");
            }
            out.push_sql("to_tsvector(");
            // first binding per term: the fulltext language
            out.push_bind_param::<Text, _>(
                &config.language.as_str().to_string(),
            )?;
            out.push_sql("::regconfig, ");
            // second binding per term: the value itself
            out.push_bind_param::<Text, _>(&value)?;
        }
        out.push_sql("))");
    }
    Ok(())
}

A single iteration would produce 2 bindings.

I don't know if @Ryabina-io's subgraph even uses that field type, but it might be worth trying to include those bindings in the chunk size estimate.
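A sketch of what accounting for those bindings could look like, using stand-in types rather than graph-node's actual ColumnType and walk_ast machinery (the numbers in main are hypothetical):

// Minimal stand-in for the column kinds involved (not graph-node's API).
#[derive(Clone)]
enum ColumnKind {
    // A fulltext column: the loop above renders it as
    // to_tsvector($lang::regconfig, $value) || ..., i.e. two bind
    // parameters per term.
    TsVector { num_terms: usize },
    // Any other column type binds exactly one parameter.
    Scalar,
}

fn bind_params(column: &ColumnKind) -> usize {
    match column {
        // An empty tsvector renders as ''::tsvector, with no bindings.
        ColumnKind::TsVector { num_terms: 0 } => 0,
        ColumnKind::TsVector { num_terms } => 2 * num_terms,
        ColumnKind::Scalar => 1,
    }
}

fn main() {
    // Hypothetical entity: six scalar columns plus one fulltext column
    // over three terms.
    let mut columns = vec![ColumnKind::Scalar; 6];
    columns.push(ColumnKind::TsVector { num_terms: 3 });
    // +1 for the block-range parameter, as in InsertQuery.
    let per_entity: usize = columns.iter().map(bind_params).sum::<usize>() + 1;
    assert_eq!(per_entity, 13); // vs. the 8 that columns.len() + 1 assumes
    println!("safe chunk size would be {}", 65535 / per_entity);
}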

@lutter
Collaborator

lutter commented Jun 23, 2022

> It seems that impl QueryFragment for InsertQuery is counting the bindings correctly: one for each field plus one for the block range.
>
> But InsertQuery delegates walk_ast to its columns via QueryValue, and I'm suspicious that this for loop in the ColumnType::TSVector branch might be producing extra bindings that we aren't accounting for when calculating the chunk size:
>
> ...
>
> A single iteration would produce 2 bindings.
>
> I don't know if @Ryabina-io's subgraph even uses that field type, but it might be worth trying to include those bindings in the chunk size estimate.

Nice find. Yes, that would definitely do it, since it uses 2 bindings for each term in the TSVector. That would affect only subgraphs that use fulltext search.

@github-actions

This pull request hasn't had any activity for the last 90 days. If there's no more activity over the course of the next 14 days, it will automatically be closed.

github-actions bot added the Stale label on Nov 17, 2022
github-actions bot closed this on Dec 1, 2022