Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Table change check #80

Merged
merged 11 commits into from
Jun 12, 2024
Merged

Table change check #80

merged 11 commits into from
Jun 12, 2024

Conversation

twuebi
Copy link
Contributor

@twuebi twuebi commented Jun 11, 2024

#58

@twuebi twuebi changed the title wip: change check Table change check Jun 11, 2024
@twuebi twuebi requested a review from c-thiel June 11, 2024 13:50
.v1_state
.contract_verifiers
.check(&update, *table_uuid, &response.previous_table_metadata)
.await?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be quite a few updates here - I would prefer concurrent execution and try_join

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't quite fit here, but I am also a bit hesitant about awaiting the emit_change_event block.
We currently implement a "send and forget" approach. But right now, if the event queue is offline, we are blocking the client response until the event queue gets a timeout.
What's your feeling about emitting events non-blocking?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we expect this endpoint to be called a lot? How many changes do we expect per call?

We could separate it out via a channel and have some background task shoveling it into nats.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be quite a few updates here - I would prefer concurrent execution and try_join

made it futures + try_join_all

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we expect this endpoint to be called a lot? How many changes do we expect per call?

We could separate it out via a channel and have some background task shoveling it into nats.

#86

@@ -819,6 +842,19 @@ impl<C: Catalog, A: AuthZHandler, S: SecretStore>
));
}

for ((update, (_, table_uuid)), response) in updates
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should push the whole block up to right after C::commit_table_transaction.
There is no need to load storage secret if the contract check fails.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -929,3 +965,65 @@ fn maybe_body_to_json(request: impl Serialize) -> serde_json::Value {
serde_json::Value::Null
}
}

// TableUpdate is not clone but all containing fields are so we use this function to clone..
fn copy_updates(updates: &[TableUpdate]) -> Vec<TableUpdate> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we maybe rename to clone_updates?

apache/iceberg-rust#402

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

async fn check(
&self,
table_updates: &[TableUpdate],
table_ident_uuid: TableIdentUuid,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we send the current_metadata, we might want to skip the table_ident_uuid.
The table-uuid is already part of TableMetadata and might change during updates.
If we specify it here additionally, we should rename it to current_table_ident_uuid to make sure it is the UUID before updates.
I think removing it is fine though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, missed it in table-metadata, removed!

Ok(ContractVerificationOutcome::Clear {}) => {}
Ok(block_result @ ContractVerificationOutcome::Violation { error_model: _ }) => {
tracing::info!(
"Checker {} blocked change on table '{}'",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets reflect the trait in the log - maybe ContractValidator {} blocked change on table ..
Also for other logs here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@c-thiel c-thiel merged commit 65a5f9f into main Jun 12, 2024
6 checks passed
@c-thiel c-thiel deleted the tp/change-check-wip branch June 12, 2024 08:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants