Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(s3): restore working test for DynamoDb log store repair log on read #2120

Merged
merged 13 commits into from
Jan 30, 2024

Conversation

dispanser
Copy link
Contributor

Description

Make sure the read path for delta table commit entries passes through the log store, enabling it to ensure the invariants and potentially repair a broken commit in the context of S3 / DynamoDb log store implementation.

This also adds another test in the context of S3 log store: repairing a log store on load was not implemented previously.

Note that this a stopgap and not a complete solution: it comes with a performance penalty as we're triggering a redundant object store list operation just for the purpose of "triggering" the log store functionality.

fixes #2109

@github-actions github-actions bot added binding/rust Issues for the Rust crate crate/core labels Jan 26, 2024
@ion-elgreco
Copy link
Collaborator

@dispanser how about the python S3 round trip test?

@dispanser
Copy link
Contributor Author

@dispanser how about the python S3 round trip test?

This test isn't related to the changes in this PR. @roeap , is this some known regression related to your changes?

@dispanser
Copy link
Contributor Author

For this test, test_roundtrip_s3_env, there's no dynamodb-based locking configured, so it falls back to the DefaultLogStore implementation. Here's what happens:

first write_deltalake(..) called by the test:

  1. commit_with_retries prepares a temporary commit entry (json file)
  2. S3StorageBackend::rename_if_not_exists is called to rename the commit into its final destination, 0.json
  3. since self.allow_unsafe_rename is not set, it throws an error as there's no safe way to atomically rename

second write_deltalake(..) called by the test, after config change to inject setenv("AWS_S3_ALLOW_UNSAFE_RENAME", "true"):

  1. verify that the table doesn't yet exist. This fails, because apparently our condition to see if the table already exists is to check that table_root is not empty!

The underlying problem is that we detect the attempt to write to a table that can't be written to in a very late stage of writing. As a result, there's already some files (both parquet files and a temporary commit entry) present, which makes the next incarnation of delta lake think that the table already exists.

Both commit_with_retries and DefaultLogStore are entirely unaware that the requested write won't work, so the json commit entries are there, and in any case, parquet files are also present. I wonder how that test ever worked, because even if detecting the problem earlier, at least the parquet files will already be present. Also, there's not really any harm done - some files are present, but since there's no proper commit entry, everything is ignored.

Potential fixes:

  1. a better way to test if a table already exists
  2. have some method LogStore::can_write() or similar that signals if a write is even possible. In a different universe, we might have separate traits for writeable and non-writeable tables and constructing a writeable table would fail long before a single byte enters any parquet file
  3. ignore the test, or make it less picky :-).

I tend to think that it's valuable to avoid accidental overwriting of an existing table, and I do think our existing method to detect the presence of an existing table is too limiting - so in my opinion, option 1 seems most sensible to me.

Opionions? @ion-elgreco @rtyler @roeap ?

@roeap
Copy link
Collaborator

roeap commented Jan 26, 2024

This test isn't related to the changes in this PR. @roeap , is this some known regression related to your changes?

Yes, that failure was introduced, when we moved to the arrow backend.

Also thanks for digging into it and figuring out why the round-trip test fails. I agree with your assessment, that a better way to see if we have a table is the way to go. The current approach is to check if there is a _delta_log folder (albeit in a very inefficient way via a list due to some quirks we had ion windows machines that prevented a simple head). This check would indeed consider a location with a failed commit 0 as an existing table location. IIRC, sprak also just checks if _delta_log exists.

Maybe we could just try to create a new LogSegment and see if that is non-empty, as it already includes checks for valid files?

roeap
roeap previously approved these changes Jan 26, 2024
Copy link
Collaborator

@roeap roeap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this @dispanser - this brings us much closer to a releasable state again 👍.

) -> DeltaResult<Self> {
debug!(
"try_new_slice: start_version: {}, end_version: {:?}",
start_version, end_version
);
let max_version_2 = log_store.get_latest_version(start_version).await?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add a TODO here o.a. saying why this statement exists and the we want to get rid of it?

@roeap
Copy link
Collaborator

roeap commented Jan 26, 2024

Note that this a stopgap and not a complete solution: it comes with a performance penalty as we're triggering a redundant object store list operation just for the purpose of "triggering" the log store functionality.

For most of our users this should not be too bad, but for some tables this can be very expensive. Particularly if you have streaming scenarios we sometimes see A LOT of versions ...

Ultimately I hope that we can converge on some consolidated storage trait, but I hope that we can find out wht the write abstractions will look like in a post-kernel world.

@roeap
Copy link
Collaborator

roeap commented Jan 26, 2024

@dispanser, when looking into this, I though that updating this line

async move { store.get(&meta.location).await?.bytes().await }

to use LogStore would resolve the issue. Would that have worked as well and would that avoid the extra list?

roeap added a commit that referenced this pull request Jan 26, 2024
# Description

Right now we have some
[issue](#2120 (comment)
how we identify if a location is a delta-table. This disables an
affected test so we can merge PRs again without having to ignore
requires CI runs.
@dispanser
Copy link
Contributor Author

@dispanser, when looking into this, I though that updating this line

async move { store.get(&meta.location).await?.bytes().await }

to use LogStore would resolve the issue. Would that have worked as well and would that avoid the extra list?

Yeah, that would also work. LogStore::read_commit_entry(..) would touch DynamoDb and then retrieve the bytes of the commit using the underlying object store. However, this also means that for each individual version we retrieve, DynamoDb interaction is necessary. This is wasteful and not useful, especially if we are going to read multiple versions consecutively.

To avoid this repeated interaction with DynamoDb, it would be good to replace read_commit_entry(version) with something that accepts a version range instead. This is sort of what LogSegment::try_new_slice(..) does but at a lower level. Or maybe a log-aware replacement for list_log_files(..), that should be all we need.

@rtyler rtyler enabled auto-merge (rebase) January 28, 2024 04:19
@roeap roeap disabled auto-merge January 28, 2024 04:31
@roeap
Copy link
Collaborator

roeap commented Jan 28, 2024

@dispanser and i discussed we'll wait and explore an alternative before merging this one ...

Copy link
Member

@rtyler rtyler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I interpret @roeap 's last comment to mean that this is no longer approved. Please dismiss this review when it's mergeable so nobody hits the tempting green Merge button 😄

@rtyler rtyler force-pushed the s3-dynamodb-fix-repair-on-update branch from 5f18623 to 73980cf Compare January 29, 2024 21:16
@rtyler rtyler merged commit 467afc5 into delta-io:main Jan 30, 2024
20 checks passed
rtyler added a commit to dispanser/delta-rs that referenced this pull request Jan 31, 2024
…ad (delta-io#2120)

# Description

Make sure the read path for delta table commit entries passes through
the log store, enabling it to ensure the invariants and potentially
repair a broken commit in the context of S3 / DynamoDb log store
implementation.

This also adds another test in the context of S3 log store: repairing a
log store on load was not implemented previously.
  
Note that this a stopgap and not a complete solution: it comes with a
performance penalty as we're triggering a redundant object store list
operation just for the purpose of "triggering" the log store
functionality.


fixes delta-io#2109

---------

Co-authored-by: Ion Koutsouris <15728914+ion-elgreco@users.noreply.github.com>
Co-authored-by: R. Tyler Croy <rtyler@brokenco.de>
RobinLin666 pushed a commit to RobinLin666/delta-rs that referenced this pull request Feb 2, 2024
# Description

Right now we have some
[issue](delta-io#2120 (comment)
how we identify if a location is a delta-table. This disables an
affected test so we can merge PRs again without having to ignore
requires CI runs.
RobinLin666 pushed a commit to RobinLin666/delta-rs that referenced this pull request Feb 2, 2024
…ad (delta-io#2120)

# Description

Make sure the read path for delta table commit entries passes through
the log store, enabling it to ensure the invariants and potentially
repair a broken commit in the context of S3 / DynamoDb log store
implementation.

This also adds another test in the context of S3 log store: repairing a
log store on load was not implemented previously.
  
Note that this a stopgap and not a complete solution: it comes with a
performance penalty as we're triggering a redundant object store list
operation just for the purpose of "triggering" the log store
functionality.


fixes delta-io#2109

---------

Co-authored-by: Ion Koutsouris <15728914+ion-elgreco@users.noreply.github.com>
Co-authored-by: R. Tyler Croy <rtyler@brokenco.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binding/rust Issues for the Rust crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

test_repair_on_update broken in main
4 participants