Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore nested delta directories for Hive ACID #20840

Merged
merged 1 commit into from
Mar 6, 2024

Conversation

pajaks
Copy link
Member

@pajaks pajaks commented Feb 26, 2024

Description

Fixes: #16315
Related: #8920

Multiple Hive minor compaction creates duplicated directoires like:

./delta_0000001_0000001/bucket_00000
./delta_0000001_0000001/delta_0000001_0000001
./delta_0000001_0000001/delta_0000001_0000001/bucket_00000

Hive is able to read such table without problem, ignoring nested delta directories. This change follows Hive behaviour.

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Feb 26, 2024
@github-actions github-actions bot added tests:hive hive Hive connector labels Feb 26, 2024
@pajaks pajaks marked this pull request as ready for review February 26, 2024 11:52
@pajaks pajaks self-assigned this Feb 26, 2024
@findepi
Copy link
Member

findepi commented Feb 26, 2024

Fixes: #8920, #16315

you need to put "fixes" before every issue reference you want to close when PR merged

@findinpath findinpath requested a review from electrum February 26, 2024 14:13
@pajaks pajaks force-pushed the pajaks/skip_nested_dir_hive_acid branch 3 times, most recently from 8b21aba to e0271c8 Compare February 26, 2024 15:11
@pajaks pajaks force-pushed the pajaks/skip_nested_dir_hive_acid branch from e0271c8 to c1ee65a Compare February 28, 2024 10:10
Copy link
Member

@electrum electrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change seems good, but needs a test in TestHiveAcidUtils.

This doesn't seem to fix #8920 as that is a different error message and comes from BackgroundHiveSplitLoader. Do you want to fix that as well?

Copy link
Member

@electrum electrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a test in TestHiveAcidUtils

@pajaks pajaks force-pushed the pajaks/skip_nested_dir_hive_acid branch from c1ee65a to b171307 Compare March 1, 2024 08:52
Implementation follows Hive behaviour
@pajaks pajaks force-pushed the pajaks/skip_nested_dir_hive_acid branch from b171307 to e819407 Compare March 1, 2024 08:55
@pajaks
Copy link
Member Author

pajaks commented Mar 1, 2024

Thank you for review @electrum

This change seems good, but needs a test in TestHiveAcidUtils.

Added.

This doesn't seem to fix #8920 as that is a different error message and comes from BackgroundHiveSplitLoader. Do you want to fix that as well?

Error message and origin for this ACID error changed since issue was created. I guess it was changed with converting AcidUtils to Trino with 540f53e commit.

@pajaks pajaks requested review from findinpath and electrum March 1, 2024 09:04
@electrum electrum merged commit 99d7cb7 into trinodb:master Mar 6, 2024
57 checks passed
@github-actions github-actions bot added this to the 440 milestone Mar 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed hive Hive connector
Development

Successfully merging this pull request may close these issues.

ACID subdirectory corruption in product test
4 participants