Verify the file presence for cached directory lister and retry #20414

Merged
2 commits merged on Feb 22, 2024

Conversation

i-93
Member

@i-93 i-93 commented Jan 18, 2024

Description

These changes address the problem where new files referenced in a manifest are not immediately visible to the directoryLister because of the caching delay (see #20344).

The buildManifestFileIterator() method now checks whether every file referenced in the manifest appears in the directory listing. If any file is missing, it invalidates the cache and reloads the listing.
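The check-invalidate-reload flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `StaleCachingLister`, `ManifestResolver`, and the string-based paths are hypothetical stand-ins for the real directory lister and `Location` types.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a caching directory lister; not Trino's actual interface.
class StaleCachingLister {
    private final Map<String, Set<String>> storage;              // simulated file system contents
    private final Map<String, Set<String>> cache = new HashMap<>();

    StaleCachingLister(Map<String, Set<String>> storage) {
        this.storage = storage;
    }

    Set<String> list(String directory) {
        // Serve from the cache when possible; otherwise take a snapshot of the storage.
        return cache.computeIfAbsent(directory,
                dir -> new HashSet<>(storage.getOrDefault(dir, Set.of())));
    }

    void invalidate(String directory) {
        cache.remove(directory);
    }
}

class ManifestResolver {
    // Returns the listing for a directory, reloading it once if any manifest path is missing.
    static Set<String> resolve(StaleCachingLister lister, String directory, List<String> manifestPaths) {
        Set<String> listed = lister.list(directory);
        boolean missing = manifestPaths.stream().anyMatch(path -> !listed.contains(path));
        if (missing) {
            // The listing may be stale: drop the cached entry and list again.
            lister.invalidate(directory);
            return lister.list(directory);
        }
        return listed;
    }
}
```

The retry happens at most once, so a file that is genuinely absent from storage still surfaces as missing rather than causing a loop.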

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
(*) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

cla-bot bot commented Jan 18, 2024

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

@i-93 i-93 marked this pull request as draft January 18, 2024 17:50
@github-actions github-actions bot added tests:hive hive Hive connector labels Jan 18, 2024
@i-93 i-93 force-pushed the cache-verification branch from 3b5f2e9 to abfa031 Compare January 19, 2024 13:07
@i-93 i-93 marked this pull request as ready for review January 19, 2024 13:31
Member

mosabua commented Jan 22, 2024

@i-93 could you submit a signed CLA please.

Member

mosabua commented Jan 22, 2024

@raunaqmorarka @electrum .. any idea who could help with review here?

Stream<TrinoFileStatus> fileStream = paths.stream()

// If file statuses came from cache verify that all are present
if (isCached) {
Contributor

It doesn't feel right to me to ask each time whether the location is cached.
You are adding handling for a corner case in the happy flow this way.

Maybe it would be better to add a procedure to clear the directory listing caching for a specified location.

Member Author

The code under this 'if' verifies that all the listed files are present in the directory listing. We know that a discrepancy could be caused by a stale cache, and in that case there is a way to handle it (invalidate the cache and retry). There is no point in doing so if the location is not cached: invalidation is a no-op and a retry would return the same results.

I did add an invalidate(Location) call to the directory lister, so the conditional code would work in any case. It is just a performance optimization: it avoids verifying and retrying when those steps are not going to change anything anyway.

@i-93 i-93 requested a review from findinpath January 25, 2024 15:46
Member Author

i-93 commented Jan 25, 2024

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

@i-93 i-93 closed this Jan 25, 2024
@i-93 i-93 reopened this Jan 25, 2024

// If file statuses came from cache verify that all are present
if (isCached) {
boolean missing = paths.stream()
Member

Rather than fully reloading the whole cache it'd be nice if we could just check any missing paths directly.

Member Author

Yes, that would be great. Unfortunately, the directory lister can't list files individually; it lists by folder and caches the same way. We are invalidating the cache for a parent folder (not the whole cache!), which causes its content to be reloaded.
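To illustrate the point about folder-level granularity, here is a small sketch of a cache keyed by directory, where invalidating one folder leaves the other entries untouched. `PerDirectoryCache` and its load counter are hypothetical and exist only for this example; the real caching listers are not shown in this conversation.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical per-directory listing cache; illustrative only.
class PerDirectoryCache {
    private final Function<String, List<String>> loader;        // lists a whole directory at once
    private final Map<String, List<String>> entries = new HashMap<>();
    private final Map<String, Integer> loads = new HashMap<>(); // how often each directory was loaded

    PerDirectoryCache(Function<String, List<String>> loader) {
        this.loader = loader;
    }

    List<String> list(String directory) {
        return entries.computeIfAbsent(directory, dir -> {
            loads.merge(dir, 1, Integer::sum);
            return loader.apply(dir);
        });
    }

    // Drops only the entry for the given directory; other directories stay cached.
    void invalidate(String directory) {
        entries.remove(directory);
    }

    int loadCount(String directory) {
        return loads.getOrDefault(directory, 0);
    }
}
```

Because the unit of caching is the directory, a single missing file can only be refreshed by reloading the listing of its parent folder.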

Member Author

i-93 commented Jan 26, 2024

@i-93 could you submit a signed CLA please.

@mosabua I submitted it 3 weeks ago. No response...

@i-93 i-93 force-pushed the cache-verification branch from 6e6aae8 to 6ee23db Compare January 27, 2024 12:20
@i-93 i-93 force-pushed the cache-verification branch from 6ee23db to ebfa775 Compare January 27, 2024 14:53
@cla-bot cla-bot bot added the cla-signed label Jan 27, 2024
@i-93 i-93 force-pushed the cache-verification branch from ebfa775 to 2668be9 Compare January 27, 2024 15:04
@i-93 i-93 requested a review from alexjo2144 January 27, 2024 16:14
Member Author

i-93 commented Jan 28, 2024

@findinpath, @alexjo2144, @electrum Does anybody have any concerns about merging this? It has been tested in our preprod environment, addressed #20344, and didn't show any side effects.

@i-93 i-93 assigned i-93 and unassigned i-93 Jan 28, 2024
@i-93 i-93 force-pushed the cache-verification branch 2 times, most recently from 9392592 to 954817f Compare February 6, 2024 16:57
Member

@alexjo2144 alexjo2144 left a comment

Sorry for the delay on this. Generally this seems fine to me, one small nitpick on the interface changes.

Besides that, have you tried adding a test? There are some existing ones in the product test suite that use symlink tables.

.anyMatch(path -> !fileStatuses.containsKey(path.path()));
// Invalidate the cache and reload
if (missing) {
directoryLister.invalidate(location);
Member

@alexjo2144 alexjo2144 Feb 6, 2024

I think the general approach here is fine, I would suggest changing the interfaces in a slightly different way though.

Exposing isCached via TableInvalidationCallback seems fine to me.

What I'd do differently: rather than exposing invalidate(Location), can we try adding an additional parameter to the HiveFileIterator constructor, something like boolean invalidateCaches? That would get passed through DirectoryLister#listFilesRecursively to force a hard load when set to true.

My reason is that you're assuming here that the cache key is a Location, but that's an internal detail of the caching DirectoryListers that's not guaranteed to be stable.
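A rough sketch of the alternative suggested in this comment, where the caller passes a flag instead of invalidating by key. `FlagAwareLister` and `listFiles` are hypothetical names for illustration; only the idea of a boolean `invalidateCaches` parameter comes from the comment, and the real `DirectoryLister#listFilesRecursively` signature is not reproduced here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical lister, for illustration only; the real DirectoryLister signatures differ.
class FlagAwareLister {
    private final Function<String, List<String>> loader;
    private final Map<String, List<String>> cache = new HashMap<>();
    private int loadCount;

    FlagAwareLister(Function<String, List<String>> loader) {
        this.loader = loader;
    }

    // When invalidateCaches is true, the cached entry is dropped first, forcing a fresh load.
    List<String> listFiles(String directory, boolean invalidateCaches) {
        if (invalidateCaches) {
            cache.remove(directory);
        }
        return cache.computeIfAbsent(directory, dir -> {
            loadCount++;
            return loader.apply(dir);
        });
    }

    int loadCount() {
        return loadCount;
    }
}
```

With this shape the caller never needs to know how the cache is keyed; it only states that it wants a fresh listing.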

Member Author

Hi @alexjo2144.

I looked into your proposal to remove invalidate(Location) from the interface. That doesn't look right to me. Directory listers are chained using a delegate model; some of them cache (by Location) and some do not (they have a no-op invalidate). If we remove invalidate from the interface, we won't be able to push it down the chain.

We only call invalidate(location) if isCached(location) is true, which in a way verifies that the particular directory lister supports caching by location.
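The delegate chain described here might look roughly like the sketch below. The interface and class names (`ChainedLister`, `NonCachingLister`, `CachingLister`, `DelegatingLister`) are assumptions made for this example; only the `isCached` and `invalidate` method names come from the discussion, and locations are simplified to strings.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical interface for illustration; method names follow the discussion above.
interface ChainedLister {
    boolean isCached(String location);

    void invalidate(String location);
}

// A lister without a cache: never reports a cached location, and invalidation does nothing.
class NonCachingLister implements ChainedLister {
    public boolean isCached(String location) {
        return false;
    }

    public void invalidate(String location) {
        // no-op
    }
}

// A lister that caches by location (the cached listings themselves are omitted here).
class CachingLister implements ChainedLister {
    final Set<String> cachedLocations = new HashSet<>();

    public boolean isCached(String location) {
        return cachedLocations.contains(location);
    }

    public void invalidate(String location) {
        cachedLocations.remove(location);
    }
}

// A wrapper that forwards both calls to whatever lister sits below it in the chain.
class DelegatingLister implements ChainedLister {
    private final ChainedLister delegate;

    DelegatingLister(ChainedLister delegate) {
        this.delegate = delegate;
    }

    public boolean isCached(String location) {
        return delegate.isCached(location);
    }

    public void invalidate(String location) {
        delegate.invalidate(location);
    }
}
```

Keeping both methods on the shared interface is what lets a call at the top of the chain reach a caching lister several wrappers down.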

What do you think?

@i-93 i-93 force-pushed the cache-verification branch 3 times, most recently from 0355a13 to 74ff57f Compare February 9, 2024 19:52
Member Author

i-93 commented Feb 9, 2024

Sorry for the delay on this. Generally this seems fine to me, one small nitpick on the interface changes.

Besides that, have you tried adding a test? There are some existing ones in the product test suite that use symlink tables.

@alexjo2144 I have added a test case that fails on the stock code but passes on the modified code.

@i-93 i-93 requested a review from alexjo2144 February 9, 2024 20:01
Member Author

i-93 commented Feb 16, 2024

@alexjo2144, @findinpath, @electrum, @raunaqmorarka
Please let me know what I should do to have it merged.
Thanks!

@i-93 i-93 force-pushed the cache-verification branch from 74ff57f to d87cf74 Compare February 21, 2024 14:36
@findepi findepi requested a review from sopel39 February 21, 2024 18:48
@electrum electrum merged commit 0324da7 into trinodb:master Feb 22, 2024
57 checks passed
@github-actions github-actions bot added this to the 440 milestone Feb 22, 2024
Labels
cla-signed hive Hive connector
5 participants