Tar reader stuck with certain files and chunk sizes #2698

Merged 2 commits on Jul 2, 2021

johanandren
Member

I cannot share the tar files that reproduce the problem (they come from a customer), and I did not figure out how to create a reproducing file that is safe to share. When reading certain tar files, the stage would get stuck waiting for more data to be pushed, without having pulled upstream to actually request the next chunk.

References #2697

@johanandren force-pushed the wip-2697-tar-reader-stuck-johanandren branch from a5471cc to 62a9d9b on July 2, 2021 07:28
@@ -37,12 +36,15 @@ private[file] class TarReaderStage
       extends SubSourceOutlet[ByteString]("fileOut")
       with TarReaderStage.SourceWithTimeout

-    readHeader(ByteString.empty)
+    setHandlers(flowIn, flowOut, new CollectHeader(ByteString.empty))
Member Author

We can't pull in the constructor, only set handlers. So instead of calling readHeader, we set the CollectHeader handlers here directly; we know the buffer is empty at this point anyway.


   def readHeader(buffer: ByteString): Unit = {
     if (buffer.length >= TarArchiveEntry.headerLength) {
       readFile(buffer)
-    } else setHandlers(flowIn, flowOut, new CollectHeader(buffer))
+    } else {
+      if (!hasBeenPulled(flowIn)) pull(flowIn)
Member Author

In the problematic file and chunk-size scenario, we would get here without flowIn having been pulled and then switch to CollectHeader, which waits for a next chunk that was never requested.
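
To illustrate the stall described above, here is a minimal toy model in plain Scala. It is only a sketch: `DemandSketch`, `Upstream`, `HeaderCollector`, `pullWhenShort` and the 8-character `HeaderLength` are hypothetical names and values made up for this example, not Akka or Alpakka APIs, and headers are assumed to follow each other back to back with no file content in between. The point it tries to show is that a stage which parks itself waiting for a push without an outstanding pull never receives another chunk.

```scala
import scala.collection.mutable

// Toy model of the demand handshake; none of these types are Akka or Alpakka APIs.
object DemandSketch {
  // Hypothetical header size to keep the example small (a real tar header is 512 bytes).
  val HeaderLength = 8

  // Upstream hands over a chunk only after it has been pulled.
  final class Upstream(chunks: Seq[String]) {
    private val pending = mutable.Queue.from(chunks)
    private var demand = false
    def hasBeenPulled: Boolean = demand
    def pull(): Unit = { demand = true }
    def hasNext: Boolean = pending.nonEmpty
    def next(): String = { demand = false; pending.dequeue() }
  }

  final class HeaderCollector(upstream: Upstream, pullWhenShort: Boolean) {
    val headers = mutable.ListBuffer.empty[String]
    private var buffer = ""

    // Same shape as readHeader in the diff: parse a header if enough data is
    // buffered, otherwise fall back to collecting more data.
    def readHeader(leftover: String): Unit =
      if (leftover.length >= HeaderLength) {
        headers += leftover.take(HeaderLength)
        readHeader(leftover.drop(HeaderLength))
      } else {
        // The fix: make sure demand is signalled before waiting for a push.
        if (pullWhenShort && !upstream.hasBeenPulled) upstream.pull()
        buffer = leftover
      }

    // Collecting handler: append the chunk and ask for more if still too short.
    def onPush(chunk: String): Unit = {
      val combined = buffer + chunk
      if (combined.length >= HeaderLength) readHeader(combined)
      else { buffer = combined; upstream.pull() }
    }
  }

  // Delivers chunks for as long as the stage has signalled demand.
  def run(chunks: Seq[String], pullWhenShort: Boolean): List[String] = {
    val upstream = new Upstream(chunks)
    val stage = new HeaderCollector(upstream, pullWhenShort)
    upstream.pull() // initial demand, issued once the stage is running
    while (upstream.hasBeenPulled && upstream.hasNext) stage.onPush(upstream.next())
    stage.headers.toList
  }

  def main(args: Array[String]): Unit = {
    val chunks = Seq("AAAAAAAABB", "BBBBBBCC", "CCCCCC")
    println(run(chunks, pullWhenShort = false)) // List(AAAAAAAA): stalls after the first chunk
    println(run(chunks, pullWhenShort = true))  // List(AAAAAAAA, BBBBBBBB, CCCCCCCC)
  }
}
```

In this model the stall only shows up when a chunk ends partway into the next header, which is consistent with the bug depending on particular files and chunk sizes.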

Member

@ennru left a comment

LGTM.

@johanandren merged commit 7bf4a3a into master on Jul 2, 2021
@johanandren deleted the wip-2697-tar-reader-stuck-johanandren branch on July 2, 2021 09:46