feat: Add shouldReturnRawInputStream option to Get requests #872

Merged: JesseLovelace merged 2 commits into master from dcflag on Jun 14, 2021

JesseLovelace (Contributor) commented:

Adds support for shouldReturnRawInputStream, which will allow users to specify whether the client should auto-decompress gzipped content or not.

Note: read(StorageObject, Map<Option, ?>, long, OutputStream) and read(StorageObject, Map<Option, ?>, long, int) in HttpStorageRpc now have different defaults to preserve their original behavior. Previously, returnRawInputStream was always set to true, but read(StorageObject, Map<Option, ?>, long, OutputStream) had a bug that caused the setting to be lost, so it was effectively always false (this is why blob.downloadTo and BlobReadChannel.read behaved differently). That bug is fixed here, and the two methods keep their previous effective defaults. Users can use the new flag to override either default.
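
The default handling described in the note can be sketched roughly as follows. This is an illustration only, not the code in HttpStorageRpc: the class RawStreamDefaults, the method resolve, and both constants are hypothetical names, and the default values are assumptions taken from the description above (false for the OutputStream overload used by blob.downloadTo, true for the overload used by BlobReadChannel.read).

```java
public final class RawStreamDefaults {
    // Assumed default for the OutputStream overload (blob.downloadTo path),
    // matching its previous effective behavior as described in the note.
    public static final boolean DOWNLOAD_TO_DEFAULT = false;

    // Assumed default for the (long, int) overload (BlobReadChannel.read path).
    public static final boolean READ_CHANNEL_DEFAULT = true;

    private RawStreamDefaults() {}

    // Returns the user's choice when one was supplied, otherwise the
    // per-method default. A null userValue means "option not set".
    public static boolean resolve(Boolean userValue, boolean methodDefault) {
        return userValue != null ? userValue : methodDefault;
    }

    public static void main(String[] args) {
        // Option not set: each path falls back to its own default.
        System.out.println(resolve(null, DOWNLOAD_TO_DEFAULT));
        System.out.println(resolve(null, READ_CHANNEL_DEFAULT));
        // Option set explicitly: the user's value wins on either path.
        System.out.println(resolve(Boolean.TRUE, DOWNLOAD_TO_DEFAULT));
        System.out.println(resolve(Boolean.FALSE, READ_CHANNEL_DEFAULT));
    }
}
```

Keeping the fallback in one helper makes the asymmetry between the two read paths explicit instead of leaving it spread across two if/else blocks.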

Fixes #321

@JesseLovelace JesseLovelace requested review from frankyn and a team June 9, 2021 18:45
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Jun 9, 2021
@product-auto-label product-auto-label bot added the api: storage Issues related to the googleapis/java-storage API. label Jun 9, 2021
if (shouldReturnRawInputStream != null) {
  req.setReturnRawInputStream(shouldReturnRawInputStream);
} else {
  req.setReturnRawInputStream(false);
Review comment (Member):

The default value is different here, so I'm wondering how this will be reconciled.


snippet-bot bot commented Jun 11, 2021

No region tags are edited in this PR.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.

@JesseLovelace JesseLovelace merged commit 474dfae into master Jun 14, 2021
@JesseLovelace JesseLovelace deleted the dcflag branch June 14, 2021 18:41

oripwk commented Jun 15, 2021

@JesseLovelace Does it work also with getContent() or only with downloadTo()?

JesseLovelace (Contributor, Author) commented:

@oripwk It should work with getContent. Use getContent(Blob.BlobSourceOption.shouldReturnRawInputStream(true)).


oripwk commented Jun 16, 2021

@JesseLovelace when is it going to be released?


oripwk commented Jun 16, 2021

Good, I see it's on 1.116.0


marioaae commented Oct 20, 2021

Hi there, thanks for fixing this. Unfortunately, I haven't been able to make it work on my end. In Scala, I'm trying to read files through a GZIPInputStream as follows:

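import java.io.InputStream
import java.nio.channels.Channels
import com.google.cloud.ReadChannel
import com.google.cloud.storage.{Blob, BlobId, Storage, StorageOptions}

val bucketName = "my-bucket-name"
val filePath = "path/to/file.gz"
val storage: Storage = StorageOptions.getDefaultInstance.getService

// Fetch the blob, passing the raw-stream option to the get call
val blob: Blob = storage.get(BlobId.of(bucketName, filePath), Storage.BlobGetOption.shouldReturnRawInputStream(true))
val reader: ReadChannel = blob.reader()
val inputStream: InputStream = Channels.newInputStream(reader)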

Afterwards, I wrap it in a GZIPInputStream to process it, using new GZIPInputStream(inputStream), but I get the following error:

java.util.zip.ZipException: Not in GZIP format

Still, if I use only the plain InputStream, it works, so it seems to me that the file is being downloaded decompressed. I thought shouldReturnRawInputStream(true) would download my file compressed, in the original .gz format.

When I download the file from a different source (e.g. an SFTP server) where I know for sure no decompression is going on, wrapping it up in GZIPInputStream works properly.

Has anybody had the same problem?

Thanks!
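
One way to investigate or work around the "Not in GZIP format" error above is to inspect the first two bytes of the stream before wrapping it. The following is a minimal sketch using only the JDK, not part of google-cloud-storage: the class GzipSniffer and its methods are hypothetical names, and it relies on the gzip format's two-byte header (0x1f, 0x8b). It wraps the stream in a GZIPInputStream only when that header is present and otherwise returns the bytes unchanged.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public final class GzipSniffer {
    private GzipSniffer() {}

    // Peeks at the first two bytes and wraps the stream in a GZIPInputStream
    // only if they match the gzip magic number (0x1f, 0x8b).
    public static InputStream maybeDecompress(InputStream in) throws IOException {
        PushbackInputStream pb = new PushbackInputStream(in, 2);
        byte[] header = new byte[2];
        int n = 0;
        while (n < 2) {
            int r = pb.read(header, n, 2 - n);
            if (r < 0) {
                break; // stream ended before two bytes were available
            }
            n += r;
        }
        if (n > 0) {
            pb.unread(header, 0, n); // put the peeked bytes back
        }
        boolean gzipped = n == 2
            && (header[0] & 0xff) == 0x1f
            && (header[1] & 0xff) == 0x8b;
        return gzipped ? new GZIPInputStream(pb) : pb;
    }

    // Helper for the demo: gzip-compresses a byte array in memory.
    public static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(data);
        }
        return bos.toByteArray();
    }

    // Helper for the demo: reads a stream fully as UTF-8 text.
    public static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int r;
        while ((r = in.read(buf)) != -1) {
            bos.write(buf, 0, r);
        }
        return new String(bos.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] plain = "hello".getBytes(StandardCharsets.UTF_8);
        // Already-decompressed input passes through unchanged.
        System.out.println(readAll(maybeDecompress(new ByteArrayInputStream(plain))));
        // Gzipped input is detected and decompressed.
        System.out.println(readAll(maybeDecompress(new ByteArrayInputStream(gzip(plain)))));
    }
}
```

In the scenario above, the stream obtained from the ReadChannel could be passed to maybeDecompress instead of directly to new GZIPInputStream. This only tolerates either outcome; it does not explain why the content arrives decompressed.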

frankyn (Member) commented Oct 20, 2021

Hi @marioaae,

Could you clarify the state of your object metadata? In certain cases GCS will decompress an object on response when contentEncoding: gzip is set in object metadata.

marioaae commented:

Hi @frankyn,
I can't control the metadata of that object, but in any case, when I check it in the Google Console (UI), I can see that Content-Encoding is blank.

Successfully merging this pull request may close these issues.

Support reading gzip files as-is with google-cloud-storage?