CONTENT_DOWNLOAD_MISMATCH with successful file download #709
Comments
Validation error because of crc32c check failure. Line 1372 in 61eeb64
The same error occurs if validation is set to md5:
const file = storage.bucket("Bucket Name").file("file path on google storage");
file.download({
  destination: "downloaded.jpg",
  validation: 'md5'
}).then(res => {
  console.log("DL OK");
}).catch(err => {
  console.error(err);
});
So if the gzip response header is not found in the download response, we should skip the crc32c check on the uncompressed data. |
@taiyokato I have reproduced this when using already-compressed content, specifically a PNG. In your example, you use a JPEG, which is also already compressed. The Storage docs advise against gzipping already-compressed content because of "undesired behaviors" (see Using gzip on compressed objects). In this case, the server is responding with the actual image contents without the gzip wrapper. We don't run the decompression, because it isn't compressed data we are receiving. Validation still runs, however, and that means the hashes we compute are not against the gzipped data, but against the actual data.
@jiren Currently, we run validation on all downloads. Can you clarify the conditions under which we should bypass the validation? |
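To illustrate the mismatch described above, here is a minimal sketch using only Node's built-in `crypto` and `zlib` modules. It assumes the hash recorded for a gzipped object covers the compressed bytes, and it uses made-up stand-in data rather than a real image; `md5Base64` is a hypothetical helper, not part of `@google-cloud/storage`.

```javascript
const crypto = require('crypto');
const zlib = require('zlib');

// Stand-in for the original file contents (illustrative data, not from the issue).
const original = Buffer.from('example image bytes '.repeat(64));

// The bytes that end up stored when the object is gzipped on upload.
const stored = zlib.gzipSync(original);

// Hypothetical helper: base64-encoded MD5 of a buffer.
function md5Base64(buf) {
  return crypto.createHash('md5').update(buf).digest('base64');
}

// Hash of the stored (gzipped) bytes, assumed to be what the server reports.
const storedHash = md5Base64(stored);

// Hash of the bytes the client sees if the server sends the raw contents instead.
const receivedHash = md5Base64(original);

console.log('stored hash:  ', storedHash);
console.log('received hash:', receivedHash);
console.log('hashes match: ', storedHash === receivedHash);
```

Because the two hashes are computed over different byte sequences, a comparison between them fails even though the downloaded file itself is intact.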
Ah, that clears the mystery. Should I keep this issue open or close it? |
Here is the current condition. Lines 1300 to 1303 in 71a4f59
This condition could be changed depending on whether the data is compressed: validate CRC32C and MD5 only if the gzip header is present.
if (shouldRunValidation && isCompressed) {
  validateStream = hashStreamValidation({ crc32c, md5 });
  throughStreams.push(validateStream);
} else {
  crc32c = false;
}
The same thing is implemented in the Go storage client:
if length != 0 && !res.Uncompressed && !uncompressedByServer(res) { |
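One way to read the proposal above is as a standalone decision function. The sketch below is only an illustration: `wasDecompressedByServer` and `shouldValidateDownload` are hypothetical names, not functions from `@google-cloud/storage` or the Go client, and it assumes the object's stored `contentEncoding` and the lowercased response headers are both available at the point of the check.

```javascript
// Hypothetical helper: true when the object is stored gzipped but the
// response body arrived without a gzip Content-Encoding header, which
// suggests the bytes were decompressed before reaching the client.
function wasDecompressedByServer(storedEncoding, responseHeaders) {
  const stored = String(storedEncoding || '').toLowerCase();
  const received = String(responseHeaders['content-encoding'] || '').toLowerCase();
  return stored === 'gzip' && received !== 'gzip';
}

// Hypothetical helper: run hash validation only when it was requested and
// the received bytes are expected to match the bytes the hash was taken over.
function shouldValidateDownload(validationRequested, storedEncoding, responseHeaders) {
  return Boolean(validationRequested) &&
    !wasDecompressedByServer(storedEncoding, responseHeaders);
}

// Gzipped object served without the gzip wrapper: skip validation.
console.log(shouldValidateDownload('crc32c', 'gzip', {}));
// Gzipped object served as gzip: validate.
console.log(shouldValidateDownload('crc32c', 'gzip', { 'content-encoding': 'gzip' }));
// Object that was never gzipped: validate.
console.log(shouldValidateDownload('md5', undefined, {}));
```

Skipping validation this way trades integrity checking for availability in the decompressed case, so it is better suited as a stopgap than as a permanent rule.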
@AVaksman I'll hand this over to you for now, as I believe this is blocked on the outcome of your investigation with the Storage team: https://docs.google.com/document/d/1jz91hfD5AJ8ghXnTPSnCuBa4NdkOPq085y9hBBQFu8M. I'm happy to do any implementation work if necessary when the time comes. |
@AVaksman is there a game plan for how to address this? |
@AVaksman @stephenplusplus - this bug has just caused an issue in our production environment that we have had to work around. Any ETA on a fix? |
I think what we'd like to end up doing here is skipping validation if the file has the metadata We would insert that logic after the @frankyn Does this temporary fix make sense? |
That sounds like a good plan in the meantime. Please move forward with it and add a note that this is only temporary. |
Opening this so we track the other part of the fix. Will mark as external as it won't likely happen in this repo. |
Any ETA on a fix for this? I upgraded to 4.3.1 but still see the same issue. |
We believe the issue should have been fixed with 4.3.1, so that's weird that it's still showing up for you @danielwhatmuff. I'd like to get more information - Could you help us by providing:
I appreciate your patience and working with us to resolve this! |
@frankyn do you know where this issue stands? I don't think there was anything left for us to do from this library. |
I'm running into this with some uploaded .png and .jpegs. I do not GZIP on upload at all. I tried the MD5 check to see if that cleared it up, but it didn't. The odd thing is that it only breaks when opening multiple read streams for different files to zip them. When I attempt to read the one file in a separate file, it works. For the life of me, the two sets of code appear the same. |
We are still working with the backend team to resolve this issue correctly. We are working on a workaround and will post an update when available. |
Is there an update on this? The issue is affecting us. |
News leading to closure of this issue: As of end of this week Cloud Storage API will always respect I'm closing this issue as we can now safely validate checksum because GCS respects not decompressing data server side before sending back to customers. Thanks for your patience |
Spoke too soon. The change was rolled back, so we will need to follow up again when we have an update. Apologies, I jinxed it. |
Environment details
@google-cloud/storage
version: 2.5.0
Steps to reproduce
code=CONTENT_DOWNLOAD_MISMATCH message=The downloaded data did not match the data from the server. To be sure the content is the same, you should download the file again.
I've read through Issue 566, but it doesn't seem to offer a solution.
storage.bucket("somebucket").file("somefolder/someimage.jpg").download({validation: false});
works, but there should be no reason to disable validation.
To check whether the hashes actually mismatch, I ran a local md5sum check on the downloaded and original image files.
Downloaded file is a match to the original file.
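For anyone who wants to repeat that local comparison from Node rather than with `md5sum`, a rough sketch follows. `md5OfFile` and `filesMatch` are hypothetical helpers built on the built-in `crypto` and `fs` modules, and the file paths you pass in are your own.

```javascript
const crypto = require('crypto');
const fs = require('fs');

// Hypothetical helper: hex-encoded MD5 of a file's contents,
// the same format md5sum prints.
function md5OfFile(filePath) {
  return crypto.createHash('md5').update(fs.readFileSync(filePath)).digest('hex');
}

// Hypothetical helper: true when both files have identical MD5 digests.
function filesMatch(pathA, pathB) {
  return md5OfFile(pathA) === md5OfFile(pathB);
}
```

Calling `filesMatch` with the path of the original image and the path of the downloaded copy returns `true` when the contents are byte-for-byte identical, which is the situation reported here despite the validation error.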
BTW, this problem doesn't happen if the image is not gzipped on upload.
Thanks