File metadata is lost during multipart S3 copy #367
For all objects under the threshold limit, we use a single call (the PUT Object Copy API) that by default copies the metadata from source to destination, ignoring a few specific headers. However, for multipart copies the metadata needs to be set in the InitiateMultipart request, and the SDK cannot determine what metadata needs to be copied from the source. There are encryption-related headers that cannot be copied and need to be explicitly specified by the user in the request. Is it feasible for you to explicitly set the metadata in the request? If not, can you specify the use case?
The API should probably not handle metadata differently depending on file size. It is confusing behavior. This issue was quite difficult to track down. Yes, a reasonable workaround that we have already implemented is to query and explicitly set the existing metadata for large files.
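For anyone landing here, the workaround described above can be sketched roughly as follows. This is a stdlib-only illustration of the logic, not SDK code: the class and method names (MultipartCopyMetadata, needsExplicitMetadata, metadataToCarryOver) are hypothetical, the 5 GB threshold is taken from the issue description and assumed to be binary gigabytes, and the rule that every encryption-related header starts with x-amz-server-side-encryption is an assumption. With the Java SDK, the source headers would typically be read with AmazonS3.getObjectMetadata and the result attached to the copy via CopyObjectRequest.withNewObjectMetadata.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: decides when metadata must be set explicitly and
// which source headers are safe to carry over to a multipart copy.
final class MultipartCopyMetadata {

    // Assumption: 5 GB (binary), the size above which the issue reports
    // that a multipart copy is required.
    static final long MULTIPART_COPY_THRESHOLD_BYTES = 5L * 1024 * 1024 * 1024;

    // Assumption: encryption-related headers all share this prefix and must
    // be specified by the caller rather than copied from the source.
    static final String ENCRYPTION_HEADER_PREFIX = "x-amz-server-side-encryption";

    // True when the copy would go through the multipart path, where the
    // source metadata is not carried over automatically.
    static boolean needsExplicitMetadata(long objectSizeBytes) {
        return objectSizeBytes > MULTIPART_COPY_THRESHOLD_BYTES;
    }

    // Returns a copy of the source headers without the encryption-related
    // ones. Header names are compared case-insensitively; the original
    // casing and insertion order are preserved in the result.
    static Map<String, String> metadataToCarryOver(Map<String, String> sourceHeaders) {
        Map<String, String> carried = new LinkedHashMap<>();
        for (Map.Entry<String, String> header : sourceHeaders.entrySet()) {
            String name = header.getKey().toLowerCase(Locale.ROOT);
            if (!name.startsWith(ENCRYPTION_HEADER_PREFIX)) {
                carried.put(header.getKey(), header.getValue());
            }
        }
        return carried;
    }
}
```

In this sketch a Content-Disposition header on the source would survive the filter, while any x-amz-server-side-encryption header would be dropped and would have to be set explicitly on the request.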
@gribbet apologies for the extended delay in getting back to you on this issue. Unfortunately, as @manikandanrs said, this is an issue with the S3 service rather than the Java SDK. The Python SDK actually has a similar issue (see aws/aws-cli#1145). The handling of metadata on a single copy request is actually done by S3 itself (via the Unfortunately S3 does not support the
@gribbet I contacted the S3 service team and they are aware of the inconsistency. It's possible that they'll fix it in a future version of the service. However, given there is a workaround, there are higher priority issues to resolve. Given this is not a Java SDK specific problem, I'm going to close this issue. I will communicate back when the service team resolves this inconsistency in multi-part copy.
Since this seems as though it will never get fixed, why not write your own S3 sync function that preserves metadata 🙄 Here's a really ugly one in node 8.x, hopefully this helps someone
Original metadata is always dropped during copy for files larger than 5GB (where multipart copy is required). For smaller files the behavior is correct.
CopyCallable.initiateMultipartUpload always sets NewObjectMetadata on the CopyObjectRequest, so the original metadata is destroyed.
In my particular case I have a Content-Disposition header that does not get copied.