get metadata downloads the entire blob #569

Open
yan-hic opened this issue Jan 10, 2023 · 1 comment

yan-hic commented Jan 10, 2023

I'm not sure whether this is by design, but it looks like the library downloads the entire blob in order to read its metadata, and only then parses and returns the metadata dict.
If so, this is highly inefficient: for example, one would need to download a 2 GB file just to read its content_type or any custom metadata.

If confirmed, this is a showstopper for us.

Also, list_blobs only returns blob names, so it can't be used to read the metadata, which would have been more efficient.

@guseggert

download_metadata() only downloads the metadata: it calls https://cloud.google.com/storage/docs/json_api/v1/objects/get without alt=media.

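import json

# Assumes client, bucket and key already exist and that this runs inside a coroutine.
# No alt=media is passed here, so the response body is the object's JSON metadata.
resp_bytes = await client._download(bucket, key)
obj_size = json.loads(resp_bytes.decode())['size']
print("obj_size", obj_size)
print("bytes", len(resp_bytes))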

results in

obj_size 1634987927
bytes 874
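
For illustration, here is a minimal standalone sketch of the difference between the two requests, and of pulling content_type and custom metadata out of the small JSON response. It is not the library's code: `object_url`, `summarize_metadata`, the bucket and key names, and the sample payload are all hypothetical. It assumes only the public JSON API URL layout and the `contentType`, `size` and `metadata` fields of an object resource.

```python
import json
from urllib.parse import quote, urlencode

# Base URL of the Cloud Storage JSON API.
API_ROOT = "https://storage.googleapis.com/storage/v1"


def object_url(bucket: str, key: str, media: bool = False) -> str:
    """Build an objects.get URL (hypothetical helper).

    Without alt=media the response is the object's JSON metadata;
    with alt=media it is the object's contents.
    """
    # The object name is a single path segment, so slashes are encoded too.
    url = f"{API_ROOT}/b/{quote(bucket, safe='')}/o/{quote(key, safe='')}"
    if media:
        url += "?" + urlencode({"alt": "media"})
    return url


def summarize_metadata(resp_bytes: bytes) -> dict:
    """Extract a few fields from a metadata response body (hypothetical helper)."""
    meta = json.loads(resp_bytes.decode())
    return {
        "content_type": meta.get("contentType"),
        "size": int(meta["size"]),  # the JSON API encodes size as a string
        "custom": meta.get("metadata", {}),  # user-defined key/value pairs
    }


# Made-up payload shaped like a metadata response, for demonstration only.
sample = json.dumps({
    "name": "dir/big.bin",
    "contentType": "application/octet-stream",
    "size": "1634987927",
    "metadata": {"owner": "etl"},
}).encode()

print(object_url("my-bucket", "dir/big.bin"))              # metadata request
print(object_url("my-bucket", "dir/big.bin", media=True))  # content request
print(summarize_metadata(sample))
```

The metadata body is a few hundred bytes regardless of how large the object is, which matches the numbers above.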
