[Feature Request]: Decrease the number of GET requests GCSIO makes #28398
Comments
.take-issue
@BjornPrime It would be great if the number of GET requests in GCSIO could be reduced. In particular the code seems to often call Furthermore As far as I can tell switching from Out of interest, are there any advantages of using Beam's built-in GCSIO over a generic implementation like fsspec in a pipeline?
.take-issue
What would you like to happen?
The new implementation of GCSIO (see PR #28079, if not already merged) issues GET requests to GCS in several places where simply instantiating the objects of interest locally may perform better. GCS can resolve the target from the bucket and blob names once an actual operation (copy, delete, etc.) is requested, so a preliminary GET is often unnecessary.
See here for an example of how this can be done.
Using this pattern was attempted during the initial migration to the GCS client, but it foundered on concerns that confusing errors may be thrown in some instances. In particular, attempting to access a non-existent object through an operation on an instantiated object sometimes threw permissions errors instead of 404s. To resolve this issue, investigate this design pattern more thoroughly and determine whether the error issues are real and cannot be overcome. If they can be overcome, swap out any unnecessary GETs for instantiations to improve performance. If they cannot, connect with the GCS client team to share these concerns and see if they have any further info or workarounds.
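To make the difference concrete, here is a minimal sketch of the two patterns. It is not GCSIO's actual code. The method names (`get_bucket`, `bucket`, `get_blob`, `blob`, `delete`) follow the google-cloud-storage Python client, while `FakeClient`, `FakeBucket`, `FakeBlob`, and `count_requests` are hypothetical in-memory stand-ins that only record which requests would be sent. Unlike the real client, the fake does not raise on a missing object, so it says nothing about the permissions-versus-404 concern described above.

```python
"""Sketch: eager GET lookups vs. lazy instantiation with a GCS-style client."""


def delete_with_gets(client, bucket_name, blob_name):
    """Look up bucket and blob first; each lookup is a GET round trip."""
    bucket = client.get_bucket(bucket_name)  # GET for bucket metadata
    blob = bucket.get_blob(blob_name)        # GET for object metadata
    if blob is not None:                     # get_blob returns None if absent
        blob.delete()                        # DELETE request


def delete_with_instantiation(client, bucket_name, blob_name):
    """Build local handles without requests; only the DELETE is sent."""
    bucket = client.bucket(bucket_name)      # no request
    blob = bucket.blob(blob_name)            # no request
    blob.delete()                            # DELETE request; errors surface here


# Hypothetical in-memory stand-ins, used only to count requests.
class FakeBlob:
    def __init__(self, bucket, name):
        self.bucket, self.name = bucket, name

    def delete(self):
        self.bucket.client.requests.append("DELETE")
        self.bucket.objects.discard(self.name)


class FakeBucket:
    def __init__(self, client, name):
        self.client, self.name = client, name
        self.objects = client.store.setdefault(name, set())

    def blob(self, name):
        return FakeBlob(self, name)

    def get_blob(self, name):
        self.client.requests.append("GET")
        return FakeBlob(self, name) if name in self.objects else None


class FakeClient:
    def __init__(self, store):
        self.store = store    # maps bucket name -> set of object names
        self.requests = []    # log of simulated HTTP verbs

    def bucket(self, name):
        return FakeBucket(self, name)

    def get_bucket(self, name):
        self.requests.append("GET")
        return FakeBucket(self, name)


def count_requests(delete_fn):
    """Run one delete against a fresh fake store and return the request log."""
    client = FakeClient({"my-bucket": {"a.txt"}})
    delete_fn(client, "my-bucket", "a.txt")
    return client.requests


if __name__ == "__main__":
    print(count_requests(delete_with_gets))
    print(count_requests(delete_with_instantiation))
```

With a real `google.cloud.storage.Client()` passed in place of `FakeClient`, the same two functions should apply unchanged; the trade-off is that the instantiation variant reports a missing or inaccessible object only when `delete()` runs.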
Issue Priority
Priority: 2 (default / most feature requests should be filed as P2)
Issue Components