Limit concurrency #329
This adds a new keyword to AzureBlobFileSystem to limit the number of concurrent connections. See pangeo-forge/pangeo-forge-recipes#227 (comment) for some motivation. In that situation, we had a single FileSystem instance that was generating many concurrent write requests through `.pipe`; so many that we were seeing memory issues from creating all the BlobClient connections simultaneously. This adds an asyncio.Semaphore instance to the AzureBlobFileSystem that controls the number of concurrent BlobClient connections. The default of None is backwards-compatible (no limit).
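For readers unfamiliar with the pattern, here is a minimal sketch of how an asyncio.Semaphore can cap concurrent connections. It is not the code in this PR: `ToyFileSystem`, `max_concurrency`, `_write_one`, and `peak_connections` are hypothetical names chosen for illustration, and a short sleep stands in for opening a BlobClient and uploading.

```python
import asyncio
from typing import Dict, Optional


class ToyFileSystem:
    """Hypothetical stand-in for a filesystem that opens one client per write."""

    def __init__(self, max_concurrency: Optional[int] = None) -> None:
        # None means "no limit", which mirrors the backwards-compatible default.
        self.max_concurrency = max_concurrency
        # Created lazily so the semaphore is built inside the running event loop.
        self._semaphore: Optional[asyncio.Semaphore] = None
        self.active = 0  # simulated connections currently open
        self.peak = 0    # highest number of simultaneous connections seen

    async def _write_one(self, path: str, data: bytes) -> None:
        # Stand-in for creating a client and uploading the data.
        self.active += 1
        self.peak = max(self.peak, self.active)
        await asyncio.sleep(0.01)
        self.active -= 1

    async def _pipe_file(self, path: str, data: bytes) -> None:
        if self.max_concurrency is None:
            # Unlimited: every write proceeds immediately.
            await self._write_one(path, data)
            return
        if self._semaphore is None:
            self._semaphore = asyncio.Semaphore(self.max_concurrency)
        # At most max_concurrency coroutines are inside this block at once.
        async with self._semaphore:
            await self._write_one(path, data)

    async def pipe(self, files: Dict[str, bytes]) -> None:
        # Launch all writes concurrently; the semaphore throttles them.
        await asyncio.gather(*(self._pipe_file(p, d) for p, d in files.items()))


def peak_connections(max_concurrency: Optional[int], n_files: int = 20) -> int:
    """Run n_files simulated writes and report the peak number of open connections."""
    fs = ToyFileSystem(max_concurrency)
    files = {f"blob-{i}": b"x" for i in range(n_files)}
    asyncio.run(fs.pipe(files))
    return fs.peak
```

In this sketch, `peak_connections(4)` never has more than four simulated connections open at once, while `peak_connections(None)` lets all twenty writes run together, which is the situation that led to the memory issues described above.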
Thanks for picking this up! One thing I've learned about since starting this PR: azure-storage-blob already uses the We might want to consider a different keyword (max_clients? max_blob_clients?) so that we can pass through
Hey @TomAugspurger -- That sounds like a great idea. Unit tests were passing locally.
Picks up #288.