Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Add snippets for upload_chunks_concurrently and add chunk_size #1135

Merged
merged 6 commits into from
Oct 9, 2023

Conversation

andrewsg
Copy link
Contributor

Adds a snippet for upload_chunks_concurrently; also adds chunk_size to an existing snippet, and runs snippets_test.py through black for formatting purposes.

@andrewsg andrewsg requested review from a team as code owners September 30, 2023 00:04
@andrewsg andrewsg requested a review from ohmayr September 30, 2023 00:04
@snippet-bot
Copy link

snippet-bot bot commented Sep 30, 2023

Here is the summary of changes.

You are about to add 1 region tag.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
To update this comment, add snippet-bot:force-run label or use the checkbox below:

  • Refresh this comment

@product-auto-label product-auto-label bot added size: m Pull request size is medium. api: storage Issues related to the googleapis/python-storage API. labels Sep 30, 2023
@@ -0,0 +1,56 @@
# Copyright 2022 Google LLC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2023

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix

# the use case. The remote service has a minimum of 5 MiB and a maximum of
# 5 GiB.
# chunk_size = 32 * 1024 * 1024 (32 MiB)

# The maximum number of processes to use for the operation. The performance
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the rule of thumb here: "number of cores your CPU has"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, workloads with small files benefit from many times that number and workloads with large files max out the NIC below that, so the number of cores is not a good starting place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something that would go into docs instead of sample comments?

BLOB_NAME = "test_file.txt"

with tempfile.NamedTemporaryFile() as file:
file.write(b"test")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't exercise multiple chunks; recommend increasing the object size to test chunk_size

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's sufficient to test the snippet. The feature itself is not under test here - it is fully covered in the integration tests, with appropriately-sized test files.

source_filename,
destination_blob_name,
chunk_size=32 * 1024 * 1024,
processes=8,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

workers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it might be easier to understand as "workers" for this and the other samples, I'll change it.

@andrewsg andrewsg merged commit 3a0f551 into main Oct 9, 2023
5 checks passed
@andrewsg andrewsg deleted the tm-mpu-sample branch October 9, 2023 22:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: storage Issues related to the googleapis/python-storage API. size: m Pull request size is medium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants