fix(pubsub): include request overhead when computing publish batch size overflow #9911

Merged
5 commits merged on Dec 5, 2019

Conversation

plamut
Contributor

@plamut plamut commented Dec 3, 2019

Fixes #7108.

This PR fixes the logic that computes the publish batch size overflow, taking the size overhead of the enclosing publish request into account. The improved logic prevents server-side errors similar to the following:

google.api_core.exceptions.InvalidArgument: 400 The value for request_size is too large. You passed 10000096 in the request, but the maximum value is 10000000.
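
For a concrete sense of where those extra bytes come from, the snippet below compares the naive sum of message sizes with the size of the serialized request. This is an illustrative sketch only, assuming the 1.x client library (where the generated types are plain protobuf messages exposing ByteSize()); the topic name is a placeholder.

from google.cloud.pubsub_v1 import types

# Ten messages of one million bytes each.
messages = [types.PubsubMessage(data=b"x" * 1_000_000) for _ in range(10)]

# Naive accounting: just sum the individual message sizes.
naive_total = sum(message.ByteSize() for message in messages)

# What the server actually sees: the whole PublishRequest, including the
# topic name and the per-message field tags and length prefixes.
request = types.PublishRequest(
    topic="projects/example-project/topics/example-topic",  # placeholder
    messages=messages,
)
actual_total = request.ByteSize()

# actual_total exceeds naive_total by the request overhead, which is what
# pushes an "almost full" batch over the 10_000_000-byte server limit.
print(naive_total, actual_total)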

How to test

(see also the system test in this PR)

  • Create a publisher client with BatchSettings.max_bytes substantially larger than 10_000_000, and set BatchSettings.max_latency to one second (so that the publish autocommit does not kick in too soon).
  • Quickly publish a few sizable messages to the topic; their total size should slightly exceed 10_000_000 bytes (a minimal reproduction sketch follows below).

Actual result (before the fix):
The backend responds with a "400 InvalidArgument" error.

Expected result (after the fix):
All messages are successfully published (the code splits them into multiple publish batches).
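
A minimal reproduction sketch along these lines, assuming google-cloud-pubsub 1.x and an existing topic (project and topic names are placeholders):

from google.cloud import pubsub_v1
from google.cloud.pubsub_v1.types import BatchSettings

batch_settings = BatchSettings(
    max_bytes=20_000_000,  # deliberately above the 10_000_000-byte server limit
    max_latency=1.0,       # delay autocommit so the messages end up in one batch
)
publisher = pubsub_v1.PublisherClient(batch_settings=batch_settings)
topic_path = publisher.topic_path("example-project", "example-topic")  # placeholders

# Three ~3.4 MB messages: together they slightly exceed 10_000_000 bytes.
futures = [publisher.publish(topic_path, b"x" * 3_400_000) for _ in range(3)]
for future in futures:
    future.result()  # before the fix, this surfaced the 400 InvalidArgument error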

PR checklist

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

The size of a PublishRequest sent to the backend is larger than a mere
sum of the byte sizes of the individual messages it contains, so the
backend's limit can be exceeded even when that sum stays below it.

This commit adjusts the batch size overflow calculation to account for
this overhead. It also caps the effective maximum BatchSettings.max_bytes
value at 10_000_000 bytes (the limit on the backend).

(credit also to GitHub @relud for outlining the main idea first in the
issue description)
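
A rough sketch of the adjusted accounting, paraphrased for illustration (not the merged code verbatim; the class name is hypothetical):

from google.cloud.pubsub_v1 import types

SERVER_REQUEST_SIZE_LIMIT = 10_000_000  # the backend's request_size limit, in bytes

class SizeTrackingBatch:  # hypothetical helper, for illustration only
    def __init__(self, topic, max_bytes):
        self._messages = []
        # Cap the user-configured limit at what the server will actually accept.
        self._max_bytes = min(max_bytes, SERVER_REQUEST_SIZE_LIMIT)
        # Start from the overhead of the request envelope itself (topic field, etc.).
        self._size = types.PublishRequest(topic=topic).ByteSize()

    def try_add(self, message):
        # Measure the message as the increment it adds to a PublishRequest, which
        # includes its field tag and length prefix, not just its payload bytes.
        increment = types.PublishRequest(messages=[message]).ByteSize()
        if self._size + increment > self._max_bytes:
            return False  # caller should commit this batch and start a new one
        self._messages.append(message)
        self._size += increment
        return True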
@plamut plamut added the api: pubsub Issues related to the Pub/Sub API. label Dec 3, 2019
@plamut plamut requested a review from pradn December 3, 2019 22:35
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Dec 3, 2019
Contributor

@software-dov software-dov left a comment

Judgement call on your part whether to make a distinct exception class for test_publish_single_message_size_exceeeds_server_size_limit or leave it as is; otherwise LGTM.
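
For context, a dedicated exception class along the lines discussed could look roughly like this (hypothetical sketch; the names and wording are not necessarily what the PR merged):

class MessageTooLargeError(ValueError):
    """Raised when a single message is too large for any publish batch."""

SERVER_REQUEST_SIZE_LIMIT = 10_000_000  # bytes

def check_single_message_size(request_size_with_one_message):
    # request_size_with_one_message: size of a PublishRequest holding only this message.
    if request_size_with_one_message > SERVER_REQUEST_SIZE_LIMIT:
        raise MessageTooLargeError(
            "The message, once wrapped in a publish request, exceeds the "
            "server-side size limit of 10,000,000 bytes."
        )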

@plamut
Contributor Author

plamut commented Dec 4, 2019

@software-dov Thanks for the quick review, addressed the last outstanding comment.

Contributor

@software-dov software-dov left a comment

LGTM

Labels
api: pubsub Issues related to the Pub/Sub API. cla: yes This human has signed the Contributor License Agreement.
Development

Successfully merging this pull request may close these issues.

PubSub: check batch max bytes against request byte size
3 participants