Improve binary read performance #459

Merged
merged 8 commits into from
May 24, 2023

kbongort
Contributor

@kbongort kbongort commented Apr 17, 2023

This PR makes a couple of improvements to binary read performance:

  1. One of the slowest things during a read with many messages is initializing a BinaryReader, with its DataView and TextDecoder, for each message. This PR adds a messageField() method to BinaryReader that reads a size-delimited message field without constructing a new BinaryReader. (Message.fromBinary is refactored to share code.) A knock-on effect of this change is that we pass options along without repeated calls to makeReadOptions, which otherwise needlessly spreads options into readDefaults even when options is already identical to readDefaults.
  2. readScalar was calling reader[method](), which seems to incur an implicit call to .bind() on the reader method (i.e. it behaves like reader[method].bind(reader)()). By using an inline switch statement we can avoid that indirection and use a faster path.
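
To make the second point concrete, here is a minimal sketch of the two dispatch shapes. `TinyReader`, `ScalarKind`, `readScalarDynamic`, and `readScalarSwitch` are hypothetical names made up for illustration, not the actual protobuf-es code, and the sketch only shows the difference in structure without measuring anything.

```typescript
// Hypothetical stand-in for a binary reader; not the real protobuf-es BinaryReader.
class TinyReader {
  pos = 0;
  buf: Uint8Array;
  constructor(buf: Uint8Array) {
    this.buf = buf;
  }
  // Decodes a base-128 varint of up to 32 bits.
  uint32(): number {
    let result = 0;
    for (let shift = 0; shift < 35; shift += 7) {
      if (this.pos >= this.buf.length) throw new RangeError("unexpected end of input");
      const byte = this.buf[this.pos++];
      result |= (byte & 0x7f) << shift;
      if ((byte & 0x80) === 0) return result >>> 0;
    }
    throw new Error("invalid varint");
  }
  int32(): number {
    return this.uint32() | 0;
  }
  bool(): boolean {
    return this.uint32() !== 0;
  }
}

type ScalarKind = "uint32" | "int32" | "bool";

// Dynamic variant: the method is looked up by name on every call.
function readScalarDynamic(reader: TinyReader, kind: ScalarKind): number | boolean {
  return reader[kind]();
}

// Switch variant: each case is a direct, statically known method call.
function readScalarSwitch(reader: TinyReader, kind: ScalarKind): number | boolean {
  switch (kind) {
    case "uint32":
      return reader.uint32();
    case "int32":
      return reader.int32();
    case "bool":
      return reader.bool();
  }
}

const varint150 = new Uint8Array([0x96, 0x01]); // varint encoding of 150
console.log(readScalarSwitch(new TinyReader(varint150), "uint32")); // 150
```

Both functions return the same values; the switch trades a little code size for call sites the engine can resolve without a property lookup by name.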

I put together a simple benchmark (https://github.com/kbongort/protobuf-es/pull/1) using the following message definitions:

message User {
  string first_name = 1;
  string last_name = 2;
  bool active = 3;
  User manager = 4;
  repeated string locations = 5;
  map<string, string> projects = 6;
  int32 number = 7;
}

message UserList {
  repeated User users = 1;
}

The benchmark uses both a more complete version of the User message and one with only a few fields set:

const userMessage = new User({
  firstName: "Jane",
  lastName: "Doe",
  active: true,
  manager: { firstName: "Jane", lastName: "Doe", active: false },
  locations: ["Seattle", "New York", "Tokyo"],
  projects: { foo: "project foo", bar: "project bar" },
});

const smallUserMessage = new User({
  manager: { active: true },
  number: 2,
});

After warming up the cache by reading a serialized userMessage 1000 times, it measures the time to read userMessage, to read a UserList of 1000 userMessages, and to read a UserList of 1000 smallUserMessages. The amount of time it takes to call JSON.parse() on the JSON version of the message is provided as a point of comparison. Without the changes in this PR, the results are:
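
The measurement procedure can be sketched roughly as follows. This is not the linked benchmark's code: `measureMicros` is a hypothetical helper, and the number of measured runs is an assumption, since the description only states the 1000 warm-up reads.

```typescript
// Hypothetical helper: runs `fn` unmeasured for warm-up, then returns the
// mean time per call in microseconds over the measured runs.
function measureMicros(fn: () => unknown, warmupRuns: number, measuredRuns: number): number {
  for (let i = 0; i < warmupRuns; i++) fn();
  const start = performance.now();
  for (let i = 0; i < measuredRuns; i++) fn();
  const elapsedMs = performance.now() - start;
  return (elapsedMs * 1000) / measuredRuns;
}

const sampleJson = JSON.stringify({ firstName: "Jane", lastName: "Doe", active: true });
const jsonMicros = measureMicros(() => JSON.parse(sampleJson), 1000, 1000);
// The binary read would be timed the same way, passing a closure that
// decodes the serialized message instead of calling JSON.parse.
console.log(`Time to parse JSON: ${jsonMicros.toFixed(2)} µs`);
```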

Benchmark: Single user
	Time to parse JSON: 1135 µs
	Time to parse from binary: 3548 µs
	Ratio: 312%
Benchmark: List of 1000 users
	Time to parse JSON: 920 ms
	Time to parse from binary: 2957 ms
	Ratio: 321%
Benchmark: List of 1000 small users
	Time to parse JSON: 260 ms
	Time to parse from binary: 923 ms
	Ratio: 355%

After the changes in this PR, the results are:

Benchmark: Single user
	Time to parse JSON: 1088 µs
	Time to parse from binary: 2755 µs
	Ratio: 253%
Benchmark: List of 1000 users
	Time to parse JSON: 921 ms
	Time to parse from binary: 1729 ms
	Ratio: 188%
Benchmark: List of 1000 small users
	Time to parse JSON: 255 ms
	Time to parse from binary: 237 ms
	Ratio: 93%

The three test cases show speedups of 1.3x, 1.7x, and 3.9x, respectively. The effect of (1) is especially pronounced when there are many small messages, as in the last benchmark test.
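
For reference, the speedups follow directly from the binary read times listed above (time before divided by time after). The variable names below are only for illustration:

```typescript
// Binary read times reported in this description, before and after the change.
const binaryReadTimes = [
  { name: "Single user", before: 3548, after: 2755 }, // µs
  { name: "List of 1000 users", before: 2957, after: 1729 }, // ms
  { name: "List of 1000 small users", before: 923, after: 237 }, // ms
];

// Speedup = time before / time after, rounded to one decimal place.
const speedups = binaryReadTimes.map((r) => (r.before / r.after).toFixed(1) + "x");
console.log(speedups.join(", ")); // 1.3x, 1.7x, 3.9x
```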

@kbongort kbongort marked this pull request as ready for review April 17, 2023 07:24
@CLAassistant

CLAassistant commented Apr 29, 2023

CLA assistant check
All committers have signed the CLA.

@smaye81
Member

smaye81 commented May 8, 2023

Hi @kbongort, I appreciate your patience on this. I know we spoke offline last week. We have been gathering some baseline metrics on our performance to see all the areas that need to be addressed. I can confirm that this does improve binary read performance by an average of ~2.5x.

However, one concern is the change to modify the IBinaryReader interface. This is an interface we expose from the library and changing it will result in breaking changes. Are you up for potentially modifying the PR so that these changes can be made in a non-breaking way without affecting the interface contract?

@kbongort
Contributor Author

kbongort commented May 9, 2023

> Hi @kbongort, I appreciate your patience on this. I know we spoke offline last week. We have been gathering some baseline metrics on our performance to see all the areas that need to be addressed. I can confirm that this does improve binary read performance by an average of ~2.5x.
>
> However, one concern is the change to modify the IBinaryReader interface. This is an interface we expose from the library and changing it will result in breaking changes. Are you up for potentially modifying the PR so that these changes can be made in a non-breaking way without affecting the interface contract?

What would you suggest?

@smaye81
Member

smaye81 commented May 9, 2023

Let me talk it over with the team. We know that the fromBinary > readMessage path is a bottleneck, but I'm not sure if there is a good way to reuse the BinaryReader like you're doing without breaking one of the interfaces.

In the meantime, would you mind splitting out the two improvements (BinaryReader vs. readScalar)? I think we can probably get the latter implemented and at least get a quick win.

@kbongort
Contributor Author

kbongort commented May 9, 2023

> Let me talk it over with the team. We know that the fromBinary > readMessage path is a bottleneck, but I'm not sure if there is a good way to reuse the BinaryReader like you're doing without breaking one of the interfaces.
>
> In the meantime, would you mind splitting out the two improvements (BinaryReader vs. readScalar)? I think we can probably get the latter implemented and at least get a quick win.

What type of breakage are you concerned with, exactly? This does add a couple of methods to IBinaryReader. Are you worried that clients may have IBinaryReader implementations that will no longer conform to the interface, or something else?

@kbongort
Contributor Author

I've updated the PR to not modify IBinaryReader. Since the new methods are implemented strictly in terms of existing ones, it's possible to refactor them into static utility methods. So that's what I did.
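
As a rough illustration of that approach, the sketch below reads length-delimited chunks through a free function that relies only on methods the reader already has, so the interface itself stays unchanged. `MiniReader`, `readDelimited`, and `ByteReader` are hypothetical names and do not reflect the actual IBinaryReader surface or the code in this PR.

```typescript
// Hypothetical minimal reader surface for illustration; not the real IBinaryReader.
interface MiniReader {
  pos: number;
  readonly len: number;
  uint32(): number;
}

// Free function written only against the existing reader surface. It reads the
// length prefix, then lets the caller decode the body from the same reader.
function readDelimited<T>(
  reader: MiniReader,
  readBody: (reader: MiniReader, end: number) => T,
): T {
  const size = reader.uint32();
  const end = reader.pos + size;
  if (end > reader.len) throw new RangeError("length prefix exceeds buffer");
  const value = readBody(reader, end);
  reader.pos = end; // skip anything the body reader left unread
  return value;
}

// Toy implementation that only handles single-byte varints (values below 128).
class ByteReader implements MiniReader {
  pos = 0;
  readonly len: number;
  buf: Uint8Array;
  constructor(buf: Uint8Array) {
    this.buf = buf;
    this.len = buf.length;
  }
  uint32(): number {
    if (this.pos >= this.len) throw new RangeError("unexpected end of input");
    return this.buf[this.pos++];
  }
}

// Two length-delimited chunks in one buffer, decoded with a single reader.
const sumBody = (r: MiniReader, end: number): number => {
  let sum = 0;
  while (r.pos < end) sum += r.uint32();
  return sum;
};
const delimitedReader = new ByteReader(new Uint8Array([3, 1, 2, 3, 1, 9]));
console.log(readDelimited(delimitedReader, sumBody)); // 6
console.log(readDelimited(delimitedReader, sumBody)); // 9
```

Because the helper takes the reader as an argument rather than living on the interface, existing reader implementations keep conforming without changes.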

@kbongort
Contributor Author

@smaye81 PTAL – Let me know if you have any further concerns, thanks!

@kbongort
Contributor Author

kbongort commented May 23, 2023 via email

@smaye81
Member

smaye81 commented May 23, 2023

Hi @kbongort -- yes, apologies for the delay. We have been a bit heads-down on some other work at the moment, but will hopefully take a look at this this week. Thanks again for the PR.

cc @timostamm

Member

@timostamm timostamm left a comment

Hey @kbongort 👋

We added basic performance benchmarks in #491. They confirm what you already measured and provide a little more detail.

The downside of keeping the BinaryReader is that it makes it more difficult to generate speed-optimized code, which is the best lever for performance we have. But looking at the change in isolation, it does speed up the test "large google.protobuf.FileDescriptorSet" by ~35%. This seems worth it.

I pushed up 126ba29 to avoid T.fromBinary() in readMapEntry(), and 77460d3 to move BinaryReaderUtil.readMessage() to a local function. It keeps the performance improvement, and reduces the bundle size increase to a negligible 0.02%.

The readScalar() change is a 1% increase in bundle size (that's the reason it wasn't a switch statement in the first place). But it's a really decent performance bump of 55% for "large google.protobuf.FileDescriptorSet". This is worth it. Good catch 🙂

Thank you for the contribution, LGTM!

@timostamm timostamm merged commit ce34412 into bufbuild:main May 24, 2023
@kbongort
Contributor Author

Thanks for the review & further work! 🍾

@smaye81 smaye81 mentioned this pull request May 30, 2023
smaye81 added a commit that referenced this pull request May 30, 2023
## What's Changed
* Improve binary read performance by @kbongort in #459
* Update to Protobuf 23.2 by @smaye81 in #492

## New Contributors
* @kbongort made their first contribution in #459.

**Full Changelog**: https://github.com/bufbuild/protobuf-es/compare/v1.2.0..v1.2.1