-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial draft of the batch requests API HLD. #959
base: master
Are you sure you want to change the base?
Conversation
* In the current proposed APIs, a batched request can only contain similar entries - for example - either all SET or all DEL entries. A mix of SET and DEL in a batch is not supported. | ||
* A batched request can only apply to a single APPL DB table. Requests to different tables cannot be batched. | ||
* Order is not maintained in a batch request. The entries within a single batched request will be handled in any order. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this not cause a problem for SET case, if order is not maintained for SET, two sets on same object can result in random state at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two sets of the same object will have the final result of the combined attributes.
Order-less is the limitation of the current ProducerStateTable APIs, for both batch and non-batch APIs. Even in the non-batch APIs, if the producer writes multiple requests in a short time, the consumer might pop multiple entries at the time it wakes up. The reason of the order-less is that the API uses a Redis set to store the new keys, which is order-less.
What is the measured performance difference? And what are the memory usage implications? |
About performance difference, our data shows about 8% improvement in batch write into the DB. This is just measuring the time to write to the APPL DB. End-to-end performance improvement will be even better if the orchagent supports bulk SAI calls. (But that is also platform dependent.) |
|
||
## Overview | ||
|
||
When an application writes an entry into APPL DB, it uses the ProducerStateTable API. Currently, the ProducerStateTable API does not support batched writes. In the current form, the application needs to call the API multiple times for programming a batch of requests. This HLD proposes a new API in ProducerStateTable that can allow applications to program multiple entries in a single invocation. This can reduce the number of Redis DB notifications and potentially reduce the latency for responses when the response path is enabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain "potentially reduce the latency for responses when the response path is enabled"?
- "response path" is not well defined.
- For latency issue, there is a pipeline feature developed. Please investigate if it fits your use case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Response path is referring to Initial version of APPL STATE DB & Response Path HLD #846. It is end-to-end sync mode from the application to the hardware. Should we add a reference here? The HLD pr is not merged yet.
- The latency reduction in on orchagent making bulk SAI call. (For example, the p4orch route manager makes bulk SAI calls to program routes.) This feature will send one final notification at the end. Comparing to the individual calls, which will send notification for each call, this new batch API will prevent orchagent to wake up prematurely, so that orchagent can take full advantage of the batch in the bulk SAI call.
swss-common: Added batched APIs in ProducerStateTable. HLD: sonic-net/SONiC#959
8498931
to
8837dc2
Compare
@bocon13 |
swss-common: Added batched APIs in ProducerStateTable. HLD: sonic-net/SONiC#959
The previous design uses m_queueLength to ensure that each data will be completely consumed. Of course, this will also lead to a certain performance, because after the pop has been completed, the follow-up still needs to have io operations with redis server. This batch design, the scenario which the amount of data is greater than the amount that orch can pop, cause it will only be published once, and will there be residual data that is not consumed by the orch? |
Orch will pop all data available: https://github.com/sonic-net/sonic-swss/blob/ebe8de73e74cf3d9fd298b8b74b5da797c889c40/orchagent/orch.cpp#L228 Of course, that might have other issues like starvation, which is what the pop size is for. |
Initial draft of the batch requests API HLD.