improve vsomeip sluggish connect (maintain 3.1.x branch) #671

joeyoravec · 2024-04-10T20:41:18Z

fixes #669

As described in issue #669 the train logic should aggregate Service Discovery, but does not currently. This leads to very low performance on systems offering 1000s of EventGroups.

This PR has similar content to #670 but backported to 3.1.20.3 and containing one additional important fix for that older version. I'm leaving this PR in draft status to gather feedback. The key difference when backporting is:

Ensure departure timer does not go negative

There is a calculation 3.1.20 in implementation/endpoints/include/buffer.hpp that results in a negative number in case the timer is already expired:

void update_departure_time_and_stop_departure() {
    departure_ = departure_timer_->expires_from_now();

We need to implement a floor() so the departure is either in the future, or zero meaning ready to depart, but never negative. Like:

departure_ = departure_timer_->expires_from_now();
if (departure_.count() < 0) {
    departure_ = std::chrono::nanoseconds::zero();
}

In case the value goes negative then other calculations fail. That large negative number is kept and used for the next calculated departure time, breaking that one too.

Debouncing would make sense if it was applied on a message-by-message basis so each gets transmitted "no more often than Xms" dropping all updates except the final one. However upstream code applies this parameter to decide when an entire multi-passenger train should depart. Remove this logic; we're not using the feature, it doesn't work the way we expect, and it prevents aggregating multiple passengers into trains.

Upstream logic overrides retention and debounce to zero for SD messages, forcing any train containing SD passengers to depart immediately. Use configured times, allowing a train to aggregate multiple SD passengers.

Calling expires_from_now() on expired timer will result in a negative value. It is kept and used for the next calculated departure time, breaking that one too. Implement a floor() so the departure is either in the future, or zero meaning ready to depart, but never negative.

Our design uses separate EventIDs for each attribute. Currently we see ~3500 total subscribes, followed by ~3500 acknowledgements, transmitted one-per-frame. The entire sequence does not finish within a two second SD interval leading to timeouts, retries, and poor performance. All SD messages have the same identifier, so this logic forced the train must_depart ~3500 times carrying a single message. Very inefficient. This change improves performance by allowing multiple passengers with the same identifier aboard the same train; resulting in fewer, larger frames on the wire.

goncaloalmeida · 2024-05-03T14:48:09Z

@joeyoravec is this ready to review or are you still working on this?

joeyoravec · 2024-05-04T00:20:59Z

@joeyoravec is this ready to review or are you still working on this?

I've left this in draft so it's clear that this is not ready for merge. Yes this "works" and is worth reviewing. However I read that COVESA ended all support for https://github.com/COVESA/vsomeip/tree/maintain/3.1 since Apr 2023. If nothing is getting merged to that branch then this backport might only be relevant for users who want to cherry-pick to their own branches.

Eugene Kozlov added 4 commits April 10, 2024 15:30

Use configured retention and debounce times for SD messages

0465423

Upstream logic overrides retention and debounce to zero for SD messages, forcing any train containing SD passengers to depart immediately. Use configured times, allowing a train to aggregate multiple SD passengers.

joeyoravec mentioned this pull request Apr 10, 2024

[BUG]: vsomeip slow to establish communication with lots of EventGroup #669

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

improve vsomeip sluggish connect (maintain 3.1.x branch) #671

improve vsomeip sluggish connect (maintain 3.1.x branch) #671

joeyoravec commented Apr 10, 2024 •

edited

Loading

goncaloalmeida commented May 3, 2024

joeyoravec commented May 4, 2024

improve vsomeip sluggish connect (maintain 3.1.x branch) #671

Are you sure you want to change the base?

improve vsomeip sluggish connect (maintain 3.1.x branch) #671

Conversation

joeyoravec commented Apr 10, 2024 • edited Loading

Ensure departure timer does not go negative

goncaloalmeida commented May 3, 2024

joeyoravec commented May 4, 2024

joeyoravec commented Apr 10, 2024 •

edited

Loading