feat(nebula_hw_interfaces): better UDP socket #231

Merged
merged 38 commits into main from feat/better-udp-socket
Dec 23, 2024

@mojomex mojomex commented Nov 21, 2024

PR Type

  • Improvement

Description

The current Boost.ASIO/transport_drivers implementation is bloated and does not offer all the features we need for accurate timing and packet loss measurement.
Specifically, a good equivalent to recvmsg (see man recvmsg for details) is not supported.

This PR introduces a new, minimal and robust UDP socket implementation with the following features:

*: Depending on the network interface hardware, the timestamp will be measured in hardware on packet arrival, or by the kernel in software as soon as possible after. In any case, the timing is much more accurate than doing it in user space, where scheduling has a huge impact on accuracy.
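
For readers unfamiliar with recvmsg, the following is a minimal sketch of how a kernel receive timestamp can be read from a datagram's ancillary data on Linux. It is not the code from this PR: it assumes the simpler SO_TIMESTAMPNS option (software timestamps taken by the kernel), and the names `extract_rx_timestamp`, `enable_rx_timestamps`, `TimestampedPacket` and `receive_with_timestamp` are hypothetical.

```cpp
#include <sys/socket.h>
#include <sys/types.h>

#include <cstdint>
#include <cstring>
#include <ctime>
#include <optional>
#include <vector>

// Walks the ancillary (control) data of a received message and returns the kernel
// receive timestamp, if one was attached. Hypothetical helper, not part of the PR.
inline std::optional<timespec> extract_rx_timestamp(msghdr & msg)
{
  for (cmsghdr * c = CMSG_FIRSTHDR(&msg); c != nullptr; c = CMSG_NXTHDR(&msg, c)) {
    if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_TIMESTAMPNS) {
      timespec ts{};
      std::memcpy(&ts, CMSG_DATA(c), sizeof(ts));
      return ts;
    }
  }
  return std::nullopt;
}

// Asks the kernel to attach a nanosecond-resolution receive timestamp to each datagram.
// Assumption: SO_TIMESTAMPNS is used here for brevity.
inline bool enable_rx_timestamps(int fd)
{
  const int enable = 1;
  return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPNS, &enable, sizeof(enable)) == 0;
}

struct TimestampedPacket
{
  std::vector<uint8_t> payload;
  std::optional<timespec> kernel_timestamp;  // empty if no timestamp was attached
};

// Receives one datagram with recvmsg(), which delivers payload and control data together.
inline std::optional<TimestampedPacket> receive_with_timestamp(int fd, size_t max_size = 1500)
{
  TimestampedPacket packet;
  packet.payload.resize(max_size);

  iovec iov{};
  iov.iov_base = packet.payload.data();
  iov.iov_len = packet.payload.size();

  // Control buffer large enough for exactly one timespec control message.
  alignas(cmsghdr) char control[CMSG_SPACE(sizeof(timespec))];

  msghdr msg{};
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  msg.msg_control = control;
  msg.msg_controllen = sizeof(control);

  const ssize_t n = recvmsg(fd, &msg, 0);
  if (n < 0) return std::nullopt;

  packet.payload.resize(static_cast<size_t>(n));
  packet.kernel_timestamp = extract_rx_timestamp(msg);
  return packet;
}
```

Because the timestamp travels with the packet as control data, it reflects when the kernel saw the packet rather than when the receiving thread was scheduled.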

Usage

I aimed to document the class as well as possible, but still, here is a quick rundown of how to use the socket:

#include <nebula_hw_interfaces/nebula_hw_interfaces_common/connections/udp.hpp>

void my_func() {
  using nebula::drivers::connections::UdpSocket;
  using nebula::drivers::connections::SocketError;
  using nebula::drivers::connections::UsageError;

  // Creates the underlying socket and sets timestamping, overflow reporting, etc. as options.
  UdpSocket sock{};

  // Sets the host IP and port. No actual socket operations happen at this point.
  sock.init("192.168.1.10", 1234);

  // Binds (= activates) the socket on the given host IP/port.
  sock.bind();

  // Forwards all received packets, with metadata (timestamp, packet drops, etc.) to `my_function`.
  sock.subscribe(my_function);

  // Stops forwarding packets. May have to be called manually if a lambda with reference-type captures
  // has been used as `my_function`, and its lifetime is shorter than that of `sock`.
  sock.unsubscribe();
}

Functions can also be chained like this:

auto sock = UdpSocket().init(...).bind().subscribe(...);
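
As a rough illustration of why chaining works, each setup step can return a reference to the object itself. The class below is a hypothetical stand-in (`ChainableSocketSketch`), not the real UdpSocket: it performs no socket operations and only tracks setup state, throwing `std::logic_error` as a placeholder for the library's own error types when steps are called out of order.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for illustration only; the real UdpSocket differs.
class ChainableSocketSketch
{
public:
  // Stores the host IP and port; no socket operation happens here.
  ChainableSocketSketch & init(std::string host_ip, uint16_t port)
  {
    host_ip_ = std::move(host_ip);
    port_ = port;
    state_ = State::kInitialized;
    return *this;  // returning a reference is what enables chaining
  }

  // Marks the socket as bound; rejects calls made before init().
  ChainableSocketSketch & bind()
  {
    if (state_ != State::kInitialized) {
      throw std::logic_error("bind() called before init()");
    }
    state_ = State::kBound;
    return *this;
  }

  bool is_bound() const { return state_ == State::kBound; }
  const std::string & host_ip() const { return host_ip_; }
  uint16_t port() const { return port_; }

private:
  enum class State { kUninitialized, kInitialized, kBound };

  State state_{State::kUninitialized};
  std::string host_ip_;
  uint16_t port_{0};
};
```

With this shape, `ChainableSocketSketch s; s.init("192.168.1.10", 1234).bind();` reads the same as the separate calls, and a misordered call fails during setup rather than during operation.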

Pre-Review Checklist for the PR Author

PR Author should check the checkboxes below when creating the PR.

  • Assign PR to reviewer

Checklist for the PR Reviewer

Reviewers should check the checkboxes below before approval.

  • Commits are properly organized and messages are according to the guideline
  • (Optional) Unit tests have been written for new behavior
  • PR title describes the changes

Post-Review Checklist for the PR Author

PR Author should check the checkboxes below before merging.

  • All open points are addressed and tracked via issues or tickets

CI Checks

  • Build and test for PR: Required to pass before the merge.

Signed-off-by: Max SCHMELLER <max.schmeller@tier4.jp>

codecov bot commented Nov 21, 2024

Codecov Report

Attention: Patch coverage is 71.36564% with 65 lines in your changes missing coverage. Please review.

Project coverage is 27.21%. Comparing base (3284357) to head (0c5e3c0).
Report is 11 commits behind head on main.

Files with missing lines                                 Patch %   Lines
nebula_hw_interfaces/test/common/test_udp.cpp            59.21%    7 Missing and 24 partials ⚠️
...es/nebula_hw_interfaces_common/connections/udp.hpp    75.60%    14 Missing and 16 partials ⚠️
...ebula_hw_interfaces/test/common/test_udp/utils.hpp    84.00%    0 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #231      +/-   ##
==========================================
+ Coverage   26.10%   27.21%   +1.11%     
==========================================
  Files         100      104       +4     
  Lines        9218     9458     +240     
  Branches     2215     2322     +107     
==========================================
+ Hits         2406     2574     +168     
- Misses       6423     6449      +26     
- Partials      389      435      +46     
Flag Coverage Δ
differential 27.21% <71.36%> (?)
total ?


mojomex and others added 8 commits November 22, 2024 11:01

mojomex commented Nov 22, 2024

Unit tests have been written, but things like querying net.core.rmem_max or simulating packet loss are not easily possible in Docker in CI, so test coverage is currently at 75%.
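
For context, on Linux the net.core.rmem_max sysctl is exposed as a file under /proc/sys. The sketch below shows one way to read it; the helper name `read_sysctl_value` is hypothetical, and this is not necessarily how the PR queries the value. The path is a parameter so the parsing can be exercised against an ordinary file where /proc/sys is not available.

```cpp
#include <fstream>
#include <optional>
#include <string>

// Reads a single integer sysctl value from its /proc/sys file.
// Returns std::nullopt if the file is missing, unreadable, or not numeric.
inline std::optional<long> read_sysctl_value(
  const std::string & path = "/proc/sys/net/core/rmem_max")
{
  std::ifstream file(path);
  long value = 0;
  if (!(file >> value)) {
    return std::nullopt;
  }
  return value;
}
```

A caller could compare the returned value against the receive buffer size it intends to request and report a setup error if the limit is lower.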

@mojomex mojomex self-assigned this Nov 22, 2024
@mojomex mojomex requested a review from drwnz November 22, 2024 05:20
mojomex added a commit that referenced this pull request Nov 26, 2024
mojomex added a commit that referenced this pull request Nov 26, 2024
@drwnz drwnz left a comment

I left some comments! Many are questions and there are no must-do changes but please have a look and I will approve once confirmed.

* @brief Gracefully stops the active receiver thread (if any) but keeps the socket alive. The
* same socket can later be subscribed again.
*/
UdpSocket & unsubscribe()
Collaborator

If the socket is kept alive, should we consider shutting it down to prevent reading from it?
shutdown(sock_fd_, SHUT_RD);
and associated error handling.

@mojomex mojomex Dec 23, 2024

Given that

  1. the only reads made are in the now-stopped receive_thread_
  2. the user might want to re-subscribe with the same socket later (kind of an esoteric argument maybe...)
  3. I'm not sure how to re-open a connection once shut down

I think it would best be left open. As far as I understand, since the socket is not actually closed by shutdown, there should not be any performance/resource difference between doing/not doing it here.


~UdpSocket()
{
unsubscribe();
Collaborator

Would it be better to leave any subscribed multicast groups on destruction? (or in unsubscribe)

Collaborator Author

It seems that Linux does this automatically: StackOverflow.
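
If an explicit leave were ever wanted, it could look roughly like the sketch below, which uses the standard IP_ADD_MEMBERSHIP and IP_DROP_MEMBERSHIP socket options for IPv4. The helper names (`make_membership`, `join_group`, `leave_group`) are hypothetical and this is not code from the PR.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

#include <optional>
#include <string>

// Builds the membership request from a multicast group IP and the local interface IP.
// Returns std::nullopt if either string is not a valid IPv4 address.
inline std::optional<ip_mreq> make_membership(
  const std::string & group_ip, const std::string & host_ip)
{
  ip_mreq mreq{};
  if (inet_pton(AF_INET, group_ip.c_str(), &mreq.imr_multiaddr) != 1) return std::nullopt;
  if (inet_pton(AF_INET, host_ip.c_str(), &mreq.imr_interface) != 1) return std::nullopt;
  return mreq;
}

// Joins the multicast group on the given socket.
inline bool join_group(int fd, const ip_mreq & mreq)
{
  return setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq)) == 0;
}

// Explicitly leaves the group; the same request struct used for joining is passed again.
inline bool leave_group(int fd, const ip_mreq & mreq)
{
  return setsockopt(fd, IPPROTO_IP, IP_DROP_MEMBERSHIP, &mreq, sizeof(mreq)) == 0;
}
```

Keeping the `ip_mreq` around after joining is the only extra state an explicit leave would need.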

mojomex and others added 13 commits December 13, 2024 14:34

knzo25 commented Dec 19, 2024

nit: (i.e., ignore as needed)
There are several switches between private and public, which were a little hard to keep track of while reviewing. I understand that for header-only libraries this is needed due to the order of declarations, but are all of them necessary? (4 access modifiers + the default at first)


knzo25 commented Dec 20, 2024

Thanks for the PR. It had been a while since I got to see direct socket programming 😃
Other than the cosmetic changes I asked about, I wonder if in future PRs (the fewer changes in this PR, the happier products will be, I think) the throw statements could be handled for robustness and some debug messages added, for example when a connection finishes or when unexpected IPs are sending data.
🙏

@mojomex mojomex requested review from drwnz and knzo25 December 23, 2024 05:38

mojomex commented Dec 23, 2024

@drwnz @knzo25 Thank you for your feedback, I think I addressed everything now.

@knzo25

  • I reduced the number of visibility modifiers by reordering the UdpSocket class a bit in 3d68921.
  • As for the throw statements / error handling:
    • Exceptions are only thrown in the builder (i.e. during setup), and only if there is no easy way to enforce correct usage at compile time.
    • Errors during operation (the phase between subscribe and unsubscribe) are collected in RxMetadata structs and handed to the user along with the received packet. I agree that not many errors are handled currently, but your concern about packets from different IPs should have been addressed in 237c2b0. For the multicast case, we could add a counter of packets that were removed by the sender endpoint filter.
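
To make the last point concrete, here is a rough sketch of how per-packet metadata and a filtered-packet counter could fit together. Everything in it is hypothetical: the fields of `RxMetadataSketch` are guesses at the kind of information described above, not the actual RxMetadata definition, and `SenderFilterSketch` is not the filter implemented in 237c2b0.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-packet metadata handed to the user together with the payload.
struct RxMetadataSketch
{
  std::optional<uint64_t> timestamp_ns;   // kernel/hardware receive time, if available
  uint64_t drops_since_last_receive{0};   // packets the kernel dropped since the last one
  bool truncated{false};                  // payload was larger than the receive buffer
};

using PacketCallback =
  std::function<void(const std::vector<uint8_t> &, const RxMetadataSketch &)>;

// Forwards packets from one expected sender and counts everything else.
class SenderFilterSketch
{
public:
  SenderFilterSketch(std::string expected_sender_ip, PacketCallback callback)
  : expected_sender_ip_(std::move(expected_sender_ip)), callback_(std::move(callback))
  {
  }

  // Returns true if the packet was forwarded, false if it was filtered out.
  bool on_packet(
    const std::string & sender_ip, const std::vector<uint8_t> & payload,
    const RxMetadataSketch & metadata)
  {
    if (sender_ip != expected_sender_ip_) {
      ++filtered_count_;
      return false;
    }
    callback_(payload, metadata);
    return true;
  }

  uint64_t filtered_count() const { return filtered_count_; }

private:
  std::string expected_sender_ip_;
  PacketCallback callback_;
  uint64_t filtered_count_{0};
};
```

Reporting problems through the metadata instead of exceptions keeps the receive loop running, and the counter gives the user a way to notice unexpected senders without a log line per packet.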


@knzo25 knzo25 left a comment

LGTM !

@mojomex mojomex merged commit dc69e52 into main Dec 23, 2024
13 checks passed
@mojomex mojomex deleted the feat/better-udp-socket branch December 23, 2024 06:51