Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(udp): pre-allocate receive buffer #1707

Merged
merged 1 commit into from
Mar 3, 2024
Merged

Conversation

mxinden
Copy link
Collaborator

@mxinden mxinden commented Mar 2, 2024

Instead of allocating a new receive buffer on each call to neqo_common::udp::Socket::recv, preallocate a long-lived buffer in Socket::bind and reuse this buffer on each read.

See flamegraphs below. With the patch neqo_common::udp::Socket::recv simply delegates to tokio::net::udp::UdpSocket::try_io instead of spending significant time in memset.

Quick-fix while working on #1693.

Addresses first performance issue in #1690.

Before:

flamegraph

After:

patch

Instead of allocating a new receive buffer on each call to
`neqo_common::udp::Socket::recv`, preallocate a long-lived buffer in
`Socket::bind` and reuse this buffer on each read.
Copy link

codecov bot commented Mar 2, 2024

Codecov Report

Attention: Patch coverage is 60.00000% with 4 lines in your changes are missing coverage. Please review.

Project coverage is 92.97%. Comparing base (ad027cf) to head (6d3491e).

Files Patch % Lines
neqo-common/src/udp.rs 60.00% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1707      +/-   ##
==========================================
- Coverage   92.98%   92.97%   -0.01%     
==========================================
  Files         120      120              
  Lines       37340    37342       +2     
==========================================
- Hits        34719    34718       -1     
- Misses       2621     2624       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Comment on lines 111 to 117
Ok(Some(Datagram::new(
meta.addr,
*local_address,
meta.ecn.map(|n| IpTos::from(n as u8)).unwrap_or_default(),
None, // TODO: get the real TTL https://github.com/quinn-rs/quinn/issues/1749
&buf[..meta.len],
&self.recv_buf[..meta.len],
)))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that each call to Datagram::new allocates the datagram's data in a new Vec.

impl Datagram {
pub fn new<V: Into<Vec<u8>>>(
src: SocketAddr,
dst: SocketAddr,
tos: IpTos,
ttl: Option<u8>,
d: V,
) -> Self {
Self {
src,
dst,
tos,
ttl,
d: d.into(),
}
}

Changing this is a larger undertaking I would like to tackle with #1693.

@larseggert larseggert added this pull request to the merge queue Mar 3, 2024
Merged via the queue into mozilla:main with commit a2d525b Mar 3, 2024
11 of 13 checks passed
@@ -31,6 +32,7 @@ impl Socket {
Ok(Self {
state: quinn_udp::UdpSocketState::new((&socket).into())?,
socket: tokio::net::UdpSocket::from_std(socket)?,
recv_buf: vec![0; u16::MAX as usize],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is too large. I get that we don't want to overflow, but unless we are receiving multiple datagrams, we only need 2k or so (a constant might be a good idea for that limit).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this too large? This is about ~43 packets, which is smaller than realistic WAN congestion windows. Are you worried about Fx memory usage of this is per-socket?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignore the comment. I was reading this patch on its own and missed the follow-up that includes multiple packets.

I do think that a constant is the right thing to use for this, rather than the (rather arbitrary) u16::MAX.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think that a constant is the right thing to use for this, rather than the (rather arbitrary) u16::MAX.

👍 I prepared #1709.

I was reading this patch on its own and missed the follow-up that includes multiple packets.

For others to follow, this is #1708.

mxinden added a commit to mxinden/neqo that referenced this pull request May 4, 2024
There are two server implementations based on neqo:

1. https://github.com/mozilla/neqo/tree/main/neqo-bin/src/server
  - http3 and http09 implementation
  - used for manual testing and QUIC Interop

2. https://searchfox.org/mozilla-central/source/netwerk/test/http3server/src/main.rs
  - used to test Firefox

I assume one was once an exact copy of the other. Both implement their own I/O,
event loop, ... Since then, the two implementations diverged significantly.
Especially (1) saw a lot of improvements in recent months:

- mozilla#1564
- mozilla#1569
- mozilla#1578
- mozilla#1581
- mozilla#1604
- mozilla#1612
- mozilla#1676
- mozilla#1692
- mozilla#1707
- mozilla#1708
- mozilla#1727
- mozilla#1753
- mozilla#1756
- mozilla#1766
- mozilla#1772
- mozilla#1786
- mozilla#1787
- mozilla#1788
- mozilla#1794
- mozilla#1806
- mozilla#1808
- mozilla#1848
- mozilla#1866

At this point, bugs in (2) are hard to fix, see e.g.
mozilla#1801.

This commit merges (2) into (1), thus removing all duplicate logic and
having (2) benefit from all the recent improvements to (1).
KershawChang pushed a commit to KershawChang/neqo that referenced this pull request May 7, 2024
There are two server implementations based on neqo:

1. https://github.com/mozilla/neqo/tree/main/neqo-bin/src/server
  - http3 and http09 implementation
  - used for manual testing and QUIC Interop

2. https://searchfox.org/mozilla-central/source/netwerk/test/http3server/src/main.rs
  - used to test Firefox

I assume one was once an exact copy of the other. Both implement their own I/O,
event loop, ... Since then, the two implementations diverged significantly.
Especially (1) saw a lot of improvements in recent months:

- mozilla#1564
- mozilla#1569
- mozilla#1578
- mozilla#1581
- mozilla#1604
- mozilla#1612
- mozilla#1676
- mozilla#1692
- mozilla#1707
- mozilla#1708
- mozilla#1727
- mozilla#1753
- mozilla#1756
- mozilla#1766
- mozilla#1772
- mozilla#1786
- mozilla#1787
- mozilla#1788
- mozilla#1794
- mozilla#1806
- mozilla#1808
- mozilla#1848
- mozilla#1866

At this point, bugs in (2) are hard to fix, see e.g.
mozilla#1801.

This commit merges (2) into (1), thus removing all duplicate logic and
having (2) benefit from all the recent improvements to (1).
github-merge-queue bot pushed a commit that referenced this pull request May 8, 2024
* refactor(bin): introduce server/http3.rs and server/http09.rs

The QUIC Interop Runner requires an http3 and http09 implementation for both
client and server. The client code is already structured into an http3 and an
http09 implementation since #1727.

This commit does the same for the server side, i.e. splits the http3 and http09
implementation into separate Rust modules.

* refactor: merge mozilla-central http3 server into neqo-bin

There are two server implementations based on neqo:

1. https://github.com/mozilla/neqo/tree/main/neqo-bin/src/server
  - http3 and http09 implementation
  - used for manual testing and QUIC Interop

2. https://searchfox.org/mozilla-central/source/netwerk/test/http3server/src/main.rs
  - used to test Firefox

I assume one was once an exact copy of the other. Both implement their own I/O,
event loop, ... Since then, the two implementations diverged significantly.
Especially (1) saw a lot of improvements in recent months:

- #1564
- #1569
- #1578
- #1581
- #1604
- #1612
- #1676
- #1692
- #1707
- #1708
- #1727
- #1753
- #1756
- #1766
- #1772
- #1786
- #1787
- #1788
- #1794
- #1806
- #1808
- #1848
- #1866

At this point, bugs in (2) are hard to fix, see e.g.
#1801.

This commit merges (2) into (1), thus removing all duplicate logic and
having (2) benefit from all the recent improvements to (1).

* Move firefox.rs to mozilla-central

* Reduce HttpServer trait functions

* Extract constructor

* Remove unused deps

* Remove clap color feature

Nice to have. Adds multiple dependencies. Hard to justify for mozilla-central.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants