fix(client): exit with non-zero on error #1786

Merged (3 commits) on Apr 8, 2024

mxinden commented Apr 2, 2024

When a connection closes with an error, surface the error to the user and exit with a non-zero exit code.
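
A minimal sketch of the behavior described above, not the code from this PR: the final connection state is mapped to a `Result`, the error is printed to stderr, and the process exit code is derived from it. `CloseError`, `Closed`, `outcome`, and `exit_code` are hypothetical names used for illustration; they are not neqo APIs.

```rust
use std::fmt;
use std::process::ExitCode;

/// Hypothetical error a connection can close with (not a neqo type).
#[derive(Debug, Clone, PartialEq, Eq)]
enum CloseError {
    IdleTimeout,
    Application(u64),
}

impl fmt::Display for CloseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::IdleTimeout => write!(f, "idle timeout"),
            Self::Application(code) => write!(f, "application error {code}"),
        }
    }
}

/// Hypothetical terminal connection state.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Closed {
    Clean,
    WithError(CloseError),
}

/// Map the terminal state to a result so the error can propagate to `main`.
fn outcome(state: Closed) -> Result<(), CloseError> {
    match state {
        Closed::Clean => Ok(()),
        Closed::WithError(e) => Err(e),
    }
}

/// 0 on success, 1 on any error.
fn exit_code(result: &Result<(), CloseError>) -> u8 {
    u8::from(result.is_err())
}

fn main() -> ExitCode {
    // Simulated final state, selected by the first command-line argument.
    let state = match std::env::args().nth(1).as_deref() {
        Some("idle-timeout") => Closed::WithError(CloseError::IdleTimeout),
        Some("app-error") => Closed::WithError(CloseError::Application(1)),
        _ => Closed::Clean,
    };
    let result = outcome(state);
    if let Err(e) = &result {
        // Surface the error to the user before exiting.
        eprintln!("Error: {e}");
    }
    ExitCode::from(exit_code(&result))
}
```

Run without arguments, the sketch exits with 0; run with `idle-timeout` or `app-error`, it prints the error and exits with 1.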

codecov bot commented Apr 2, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.14%. Comparing base (992d588) to head (190511d).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1786      +/-   ##
==========================================
+ Coverage   93.12%   93.14%   +0.02%     
==========================================
  Files         116      116              
  Lines       36097    36097              
==========================================
+ Hits        33614    33624      +10     
+ Misses       2483     2473      -10     

github-actions bot commented Apr 2, 2024

Benchmark results

Performance differences relative to 992d588.

  • coalesce_acked_from_zero 1+1 entries
    time: [195.18 ns 195.63 ns 196.12 ns]
    change: [-0.4124% -0.0484% +0.3129%] (p = 0.80 > 0.05)
    No change in performance detected.

  • coalesce_acked_from_zero 3+1 entries
    time: [236.41 ns 237.02 ns 237.66 ns]
    change: [-0.4125% -0.0550% +0.2890%] (p = 0.76 > 0.05)
    No change in performance detected.

  • coalesce_acked_from_zero 10+1 entries
    time: [234.52 ns 235.14 ns 235.91 ns]
    change: [-0.4245% +0.3992% +1.5824%] (p = 0.50 > 0.05)
    No change in performance detected.

  • coalesce_acked_from_zero 1000+1 entries
    time: [217.72 ns 217.91 ns 218.14 ns]
    change: [-1.0574% -0.3995% +0.2674%] (p = 0.25 > 0.05)
    No change in performance detected.

  • RxStreamOrderer::inbound_frame()
    time: [119.44 ms 119.51 ms 119.58 ms]
    change: [-0.0841% -0.0161% +0.0577%] (p = 0.67 > 0.05)
    No change in performance detected.

  • transfer/Run multiple transfers with varying seeds
    time: [121.24 ms 121.50 ms 121.77 ms]
    thrpt: [32.848 MiB/s 32.921 MiB/s 32.992 MiB/s]
    change:
    time: [-2.8515% -2.5592% -2.2777%] (p = 0.00 < 0.05)
    thrpt: [+2.3308% +2.6264% +2.9351%]
    Change within noise threshold.

  • transfer/Run multiple transfers with the same seed
    time: [121.82 ms 121.99 ms 122.15 ms]
    thrpt: [32.746 MiB/s 32.791 MiB/s 32.835 MiB/s]
    change:
    time: [-2.6810% -2.4902% -2.3046%] (p = 0.00 < 0.05)
    thrpt: [+2.3590% +2.5538% +2.7549%]
    Change within noise threshold.

  • 1-conn/1-100mb-resp (aka. Download)/client
    time: [1.0771 s 1.0843 s 1.0926 s]
    thrpt: [91.528 MiB/s 92.226 MiB/s 92.839 MiB/s]
    change:
    time: [-1.1085% +1.0484% +2.8386%] (p = 0.34 > 0.05)
    thrpt: [-2.7602% -1.0375% +1.1210%]
    No change in performance detected.

  • 1-conn/10_000-parallel-1b-resp (aka. RPS)/client
    time: [388.19 ms 391.07 ms 393.94 ms]
    thrpt: [25.384 Kelem/s 25.571 Kelem/s 25.761 Kelem/s]
    change:
    time: [-0.9880% +0.1474% +1.2847%] (p = 0.80 > 0.05)
    thrpt: [-1.2684% -0.1472% +0.9979%]
    No change in performance detected.

  • 1-conn/1-1b-resp (aka. HPS)/client
    time: [42.925 ms 43.057 ms 43.207 ms]
    thrpt: [23.144 elem/s 23.225 elem/s 23.296 elem/s]
    change:
    time: [+0.6523% +1.1594% +1.6530%] (p = 0.00 < 0.05)
    thrpt: [-1.6262% -1.1461% -0.6481%]
    Change within noise threshold.

Client/server transfer results

Transfer of 134217728 bytes over loopback.

| Client | Server | CC | Pacing | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|:---|:---|:---|---:|---:|---:|---:|
| msquic | msquic | | | 725.0 ± 241.5 | 426.3 | 1169.9 | 1.00 |
| neqo | msquic | reno | on | 2203.5 ± 286.6 | 1905.5 | 2531.6 | 1.00 |
| neqo | msquic | reno | | 2098.6 ± 266.6 | 1902.5 | 2499.7 | 1.00 |
| neqo | msquic | cubic | on | 2166.3 ± 322.3 | 1881.8 | 2679.9 | 1.00 |
| neqo | msquic | cubic | | 2129.8 ± 279.8 | 1923.0 | 2634.3 | 1.00 |
| msquic | neqo | reno | on | 3388.3 ± 235.2 | 3235.1 | 3807.6 | 1.00 |
| msquic | neqo | reno | | 3314.5 ± 172.4 | 3093.4 | 3707.2 | 1.00 |
| msquic | neqo | cubic | on | 3332.6 ± 29.8 | 3264.6 | 3358.9 | 1.00 |
| msquic | neqo | cubic | | 3296.8 ± 106.4 | 3107.7 | 3517.8 | 1.00 |
| neqo | neqo | reno | on | 3006.9 ± 167.4 | 2852.4 | 3452.1 | 1.00 |
| neqo | neqo | reno | | 2999.2 ± 63.9 | 2946.9 | 3164.2 | 1.00 |
| neqo | neqo | cubic | on | 3188.8 ± 167.8 | 3097.5 | 3663.7 | 1.00 |
| neqo | neqo | cubic | | 3208.2 ± 221.1 | 3099.7 | 3821.1 | 1.00 |

mxinden marked this pull request as ready for review April 8, 2024 11:32

github-actions bot commented Apr 8, 2024

larseggert added this pull request to the merge queue Apr 8, 2024
Merged via the queue into mozilla:main with commit 32a2a59 Apr 8, 2024
15 checks passed

WesleyRosenblum commented Apr 16, 2024

I think this may have broken some neqo-client QUIC interop tests.

I'm looking at the resumption (R) tests: https://interop.seemann.io/

Any server that does not send a connection close before neqo-client's idle timer expires causes the test to fail:

client  | 0s967ms INFO Saving https://server4/fbpsuqvmkw to "/downloads/fbpsuqvmkw"
client  | 1s  4ms WARN Unhandled event StateChange(Confirmed)
client  | 33s178ms INFO [Client 8ae08d14ec800655] idle timeout expired
client  | Error: TransportError(IdleTimeout)
client exited with code 1
Aborting on container exit...

I'm guessing this PR isn't the root cause, but it is causing an underlying issue (the client not exiting cleanly after the file has been transferred) to surface.
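
The failure mode in the log above can be modeled with a toy event loop. This is only a sketch under stated assumptions, not neqo-client code: `Event`, `Outcome`, `run`, and the `close_when_done` flag are hypothetical. It shows why a client that keeps the connection open after its last download ends up in the idle timeout path, which now maps to exit code 1.

```rust
/// Hypothetical terminal outcomes of a client run.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    ClosedCleanly,
    IdleTimeout,
}

/// Hypothetical events the client loop reacts to.
#[derive(Debug, Clone, Copy)]
enum Event {
    DownloadComplete,
    PeerClosed,
    IdleTimerExpired,
}

/// Drive a toy client over a sequence of events.
/// `close_when_done` models a client that closes the connection itself
/// as soon as no download is pending.
fn run(pending: usize, close_when_done: bool, events: &[Event]) -> Option<Outcome> {
    let mut pending = pending;
    for event in events {
        match event {
            Event::DownloadComplete => {
                pending = pending.saturating_sub(1);
                if pending == 0 && close_when_done {
                    return Some(Outcome::ClosedCleanly);
                }
            }
            Event::PeerClosed => return Some(Outcome::ClosedCleanly),
            Event::IdleTimerExpired => return Some(Outcome::IdleTimeout),
        }
    }
    // The event sequence ended while the connection was still open.
    None
}

/// Exit code as seen by the interop runner in this model.
fn exit_code(outcome: Outcome) -> i32 {
    match outcome {
        Outcome::ClosedCleanly => 0,
        Outcome::IdleTimeout => 1,
    }
}

fn main() {
    // The server never sends a connection close; only the idle timer fires.
    let silent_server = [Event::DownloadComplete, Event::IdleTimerExpired];
    // The server closes the connection after the transfer.
    let closing_server = [Event::DownloadComplete, Event::PeerClosed];

    println!("waits, silent server:  {:?}", run(1, false, &silent_server));
    println!("waits, closing server: {:?}", run(1, false, &closing_server));
    println!("closes when done:      {:?}", run(1, true, &silent_server));
}
```

In this model the test outcome depends on the server only when the client keeps waiting after its last download.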

mxinden commented Apr 17, 2024

Thank you for tracking this down @WesleyRosenblum. I will take a look.

larseggert commented

@mxinden shouldn't the QNS CI step have caught that? (Did I break it?)

mxinden commented Apr 17, 2024

Still need to take a look.

> shouldn't the QNS CI step have caught that?

We are not running the resumption test on CI yet :( I am lagging behind with #1785.

test: handshake,ecn,keyupdate

mxinden added a commit to mxinden/neqo that referenced this pull request Apr 17, 2024
Test failure reported in mozilla#1786 (comment).

Signed-off-by: Max Inden <mail@max-inden.de>
github-merge-queue bot pushed a commit that referenced this pull request Apr 23, 2024
* feat(qns): add resumption testcase

Test failure reported in #1786 (comment).

Signed-off-by: Max Inden <mail@max-inden.de>

* fix: fallback for servers not sending NEW_TOKEN frame

See https://github.com/mozilla/neqo/blob/main/neqo-transport/src/connection/mod.rs#L665-L676 for details.

Fixes regression introduced in #1676.

* Trigger CI

* Still wait if there is none

* Revert "Still wait if there is none"

This reverts commit 710c500.

* Refactor resumption logic

---------

Signed-off-by: Max Inden <mail@max-inden.de>
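
One way a "fallback for servers not sending NEW_TOKEN frame" could look is sketched below. This is an illustration under assumptions, not the code from the commit: `ResumptionToken`, `TokenState`, `take`, and the `fallback_at` deadline are hypothetical names. The idea shown is that the client prefers a token that includes the server's NEW_TOKEN data, but after a grace period it accepts a token built from the TLS ticket alone instead of waiting until the idle timeout.

```rust
use std::time::{Duration, Instant};

/// Hypothetical resumption token. `new_token` is only set if the server
/// sent a NEW_TOKEN frame.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ResumptionToken {
    tls_ticket: Vec<u8>,
    new_token: Option<Vec<u8>>,
}

/// Hypothetical view of what the client has received so far.
struct TokenState {
    tls_ticket: Option<Vec<u8>>,
    new_token: Option<Vec<u8>>,
    /// Assumed grace period end: after this instant, stop waiting for NEW_TOKEN.
    fallback_at: Instant,
}

impl TokenState {
    /// Return a token if one is usable at `now`.
    fn take(&self, now: Instant) -> Option<ResumptionToken> {
        // Without a TLS ticket there is nothing to resume with.
        let tls_ticket = self.tls_ticket.clone()?;
        match &self.new_token {
            // Preferred: the server sent NEW_TOKEN.
            Some(token) => Some(ResumptionToken {
                tls_ticket,
                new_token: Some(token.clone()),
            }),
            // Fallback: grace period over, resume without NEW_TOKEN data.
            None if now >= self.fallback_at => Some(ResumptionToken {
                tls_ticket,
                new_token: None,
            }),
            // Still inside the grace period: keep waiting.
            None => None,
        }
    }
}

fn main() {
    let start = Instant::now();
    let state = TokenState {
        tls_ticket: Some(vec![1, 2, 3]),
        new_token: None,
        fallback_at: start + Duration::from_millis(100),
    };
    println!("before deadline: {:?}", state.take(start));
    println!(
        "after deadline:  {:?}",
        state.take(start + Duration::from_millis(200))
    );
}
```

With a fallback of this kind, the client has a token shortly after the handshake even when the server never sends NEW_TOKEN, so it can proceed with the resumed connection and exit without hitting the idle timeout.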
mxinden added a commit to mxinden/neqo that referenced this pull request May 4, 2024
There are two server implementations based on neqo:

1. https://github.com/mozilla/neqo/tree/main/neqo-bin/src/server
  - http3 and http09 implementation
  - used for manual testing and QUIC Interop

2. https://searchfox.org/mozilla-central/source/netwerk/test/http3server/src/main.rs
  - used to test Firefox

I assume one was once an exact copy of the other. Both implement their own I/O,
event loop, ... Since then, the two implementations diverged significantly.
Especially (1) saw a lot of improvements in recent months:

- mozilla#1564
- mozilla#1569
- mozilla#1578
- mozilla#1581
- mozilla#1604
- mozilla#1612
- mozilla#1676
- mozilla#1692
- mozilla#1707
- mozilla#1708
- mozilla#1727
- mozilla#1753
- mozilla#1756
- mozilla#1766
- mozilla#1772
- mozilla#1786
- mozilla#1787
- mozilla#1788
- mozilla#1794
- mozilla#1806
- mozilla#1808
- mozilla#1848
- mozilla#1866

At this point, bugs in (2) are hard to fix, see e.g.
mozilla#1801.

This commit merges (2) into (1), thus removing all duplicate logic and
having (2) benefit from all the recent improvements to (1).
KershawChang pushed a commit to KershawChang/neqo that referenced this pull request May 7, 2024
github-merge-queue bot pushed a commit that referenced this pull request May 8, 2024
* refactor(bin): introduce server/http3.rs and server/http09.rs

The QUIC Interop Runner requires an http3 and http09 implementation for both
client and server. The client code is already structured into an http3 and an
http09 implementation since #1727.

This commit does the same for the server side, i.e. splits the http3 and http09
implementation into separate Rust modules.

* refactor: merge mozilla-central http3 server into neqo-bin

There are two server implementations based on neqo:

1. https://github.com/mozilla/neqo/tree/main/neqo-bin/src/server
  - http3 and http09 implementation
  - used for manual testing and QUIC Interop

2. https://searchfox.org/mozilla-central/source/netwerk/test/http3server/src/main.rs
  - used to test Firefox

I assume one was once an exact copy of the other. Both implement their own I/O,
event loop, ... Since then, the two implementations diverged significantly.
Especially (1) saw a lot of improvements in recent months:

- #1564
- #1569
- #1578
- #1581
- #1604
- #1612
- #1676
- #1692
- #1707
- #1708
- #1727
- #1753
- #1756
- #1766
- #1772
- #1786
- #1787
- #1788
- #1794
- #1806
- #1808
- #1848
- #1866

At this point, bugs in (2) are hard to fix, see e.g.
#1801.

This commit merges (2) into (1), thus removing all duplicate logic and
having (2) benefit from all the recent improvements to (1).

* Move firefox.rs to mozilla-central

* Reduce HttpServer trait functions

* Extract constructor

* Remove unused deps

* Remove clap color feature

Nice to have. Adds multiple dependencies. Hard to justify for mozilla-central.