quic: add QUIC downstream connection close error stats. #16584

Merged
merged 18 commits into from
Jun 3, 2021
21 changes: 19 additions & 2 deletions docs/root/configuration/http/http_conn_man/stats.rst
@@ -87,7 +87,12 @@ the following statistics:
Per listener statistics
-----------------------

Additional per listener statistics are rooted at *listener.<address>.http.<stat_prefix>.* with the
Per listener statistics are rooted at *listener.<address>.*

Http per listener statistics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Additional HTTP statistics are of the form http.<stat_prefix>.* with the
following statistics:

.. csv-table::
@@ -101,12 +106,24 @@ following statistics:
downstream_rq_4xx, Counter, Total 4xx responses
downstream_rq_5xx, Counter, Total 5xx responses

Http3 per listener statistics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Http3 statistics with the form of http3.<stat_prefix>.*:

.. csv-table::
:header: Name, Type, Description
:widths: 1, 1, 2

[direction].[source].quic_connection_close_error_code_[error_code], Counter, A collection of counters that are lazily initialized to record each quic connection close error code that's present. direction could be *upstream* or *downstream*. source could be *self* or *peer*.
Contributor

If we're doccing up listener stats, won't direction always be downstream?

And actually I hate to catch this now, but probably rather than self/peer it should be tx/rx for consistency with other Envoy stats. I find self and peer more intuitive (maybe 'cause that's what I'm used to) but I think most of the other stats (like resets tx_reset/rx_reset) are documented that way with tx for things sent to the peer and rx being codes received from the peer.

Contributor Author

You are right. The listener stats should all be downstream.

I haven't figured out the details of the upstream stats yet, so I will update the doc again when I add them.

Done changing names to tx/rx.
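
To make the naming scheme under discussion concrete, here is a minimal sketch of how the dot-separated counter name could be assembled. `makeCloseCounterName` and its parameters are hypothetical and for illustration only, not Envoy APIs; the code in this PR composes the name from `Stats::StatName` tokens rather than strings. The source token is passed in as a parameter, so the sketch works with either the `self`/`peer` or the `tx`/`rx` spelling.

```cpp
#include <string>

// Hypothetical helper, for illustration only: joins the tokens of a QUIC
// connection-close counter name with dots, relative to the listener scope.
std::string makeCloseCounterName(const std::string& prefix, const std::string& direction,
                                 const std::string& source, const std::string& error_code) {
  return prefix + "." + direction + "." + source + ".quic_connection_close_error_code_" +
         error_code;
}
```

Called as `makeCloseCounterName("http3", "downstream", "self", "QUIC_NO_ERROR")`, this yields the same string that the unit test in this PR looks up.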



.. _config_http_conn_man_stats_per_codec:

Per codec statistics
-----------------------

Each codec has the option of adding per-codec statistics. Both http1 and http2 have codec stats.
Each codec has the option of adding per-codec statistics. http1, http2, and http3 all have codec stats.

Http1 codec statistics
~~~~~~~~~~~~~~~~~~~~~~
12 changes: 9 additions & 3 deletions source/common/quic/quic_stat_names.cc
@@ -3,12 +3,18 @@
namespace Envoy {
namespace Quic {

// TODO(renjietang): Currently these stats are only available in downstream. Wire it up to upstream
// QUIC also.
Comment on lines +6 to +7
Member

From a quick read-through, it's not clear to me that this won't create an unbounded number of stats controlled by the downstream (for example, if the name somehow has the code number in it and the client controls the codes). This is probably not the case, but can you add more comments about why this is safe for use with uncontrolled peers?

Contributor Author

Great point!
I initially had an assertion in QuicStatNames::connectionCloseStatName() to make sure we don't create an unbounded number of stats.
But after some investigation, I found that if the connection close is initiated by the peer, the error_code is parsed from the wire and no enum range checking is done. So an assertion here is too strong, and malicious clients might be able to attack Envoy by sending out-of-range error codes.

Instead, I have now added a check in QuicStatNames::chargeQuicConnectionCloseStats() that ignores out-of-range error codes and logs a warning.

I will follow up by digging deeper into the quiche code to see whether bad error codes can be handled earlier.
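
The safeguard described in this reply can be sketched roughly as follows. This is a simplified stand-in, not the code in this PR: `kMaxErrorCode`, `CloseStats`, and `charge` are hypothetical names, a plain `std::map` stands in for Envoy's stats scope, and an `ignored` tally stands in for the warning log. The point is only that a code parsed from the wire is range-checked before it can create a counter, so a peer cannot grow the stat set without bound.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical upper bound. In real code this would be derived from the QUIC
// error code enum rather than a hard-coded constant.
constexpr uint64_t kMaxErrorCode = 256;

class CloseStats {
public:
  // Returns true if a counter was charged, false if the code was ignored.
  bool charge(uint64_t error_code) {
    if (error_code >= kMaxErrorCode) {
      // Out-of-range codes come straight from the wire. Drop them instead of
      // asserting, so a malicious peer can neither crash the process nor
      // create arbitrary counters. (The PR logs a warning at this point.)
      ++ignored_;
      return false;
    }
    // Counters are created lazily, on first use of each valid code.
    ++counters_["quic_connection_close_error_code_" + std::to_string(error_code)];
    return true;
  }

  std::size_t counterCount() const { return counters_.size(); }
  uint64_t ignored() const { return ignored_; }

private:
  std::map<std::string, uint64_t> counters_;
  uint64_t ignored_{0};
};
```

With this shape, the number of distinct counters is capped at `kMaxErrorCode` no matter what values a client sends.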

QuicStatNames::QuicStatNames(Stats::SymbolTable& symbol_table)
: stat_name_pool_(symbol_table), symbol_table_(symbol_table),
downstream_(stat_name_pool_.add("downstream")), upstream_(stat_name_pool_.add("upstream")),
from_self_(stat_name_pool_.add("self")), from_peer_(stat_name_pool_.add("peer")) {
http3_prefix_(stat_name_pool_.add("http3")), downstream_(stat_name_pool_.add("downstream")),
upstream_(stat_name_pool_.add("upstream")), from_self_(stat_name_pool_.add("self")),
from_peer_(stat_name_pool_.add("peer")) {
// Preallocate most used counters
// Most popular in client initiated connection close.
connectionCloseStatName(quic::QUIC_NETWORK_IDLE_TIMEOUT);
// Most popular in server initiated connection close.
connectionCloseStatName(quic::QUIC_SILENT_IDLE_TIMEOUT);
}

void QuicStatNames::incCounter(Stats::Scope& scope, const Stats::StatNameVec& names) {
@@ -23,7 +29,7 @@ void QuicStatNames::chargeQuicConnectionCloseStats(Stats::Scope& scope,
ASSERT(&symbol_table_ == &scope.symbolTable());

const Stats::StatName connection_close = connectionCloseStatName(error_code);
incCounter(scope, {is_upstream ? upstream_ : downstream_,
incCounter(scope, {http3_prefix_, is_upstream ? upstream_ : downstream_,
source == quic::ConnectionCloseSource::FROM_SELF ? from_self_ : from_peer_,
connection_close});
}
3 changes: 3 additions & 0 deletions source/common/quic/quic_stat_names.h
@@ -20,12 +20,15 @@ class QuicStatNames {
quic::ConnectionCloseSource source, bool is_upstream);

private:
// Find the actual counter in |scope| and increment it.
// An example counter name:
// "http3.downstream.self.quic_connection_close_error_code_QUIC_NO_ERROR".
void incCounter(Stats::Scope& scope, const Stats::StatNameVec& names);

Stats::StatName connectionCloseStatName(quic::QuicErrorCode error_code);

Stats::StatNamePool stat_name_pool_;
Stats::SymbolTable& symbol_table_;
const Stats::StatName http3_prefix_;
const Stats::StatName downstream_;
const Stats::StatName upstream_;
const Stats::StatName from_self_;
5 changes: 3 additions & 2 deletions test/common/quic/quic_stat_names_test.cc
@@ -21,8 +21,9 @@ class QuicStatNamesTest : public testing::Test {
TEST_F(QuicStatNamesTest, QuicConnectionCloseStats) {
quic_stat_names_.chargeQuicConnectionCloseStats(scope_, quic::QUIC_NO_ERROR,
quic::ConnectionCloseSource::FROM_SELF, false);
EXPECT_EQ(
1U, scope_.counter("downstream.self.quic_connection_close_error_code_QUIC_NO_ERROR").value());
EXPECT_EQ(1U,
scope_.counter("http3.downstream.self.quic_connection_close_error_code_QUIC_NO_ERROR")
.value());
}

} // namespace Quic
12 changes: 12 additions & 0 deletions test/integration/quic_http_integration_test.cc
@@ -313,6 +313,18 @@ TEST_P(QuicHttpIntegrationTest, ZeroRtt) {
->EarlyDataAccepted());
// Close the second connection.
codec_client_->close();
if (GetParam().first == Network::Address::IpVersion::v4) {
EXPECT_EQ(
2u, test_server_
->counter("listener.127.0.0.1_0.http3.downstream.peer.quic_connection_close_error_"
"code_QUIC_NO_ERROR")
->value());
} else {
EXPECT_EQ(2u, test_server_
->counter("listener.[__1]_0.http3.downstream.peer.quic_connection_close_"
"error_code_QUIC_NO_ERROR")
->value());
}
}

// Ensure multiple quic connections work, regardless of platform BPF support
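
The two expected counter names in the integration test differ only in the listener address token. Judging from those strings alone, colons in the address appear to be replaced with underscores, so `127.0.0.1:0` becomes `127.0.0.1_0` and `[::1]:0` becomes `[__1]_0`. A minimal sketch of that mapping, with `sanitizeAddressForStats` as a hypothetical helper rather than an Envoy function:

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper, inferred from the expected stat names in the test:
// replaces every ':' in a listener address with '_' so that the address can
// be used as a single token in a stat name.
std::string sanitizeAddressForStats(std::string address) {
  std::replace(address.begin(), address.end(), ':', '_');
  return address;
}
```

This is why the test branches on the IP version: the same counter lives under a different listener prefix depending on the bound address.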