add rps_threshold & max_rejection_probability #16742

Merged
merged 8 commits into from
Jun 11, 2021
Changes from 5 commits
@@ -19,7 +19,7 @@ option (udpa.annotations.file_status).package_version_status = ACTIVE;
// [#protodoc-title: Admission Control]
// [#extension: envoy.filters.http.admission_control]

// [#next-free-field: 6]
// [#next-free-field: 8]
message AdmissionControl {
// Default method of specifying what constitutes a successful request. All status codes that
// indicate a successful request must be explicitly specified if not relying on the default
@@ -91,4 +91,13 @@ message AdmissionControl {
// below this threshold, rejection probability will increase. Any success rate above the threshold
// results in a rejection probability of 0. Defaults to 95%.
config.core.v3.RuntimePercent sr_threshold = 5;

// If the average RPS of the sampling window is below this threshold, the request
// will not be rejected, even if the success rate is lower than sr_threshold.
// Defaults to 0.
config.core.v3.RuntimeUInt32 rps_threshold = 6;

// The probability of rejection will never exceed this value, even if the failure rate is rising.
// Defaults to 80%.
config.core.v3.RuntimePercent max_rejection_probability = 7;
}
@@ -41,6 +41,11 @@ where,
rejection probability will be higher for higher success rates. See `Aggression`_ for a more
detailed explanation.

Note that there are additional parameters that affect the rejection probability:

- *rps_threshold* is a configurable value: when the average RPS is below it, requests pass through the filter without being rejected.
- *max_rejection_probability* is the upper limit of the rejection probability.

.. note::
The success rate calculations are performed on a per-thread basis for increased performance. In
addition, the per-thread isolation decreases the blast radius of a single bad connection
@@ -91,6 +96,12 @@ fields can be overridden via runtime settings.
aggression:
default_value: 1.5
runtime_key: "admission_control.aggression"
rps_threshold:
default_value: 5
runtime_key: "admission_control.rps_threshold"
max_rejection_probability:
default_value: 80.0
runtime_key: "admission_control.max_rejection_probability"
success_criteria:
http_criteria:
http_success_status:
@@ -110,6 +121,8 @@ The above configuration can be understood as follows:
window.
* HTTP requests are considered successful if they are 1xx, 2xx, 3xx, or a 404.
* gRPC requests are considered successful if they are OK or CANCELLED.
* Requests will never be rejected by this filter if the average RPS is lower than 5.
* Rejection probability will never exceed 80% even if the failure rate is 100%.
Comment on lines +124 to +125
Member:

If this is true and these values are clamped, can you make this more clear in the API docs? If it's not true can you make it more clear here?

Member:

This isn't true all the time-- it's just explaining how to read the configuration above. Github cut out this line just above the additions:

The above configuration can be understood as follows:

Member:

Ah OK sorry, I missed that, thanks.
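
As a rough illustration of the last two bullets in the configuration walkthrough, the sketch below applies the example values (an rps_threshold of 5 and a max_rejection_probability of 80%) to a rejection probability that has already been computed. The function name `effectiveRejectionProbability` and its parameters are hypothetical and not part of the filter; only the strict less-than comparison and the `std::min` cap are taken from the diff.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper, not Envoy API: combines the RPS gate and the cap
// using the values from the example configuration.
double effectiveRejectionProbability(uint32_t average_rps, double uncapped_probability) {
  constexpr uint32_t kRpsThreshold = 5;              // rps_threshold default_value in the example
  constexpr double kMaxRejectionProbability = 0.80;  // max_rejection_probability of 80.0%

  // Below the threshold every request passes through the filter.
  if (average_rps < kRpsThreshold) {
    return 0.0;
  }
  // Otherwise the probability is capped, even at a 100% failure rate.
  return std::min(uncapped_probability, kMaxRejectionProbability);
}
```

With these values an average of 4 RPS yields a probability of 0, while exactly 5 RPS is not exempt because the comparison is strict.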


Statistics
----------

1 change: 0 additions & 1 deletion source/common/runtime/runtime_protos.h
@@ -11,7 +11,6 @@
namespace Envoy {
namespace Runtime {

// TODO(WeavingGao): use for #16392
// Helper class for runtime-derived uint32.
class UInt32 : Logger::Loggable<Logger::Id::runtime> {
public:
@@ -31,6 +31,8 @@ using GrpcStatus = Grpc::Status::GrpcStatus;

static constexpr double defaultAggression = 1.0;
static constexpr double defaultSuccessRateThreshold = 95.0;
static constexpr uint32_t defaultRpsThreshold = 0;
static constexpr double defaultMaxRejectionProbability = 80.0;

AdmissionControlFilterConfig::AdmissionControlFilterConfig(
const AdmissionControlProto& proto_config, Runtime::Loader& runtime,
@@ -45,6 +47,13 @@ AdmissionControlFilterConfig::AdmissionControlFilterConfig(
sr_threshold_(proto_config.has_sr_threshold() ? std::make_unique<Runtime::Percentage>(
proto_config.sr_threshold(), runtime)
: nullptr),
rps_threshold_(proto_config.has_rps_threshold()
? std::make_unique<Runtime::UInt32>(proto_config.rps_threshold(), runtime)
: nullptr),
max_rejection_probability_(proto_config.has_max_rejection_probability()
? std::make_unique<Runtime::Percentage>(
proto_config.max_rejection_probability(), runtime)
: nullptr),
response_evaluator_(std::move(response_evaluator)) {}

double AdmissionControlFilterConfig::aggression() const {
@@ -56,6 +65,16 @@ double AdmissionControlFilterConfig::successRateThreshold() const {
return std::min<double>(pct, 100.0) / 100.0;
}

uint32_t AdmissionControlFilterConfig::rpsThreshold() const {
return rps_threshold_ ? rps_threshold_->value() : defaultRpsThreshold;
}

double AdmissionControlFilterConfig::maxRejectionProbability() const {
const double ret = max_rejection_probability_ ? max_rejection_probability_->value()
: defaultMaxRejectionProbability;
return ret / 100.0;
}

AdmissionControlFilter::AdmissionControlFilter(AdmissionControlFilterConfigSharedPtr config,
const std::string& stats_prefix)
: config_(std::move(config)), stats_(generateStats(config_->scope(), stats_prefix)),
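
The two new accessors in this hunk fall back to a compiled-in default when the field is unset, and the percentage is converted to a fraction before use. A minimal sketch of that pattern follows, using `std::optional` in place of the `Runtime::UInt32` and `Runtime::Percentage` wrappers. The `Settings` struct is an assumption for illustration only, and runtime overrides are not modeled.

```cpp
#include <cstdint>
#include <optional>

// Defaults copied from the diff.
constexpr uint32_t kDefaultRpsThreshold = 0;
constexpr double kDefaultMaxRejectionProbability = 80.0;  // percent

// Hypothetical stand-in for the proto-backed config; an empty optional means "field not set".
struct Settings {
  std::optional<uint32_t> rps_threshold;
  std::optional<double> max_rejection_probability_pct;
};

uint32_t rpsThreshold(const Settings& s) {
  return s.rps_threshold.value_or(kDefaultRpsThreshold);
}

// Returns a fraction rather than a percentage, e.g. 80.0 becomes 0.8.
double maxRejectionProbability(const Settings& s) {
  return s.max_rejection_probability_pct.value_or(kDefaultMaxRejectionProbability) / 100.0;
}
```

With the default threshold of 0, an unsigned comparison of the form `averageRps() < rpsThreshold()` can never be true, so the RPS gate has no effect unless the field is set.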
@@ -68,6 +87,11 @@ Http::FilterHeadersStatus AdmissionControlFilter::decodeHeaders(Http::RequestHea
return Http::FilterHeadersStatus::Continue;
}

if (config_->getController().averageRps() < config_->rpsThreshold()) {
ENVOY_LOG(debug, "Current rps: {} is below rps_threshold: {}, continue",
config_->getController().averageRps(), config_->rpsThreshold());
return Http::FilterHeadersStatus::Continue;
Member:

Consider throwing a debug log in here.

}

if (shouldRejectRequest()) {
// We do not want to sample requests that we are rejecting, since this taints the measurements
// that should be describing the upstreams. In addition, if we were to record the requests
@@ -148,6 +172,7 @@ bool AdmissionControlFilter::shouldRejectRequest() const {
if (aggression != 1.0) {
probability = std::pow(probability, 1.0 / aggression);
}
probability = std::min<double>(probability, config_->maxRejectionProbability());

// Choosing an accuracy of 4 significant figures for the probability.
static constexpr uint64_t accuracy = 1e4;
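
To make the ordering in this hunk concrete, here is a small sketch of the aggression adjustment followed by the new cap. The free function `adjustProbability` is hypothetical and exists only for illustration. It takes the raw probability as an input, assumed to already lie in [0, 1], because the lines that compute it are outside the hunk shown here.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical free function mirroring the tail of shouldRejectRequest() in the diff.
// raw_probability is assumed to already lie in [0, 1].
double adjustProbability(double raw_probability, double aggression, double max_probability) {
  double probability = raw_probability;
  // An aggression other than 1.0 reshapes the curve before any capping happens.
  if (aggression != 1.0) {
    probability = std::pow(probability, 1.0 / aggression);
  }
  // The cap added by this change is applied last.
  return std::min(probability, max_probability);
}
```

Because the cap comes after the exponent, a raw value of 0.81 with an aggression of 2.0 first becomes roughly 0.9 and is then limited to a maximum of 0.8.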
@@ -66,6 +66,8 @@ class AdmissionControlFilterConfig {
Stats::Scope& scope() const { return scope_; }
double aggression() const;
double successRateThreshold() const;
uint32_t rpsThreshold() const;
double maxRejectionProbability() const;
ResponseEvaluator& responseEvaluator() const { return *response_evaluator_; }

private:
@@ -75,6 +77,8 @@ class AdmissionControlFilterConfig {
Runtime::FeatureFlag admission_control_feature_;
std::unique_ptr<Runtime::Double> aggression_;
std::unique_ptr<Runtime::Percentage> sr_threshold_;
std::unique_ptr<Runtime::UInt32> rps_threshold_;
std::unique_ptr<Runtime::Percentage> max_rejection_probability_;
std::shared_ptr<ResponseEvaluator> response_evaluator_;
};

@@ -17,6 +17,16 @@ ThreadLocalControllerImpl::ThreadLocalControllerImpl(TimeSource& time_source,
std::chrono::seconds sampling_window)
: time_source_(time_source), sampling_window_(sampling_window) {}

uint32_t ThreadLocalControllerImpl::averageRps() const {
if (historical_data_.empty() || global_data_.requests == 0) {
return 0;
}
using namespace std::chrono;
auto count_of_seconds = duration_cast<seconds>(ageOfOldestSample()).count();

return count_of_seconds == 0 ? 0 : global_data_.requests / count_of_seconds;
Member:

I think you can just do something like below. There's just less going on that way IMO.

Suggested change
using namespace std::chrono;
auto count_of_seconds = duration_cast<seconds>(ageOfOldestSample()).count();
return count_of_seconds == 0 ? 0 : global_data_.requests / count_of_seconds;
using std::chrono::seconds;
seconds secs = std::max(seconds(1), ageOfOldestSample());
return global_data_.requests / secs.count();

}

void ThreadLocalControllerImpl::maybeUpdateHistoricalData() {
// Purge stale samples.
while (!historical_data_.empty() && ageOfOldestSample() >= sampling_window_) {
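
The two ways of computing the average that appear in this hunk and its review suggestion differ when the oldest sample is less than one second old. The sketch below isolates that arithmetic in two free functions. Their names and the plain `requests` and `age` parameters are assumptions that stand in for `global_data_.requests` and `ageOfOldestSample()`, and the `historical_data_.empty()` check is not modeled.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>

// Variant from the diff: reports 0 until the oldest sample is at least one second old.
uint32_t averageRpsOriginal(uint32_t requests, std::chrono::seconds age) {
  if (requests == 0) {
    return 0;
  }
  const auto secs = age.count();
  return secs == 0 ? 0 : static_cast<uint32_t>(requests / secs);
}

// Variant from the review suggestion: treats any window shorter than one second as one second.
uint32_t averageRpsSuggested(uint32_t requests, std::chrono::seconds age) {
  using std::chrono::seconds;
  const seconds secs = std::max(seconds(1), age);
  return static_cast<uint32_t>(requests / secs.count());
}
```

For 10 requests and a sample age under one second, the first variant returns 0 and the second returns 10, which matters when the result is compared against rps_threshold.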
@@ -37,6 +37,9 @@ class ThreadLocalController {

// Returns the current number of requests and how many of them are successful.
virtual RequestData requestCounts() PURE;

// Returns the average RPS across the sampling window.
virtual uint32_t averageRps() const PURE;
};

/**
@@ -63,6 +66,8 @@ class ThreadLocalControllerImpl : public ThreadLocalController,
return global_data_;
}

uint32_t averageRps() const override;

private:
void recordRequest(bool success);
