-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
wasm: removed automatical route refreshment and add a foreign function to clear the route cache #36671
wasm: removed automatical route refreshment and add a foreign function to clear the route cache #36671
Changes from 5 commits
de74d9f
b8c210c
dad1f7b
8080459
5c80bc7
27830ab
13e26ea
1f78fad
1da41e1
8f5eafa
ead4115
87a3a55
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -374,6 +374,19 @@ class Context : public proxy_wasm::ContextBase, | |
Http::HeaderMap* getMap(WasmHeaderMapType type); | ||
const Http::HeaderMap* getConstMap(WasmHeaderMapType type); | ||
|
||
void onHeadersModified(WasmHeaderMapType type) { | ||
if (type == WasmHeaderMapType::RequestHeaders) { | ||
if (!no_automatic_route_refresh_) { | ||
wbpcode marked this conversation as resolved.
Show resolved
Hide resolved
|
||
clearRouteCache(); | ||
} else { | ||
ENVOY_LOG_FIRST_N(critical, 20, | ||
wbpcode marked this conversation as resolved.
Show resolved
Hide resolved
|
||
"Route will no be refreshed automatically when request headers are " | ||
"modified by the wasm plugin. Refresh the route manually by calling the " | ||
"'clear_route_cache' foreign function. "); | ||
} | ||
} | ||
} | ||
|
||
const LocalInfo::LocalInfo* root_local_info_{nullptr}; // set only for root_context. | ||
PluginHandleSharedPtr plugin_handle_{nullptr}; | ||
|
||
|
@@ -450,6 +463,9 @@ class Context : public proxy_wasm::ContextBase, | |
// Filter state prototype declaration. | ||
absl::flat_hash_map<std::string, Filters::Common::Expr::CelStatePrototypeConstPtr> | ||
state_prototypes_; | ||
|
||
const bool no_automatic_route_refresh_{ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Changing the default behavior will break existing Wasm users (which is "beta" in Istio). I think we need to change the xDS API instead of the runtime key. Runtime key assumes the default behavior will eventually change but we can't be certain in this case. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I do don't want break exist plugins, but I cannot find better chance to bring this back to the expected way. At least for now, the wasm still not production ready. I also doubt if it is good choice to add new API. I think the API should be part of the wasm plugin configuration, not for the host configuration. There are also some possible compromised way:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Per offline discussion, a long-living runtime key is probably better to avoid a churn in the spec. Not refreshing routes with a custom host call is a safer option, until we get better guidance from Proxy-Wasm There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Don't change the default, and instead add xDS configuration option to disable automatic refresh. I don't understand why you insisting on changing the default behavior and breaking existing plugins in the process.
That's a bit of a stretch. People have been using Proxy-Wasm in Envoy in production for the last 4+ years.
It absolutely should be a new xDS API:
To be honest, this is mostly Envoy internal change. However, the upcoming ABI v0.3 will add features (proxy-wasm/spec#55) and ability to list custom functions (proxy-wasm/spec#71), both of which could be used to let plugin know about changed behavior and addition of the custom hostcall. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I am not very sure. May Proxy-Wasm self was production ready for a while. But in the Envoy, basically this is common understanding in our inner discussion (and it is still alpha). I think our recent target is to make it production ready in the Envoy. We just fixed a very very very basic bug #36411. So I persaonlly think may it's never be widely/heavily used until now.
Because implicitly behavior will bring lots complexity in production where multiple different plugins may be used. It require the developers know all the details or it may result in unexpected result. So I think the correct way should be: 1) no implicitly refresh, 2) provide explict host call to do it, 3) provide additional control by the xDS if needed (mostly for security, although I think now it not a big problem, but this may be necessary in future once the wasm is widely used). And considering current situation of wasm support of Envoy, I think this is the last timepoint to change the behavior and bring it to the expected way. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Of course, I also don't want to break exist plugin. So, I think we could provide different behavior for different versions' plugins. For example, only disable the automotical refreh for plugins with ABI version >= 0.3.0. Then all exist plugins won't be effected. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
AFAIK, it's "alpha" only because of the missing docs. There are many people using it in production, but if you know better, then that's fine.
I've just looked at that bug and I'm a bit puzzled by it. There were already tests for buffered and unbuffered responses, so it seems that Envoy behaves differently depending on HTTP version / codec? I'm not sure if that's desirable behavior. Also, did you check when this issue first appeared? Is it possible that it was a regression due to change from nghttp2 to oghttp2? Maybe that would explain why existing users didn't run into it yet?
Yes, that would be more reasonable. However, I still think this should be configurable in xDS. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Envoy never promise that the last chunk will be empty. Even for the HTTP/1.1, the last chunk may be modified by filters and contains data. So, it actually is a bug. But may be the lib transferation changes the possibility of the bug occurs? Not sure. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I think we should make sure the default behavior match most cases and additional xds should only be used to provide additional control. And the ABI version also provide a way let's us to change the default behavior and without break exist plugins. Anyway, it's hard to let everyone happy. Different people has different views. But seems this one could be an acceptable approach for our both. |
||
Runtime::runtimeFeatureEnabled("envoy.reloadable_features.wasm_no_automatic_route_refresh")}; | ||
}; | ||
using ContextSharedPtr = std::shared_ptr<Context>; | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -126,6 +126,14 @@ RegisterForeignFunction registerSetEnvoyFilterStateForeignFunction( | |
return WasmResult::BadArgument; | ||
}); | ||
|
||
RegisterForeignFunction registerClearRouteCacheForeignFunction( | ||
"clear_route_cache", | ||
[](WasmBase&, std::string_view, const std::function<void*(size_t size)>&) -> WasmResult { | ||
auto context = static_cast<Context*>(proxy_wasm::current_context_); | ||
context->clearRouteCache(); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: Is this safe to call at any point in execution? This is untrusted code calling out, we might need to harden it if it's not already done. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I remember I did check in the clearRouteCache self. Will re-check it again. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The clearRouteCache will be forbided after the response headers is processed. But I agree we should harden it continuously to ensure the interface that's provided to extension is robust enough. For example, only allow the head filter to clear the route cache. I will do this with another PR. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think from the separate discussion in proxy-wasm-spec, what we need is a control knob on the filter xDS with several options:
We can split this into several booleans as well (e.g. 3 - could be a separate knob). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Now, this PR removed the automatic route refresh and provide a manual way. This should be good enough as a default way to handle the route cache. New options (3 and 4) could be treated as independent requirement of Envoy-self and be implemented in the future. I think a basic rule is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @kyessenov yeah, that sounds good, although I'm not sure I understand the difference between 1 and 4. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
seems I didn't got it correctly. 😞 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @PiotrSikora The difference is that 1) clears route only on header operations but 4) should clear routes on any dynamic metadata or other non-HTTP attribute update |
||
return WasmResult::Ok; | ||
}); | ||
|
||
#if defined(WASM_USE_CEL_PARSER) | ||
class ExpressionFactory : public Logger::Loggable<Logger::Id::wasm> { | ||
protected: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, ext_proc has configuration named
disable_clear_route_cache
, so it might be a good idea to align with that.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It basically should be replaced by
route_cache_action
. And the default behavior is only clear the route cache when the external plugin require it.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wbpcode Can you introduce this to the xDS spec then instead of the runtime flag. Personally, I think it's OK to change the default to be "do nothing", and allow control planes to set it to "ON_HEADER_OPERATION" to preserve the existing semantics. Istio and other consumers can then control the behavior via xDS which is easier for upgrades.
Would that answer your concerns @PiotrSikora ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kyessenov Fine. But let me do it step by step. I will remove runtime flag and only keep the abi version check in this PR. Considering abi 0.3.0 is not ready now, so, this change won't break anything.
And at next one or two PR I will add the new API and let the api shepherds to review the API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kyessenov yes, that's much better, thank you!
Having said that, if we assume that control planes handle the upgrade logic, then I still don't see the benefit of breaking the current behavior and using
DO_NOTHING
instead ofON_HEADER_OPERATION
as the default.