WIP: xDS mux unification #15473
Signed-off-by: Dmitri Dolguikh <ddolguik@redhat.com>
I know you asked to break up the work into smaller chunks, but the only bits I could break out are the factory and interface changes, which are pretty trivial. The bulk of the work is the unified mux implementation plus tests (new and updated). All unified mux code is now in
I'll try documenting the changes next week. In the meantime, ping @howardjohn.
Signed-off-by: Dmitri Dolguikh <ddolguik@redhat.com>
So it looks like the main changes are:
- If we send CDS v1, then CDS v2 while v1 is still warming, we don't get an ACK for both, just for v2 once it's done warming.
- If we send CDS while CDS is paused, we will get 2 duplicate ACKs.
Is that right?
Signed-off-by: Dmitri Dolguikh <ddolguik@redhat.com>
@dmitri-d @adisuissa is going to take a first pass.
Signed-off-by: Dmitri Dolguikh <ddolguik@redhat.com>
Ping @adisuissa, any chance you'll have time to take a look at this in the near future?
Ping @adisuissa: any chance you could spend some time on this? I'm concerned that changes to the current muxes need to be ported here, which in turn makes the review harder...
Thanks for working on this!
I’ve done a few iterations over the code, trying to understand what can be done to make sure we are not missing some functionality, and not breaking current functionality.
The runtime feature is a great step to make sure we don’t break anything w/o the ability to go back to things that work.
It's still challenging to see what was added and why, because of the amount of code that was changed, and I'm trying to compare the original to the unified and to the other variant (delta to SotW).
Regarding how to make this PR more digestible: consider splitting it into a series of small PRs. If we can have incremental changes, each with additional functionality, it would be easier to understand what was changed and why. An alternative is to add things in the following order: SubscriptionState, GrpcMux, and then GrpcSubscription (these can be split into SotW and delta sub-PRs). If you’ve got other ideas on how to break this down even further, that would be really helpful.
Tests: I suggest making the delta and the unified-delta use the same tests, and the sotw and unified-sotw use the same tests (as much as possible). The idea is to make sure any change in one component doesn’t negatively impact the other, and that all tests are written in a single place.
A couple of general things:
- The use of void* in some places where it should be Delta/Discovery{Request/Response}. It might be possible to avoid them by using templates in the base class and specializing it in the derived class.
- Some of the code is added to allow the unified variant to work well, and will be removed when the legacy code is removed. This should be documented (add TODOs).
* Passes through to all multiplexed SubscriptionStates. To be called when something
* definitive happens with the initial fetch: either an update is successfully received,
* or some sort of error happened.*/
virtual void disableInitFetchTimeoutTimer() PURE;
It seems that this and the above methods are not implemented in the non-unified versions. If that's the case, would it be possible to create an interface between GrpcMux and the unified versions that adds these methods?
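A minimal sketch of what such an intermediate interface could look like. The class names `UnifiedGrpcMux` and `FakeUnifiedMux`, and the reduced `GrpcMux` with a single `start()` method, are hypothetical stand-ins for illustration and are not taken from the PR:

```cpp
#include <cassert>

// Hypothetical stand-in for the existing interface shared by every mux.
class GrpcMux {
public:
  virtual ~GrpcMux() = default;
  virtual void start() = 0;
};

// Hypothetical intermediate interface. Only the unified muxes derive from it,
// so the legacy muxes never have to provide stub implementations.
class UnifiedGrpcMux : public GrpcMux {
public:
  virtual void disableInitFetchTimeoutTimer() = 0;
};

// Toy implementation used only to show that the layering compiles and works.
class FakeUnifiedMux : public UnifiedGrpcMux {
public:
  void start() override { started_ = true; }
  void disableInitFetchTimeoutTimer() override { timer_enabled_ = false; }

  bool started_{false};
  bool timer_enabled_{true};
};

// Code that needs the extra methods takes the narrower interface explicitly.
inline void finishInitialFetch(UnifiedGrpcMux& mux) {
  mux.start();
  mux.disableInitFetchTimeoutTimer();
  assert(true && "both calls are available through UnifiedGrpcMux");
}
```

With this layering, callers that only hold a `GrpcMux&` cannot accidentally call the unified-only methods.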
// understanding of the current protocol state, and new resources that Envoy wants to request.
// Returns a new'd pointer, meant to be owned by the caller, who is expected to know what type the
// pointer actually is.
virtual void* getNextRequestAckless() PURE;
I wonder if the use of void* can be avoided by using templates in the base class, and specialising it in the derived class.
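To illustrate the idea, here is a rough sketch of a base class templated on the request type. The request structs and the class names below are simplified, hypothetical stand-ins (the real types are protobuf messages), so treat this as a shape rather than the actual implementation:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for the protobuf request messages.
struct DiscoveryRequest {
  std::string type_url;
};
struct DeltaDiscoveryRequest {
  std::string type_url;
};

// The request type is a template parameter, so the return type is known
// statically and ownership is expressed with unique_ptr instead of void*.
template <class RequestT> class SubscriptionStateBase {
public:
  explicit SubscriptionStateBase(std::string type_url) : type_url_(std::move(type_url)) {}
  virtual ~SubscriptionStateBase() = default;

  virtual std::unique_ptr<RequestT> getNextRequestAckless() = 0;

  const std::string& typeUrl() const { return type_url_; }

private:
  std::string type_url_;
};

class SotwSubscriptionState : public SubscriptionStateBase<DiscoveryRequest> {
public:
  using SubscriptionStateBase::SubscriptionStateBase;

  std::unique_ptr<DiscoveryRequest> getNextRequestAckless() override {
    auto request = std::make_unique<DiscoveryRequest>();
    request->type_url = typeUrl();
    return request;
  }
};

class DeltaSubscriptionState : public SubscriptionStateBase<DeltaDiscoveryRequest> {
public:
  using SubscriptionStateBase::SubscriptionStateBase;

  std::unique_ptr<DeltaDiscoveryRequest> getNextRequestAckless() override {
    auto request = std::make_unique<DeltaDiscoveryRequest>();
    request->type_url = typeUrl();
    return request;
  }
};
```

The caller no longer needs to "know what type the pointer actually is"; a mismatch becomes a compile error.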
void GrpcMuxSotw::establishGrpcStream() { grpc_stream_.establishNewStream(); }

void GrpcMuxSotw::sendGrpcMessage(void* msg_proto_ptr, SubscriptionState& sub_state) {
Other than the cast, this function and its GrpcMuxDelta counterpart look similar. Can this be refactored so the same logic will be in the base class?
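One possible shape for that refactoring, as a sketch only. The request structs, the `has_node` field, and the "attach node info on the first request of a stream" rule are assumptions made up for this example; the point is just that the shared body lives once in a templated base:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the protobuf request messages.
struct DiscoveryRequest {
  std::string type_url;
  bool has_node{false};
};
struct DeltaDiscoveryRequest {
  std::string type_url;
  bool has_node{false};
};

// The send logic is written once; only the request type varies.
template <class RequestT> class GrpcMuxBase {
public:
  virtual ~GrpcMuxBase() = default;

  void sendGrpcMessage(RequestT& request) {
    if (first_request_on_stream_) {
      // Assumed behavior for illustration: node info goes on the first request only.
      request.has_node = true;
      first_request_on_stream_ = false;
    }
    sent_.push_back(request); // Stands in for writing to the gRPC stream.
  }

  const std::vector<RequestT>& sent() const { return sent_; }

private:
  bool first_request_on_stream_{true};
  std::vector<RequestT> sent_;
};

// No cast and no duplicated body in either derived class.
class GrpcMuxSotw : public GrpcMuxBase<DiscoveryRequest> {};
class GrpcMuxDelta : public GrpcMuxBase<DeltaDiscoveryRequest> {};
```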
return subscriptions_;
}

// legacy mux interface not implemented by unified mux.
I suggest adding comments to the places that will eventually be removed once the non-unified code is removed.
envoy::config::core::v3::ApiVersion::V3),
callbacks_, std::chrono::milliseconds(0U), dispatcher_) {
state_.updateSubscriptionInterest({"name1", "name2", "name3"}, {});
auto cur_request = getNextDiscoveryRequestAckless();
nit: const here and in similar places.
TEST_F(DeltaSubscriptionStateTest, SubscribeAndUnsubscribe) {
{
state_.updateSubscriptionInterest({"name4"}, {"name1"});
auto cur_request = getNextDeltaDiscoveryRequestAckless();
nit: const here and in similar places.
watch_map =
watch_maps_.emplace(type_url, std::make_unique<WatchMap>(use_namespace_matching)).first;
subscriptions_.emplace(type_url, subscription_state_factory_->makeSubscriptionState(
type_url, *watch_maps_[type_url], init_fetch_timeout));
nit: instead of calling *watch_maps_[type_url], you should be able to save the result of adding to watch_maps_ above and take the second element.
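For reference, a small sketch of that pattern. `emplace()` on the standard map containers returns an `{iterator, inserted}` pair, so the iterator can be kept and dereferenced instead of looking the key up again. The `WatchMap` struct and the helper `getOrCreateWatchMap` are simplified, hypothetical stand-ins:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical, simplified stand-in for the real WatchMap.
struct WatchMap {
  explicit WatchMap(bool ns) : use_namespace_matching(ns) {}
  bool use_namespace_matching;
};

using WatchMaps = std::unordered_map<std::string, std::unique_ptr<WatchMap>>;

// Returns the WatchMap for type_url, creating it on first use.
inline WatchMap& getOrCreateWatchMap(WatchMaps& watch_maps, const std::string& type_url,
                                     bool use_namespace_matching) {
  auto it = watch_maps.find(type_url);
  if (it == watch_maps.end()) {
    // Keep the iterator returned by emplace() rather than calling
    // watch_maps[type_url] afterwards, which would hash and search again.
    it = watch_maps.emplace(type_url, std::make_unique<WatchMap>(use_namespace_matching)).first;
  }
  return *it->second; // it->second is the unique_ptr<WatchMap>.
}
```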
return std::make_unique<Cleanup>([this, type_urls]() {
for (const auto& type_url : type_urls) {
pausable_ack_queue_.resume(type_url);
trySendDiscoveryRequests();
IIUC, previously this issued a request just for the resumed type_url, but now it will send discovery requests for all type_urls that are available.
I think querying & sending to a specific type_url makes sense, but the alternative should probably be to call trySendDiscoveryRequests() after the for loop.
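A sketch of the second alternative: resume every type_url first, then make a single send attempt. `Cleanup` and `FakeMux` here are toy stand-ins written for this example (the real pause bookkeeping is more involved), and the counter exists only to make the behavior observable:

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy RAII helper: runs the callback when destroyed.
class Cleanup {
public:
  explicit Cleanup(std::function<void()> f) : f_(std::move(f)) {}
  ~Cleanup() { f_(); }
  Cleanup(const Cleanup&) = delete;
  Cleanup& operator=(const Cleanup&) = delete;

private:
  std::function<void()> f_;
};

// Hypothetical mux with just enough state to show the control flow.
class FakeMux {
public:
  std::unique_ptr<Cleanup> pause(const std::vector<std::string>& type_urls) {
    for (const auto& type_url : type_urls) {
      paused_.insert(type_url);
    }
    return std::make_unique<Cleanup>([this, type_urls]() {
      for (const auto& type_url : type_urls) {
        paused_.erase(type_url); // Stands in for pausable_ack_queue_.resume(type_url).
      }
      // One attempt after everything is resumed, instead of one per type_url.
      trySendDiscoveryRequests();
    });
  }

  void trySendDiscoveryRequests() { ++send_attempts_; }

  std::set<std::string> paused_;
  int send_attempts_{0};
};
```

Resuming three type_urls this way triggers one send attempt rather than three.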
type_url);
return;
}
pausable_ack_queue_.push(sub->second->handleResponse(response_proto_ptr));
The returned UpdateAck should probably be moved (std::move) to the queue.
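A small sketch of where the move matters, under the assumption that handleResponse() returns UpdateAck by value and that push() takes its argument by value. In that case the call expression is already a temporary and is not copied; an explicit std::move is needed only once the ack has been stored in a named variable, and inside push() itself. The `UpdateAck` and `AckQueue` types below are instrumented toys, not the real classes:

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <utility>

// Toy UpdateAck that counts copies and moves so the difference is visible.
struct UpdateAck {
  explicit UpdateAck(std::string n) : nonce(std::move(n)) {}
  UpdateAck(const UpdateAck& other) : nonce(other.nonce) { ++copies; }
  UpdateAck(UpdateAck&& other) noexcept : nonce(std::move(other.nonce)) { ++moves; }

  std::string nonce;
  static int copies;
  static int moves;
};
int UpdateAck::copies = 0;
int UpdateAck::moves = 0;

// Assumed signature: returns the ack by value.
inline UpdateAck handleResponse(const std::string& nonce) { return UpdateAck(nonce); }

// Hypothetical queue: takes by value, then moves into storage.
class AckQueue {
public:
  void push(UpdateAck ack) { storage_.push(std::move(ack)); }
  std::size_t size() const { return storage_.size(); }

private:
  std::queue<UpdateAck> storage_;
};
```

Passing `handleResponse(...)` directly to `push()` makes no copy; passing a named `UpdateAck` without `std::move` makes exactly one.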
std::unique_ptr<SubscriptionStateFactory> subscription_state_factory_;

// Map key is type_url.
// Only addWatch() should insert into these maps.
Would it make more sense to have a single map between the type_url to both the WatchMapPtr and SubscriptionStatePtr (so they will always be in sync)?
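As a sketch of that idea, a single map whose value bundles both objects, so one insertion creates them together and they cannot drift apart. `SubscriptionEntry`, `SubscriptionRegistry`, and the trimmed-down `WatchMap` and `SubscriptionState` are hypothetical names used only for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical, simplified stand-ins for the real classes.
struct WatchMap {};
struct SubscriptionState {
  explicit SubscriptionState(WatchMap& wm) : watch_map(wm) {}
  WatchMap& watch_map;
};

// Both objects for a type_url live in one entry. watch_map is declared first,
// so it is destroyed after state, which holds a reference to it.
struct SubscriptionEntry {
  std::unique_ptr<WatchMap> watch_map;
  std::unique_ptr<SubscriptionState> state;
};

class SubscriptionRegistry {
public:
  // Map key is type_url. This is the only place that inserts.
  SubscriptionEntry& getOrCreate(const std::string& type_url) {
    auto it = entries_.find(type_url);
    if (it == entries_.end()) {
      SubscriptionEntry entry;
      entry.watch_map = std::make_unique<WatchMap>();
      entry.state = std::make_unique<SubscriptionState>(*entry.watch_map);
      it = entries_.emplace(type_url, std::move(entry)).first;
    }
    return it->second;
  }

  std::size_t size() const { return entries_.size(); }

private:
  std::unordered_map<std::string, SubscriptionEntry> entries_;
};
```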
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
This pull request has been automatically closed because it has not had activity in the last 37 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
Sorry, I've been OOO, will take a look in the next couple of days.
Split out subscription state out of xDS mux unification PR (#15473). Made base subscription state class a template. Updated existing delta state tests to work with both legacy and new implementations.
Risk Level: low, the code is not being used atm
Testing: updated existing tests, added new ones
Signed-off-by: Dmitri Dolguikh <ddolguik@redhat.com>
Fixes #11477, reintroduces changes in #8974, and supersedes work in #14496.
envoy.reloadable_features.unified_mux runtime flag set to true.
unified prepended to their names (test/common/config/unified_grpc_mux_impl_test.cc is one of those)
TODOs:
Signed-off-by: Dmitri Dolguikh <ddolguik@redhat.com>