Order of responses is not guaranteed on ADS stream #218

Closed
gnz00 opened this issue Aug 22, 2019 · 9 comments

gnz00 commented Aug 22, 2019

In the current implementation, it is impossible to guarantee the order of responses on an ADS stream. I believe this is due to the unordered nature of select in server.go, which picks among ready cases at random: https://github.com/envoyproxy/go-control-plane/blob/master/pkg/server/server.go#L198-L248.
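
To illustrate the select behaviour referred to above, here is a minimal standalone sketch. It is not code from server.go; the channel names and the countSelectChoices helper are made up for the example. When two channels are ready at the same time, Go chooses between the cases pseudo-randomly rather than in source order.

```go
package main

import "fmt"

// countSelectChoices (hypothetical helper) makes two buffered channels ready
// at the same time and records which case select picks, repeated n times.
func countSelectChoices(n int) (first, second int) {
	a := make(chan string, 1)
	b := make(chan string, 1)
	for i := 0; i < n; i++ {
		a <- "response for type A"
		b <- "response for type B"
		// Both cases are ready here. The Go spec says one of them is chosen
		// by uniform pseudo-random selection, not by source order.
		select {
		case <-a:
			first++
			<-b // drain the other channel before the next round
		case <-b:
			second++
			<-a
		}
	}
	return first, second
}

func main() {
	first, second := countSelectChoices(1000)
	fmt.Printf("a picked first: %d times, b picked first: %d times\n", first, second)
}
```

Both counters end up well above zero, so a server loop that selects over one channel per resource type cannot rely on the order in which the cases are written.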

Is there a known solution to this? Currently, I'm introducing an artificial delay to ensure the select sends the responses in the proper order. I would like to have the server buffer a batch of responses and transactionally send them, waiting for an ACK between each response. This should be doable using the Cache and Callbacks implementations, but there might be a better way to do this through a callback mechanism in the server implementation.
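
A rough sketch of the "send a batch in order, waiting for an ACK between each response" idea might look like the following. Everything here is an assumption for illustration: the Response type, the sendBatchInOrder and demoOrderedSend functions, and the ack channel are hypothetical and are not part of go-control-plane's API.

```go
package main

import (
	"errors"
	"fmt"
)

// Response is a hypothetical stand-in for a discovery response.
type Response struct {
	TypeURL string
	Nonce   string
}

// sendBatchInOrder sends the responses one at a time and blocks until the
// nonce of the response just sent has been ACKed before sending the next.
func sendBatchInOrder(batch []Response, send func(Response) error, acks <-chan string) error {
	for _, r := range batch {
		if err := send(r); err != nil {
			return fmt.Errorf("send %s: %w", r.TypeURL, err)
		}
		// Ignore ACKs for older nonces until the matching one arrives.
		for {
			nonce, ok := <-acks
			if !ok {
				return errors.New("stream closed before ACK for " + r.TypeURL)
			}
			if nonce == r.Nonce {
				break
			}
		}
	}
	return nil
}

// demoOrderedSend runs the sender against a fake client that ACKs at once
// and returns the order in which the responses were sent.
func demoOrderedSend() []string {
	acks := make(chan string, 1)
	var sent []string
	send := func(r Response) error {
		sent = append(sent, r.TypeURL)
		acks <- r.Nonce // the fake client ACKs immediately
		return nil
	}
	batch := []Response{
		{TypeURL: "clusters", Nonce: "1"},
		{TypeURL: "endpoints", Nonce: "2"},
		{TypeURL: "listeners", Nonce: "3"},
	}
	if err := sendBatchInOrder(batch, send, acks); err != nil {
		return nil
	}
	return sent
}

func main() {
	fmt.Println(demoOrderedSend())
}
```

In a real server the ACKs would come from the request side of the stream, and a NACK or timeout would need its own handling; the sketch leaves both out.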

@kyessenov
Contributor

This is mostly because the order of requests is not guaranteed. E.g. if listeners have 5 route tables, then Envoy is going to ask for tables 1, 2, 3, 4, and 5 in a sequence.

The original plan was to make each snapshot transactionally consistent, wait until the entire snapshot is ACKed, and only then make progress towards the next snapshot. This also allows safe rollouts, since you can introduce a union snapshot in the middle that can be backtracked safely: e.g. snapshot S1, then S1 union S2, then S2. It also supports incremental xDS, which hopefully becomes a first-class xDS in Envoy.
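
As a sketch of the S1, S1 union S2, S2 sequence, the helper below computes the three steps for plain sets of resource names. The rolloutSteps function and the cluster names are hypothetical; a real control plane would build full snapshots rather than name lists.

```go
package main

import (
	"fmt"
	"sort"
)

// rolloutSteps (hypothetical helper) returns the three resource-name sets to
// push in order: the old set, the union of old and new, then the new set.
func rolloutSteps(oldSet, newSet []string) [][]string {
	seen := make(map[string]bool)
	var union []string
	for _, names := range [][]string{oldSet, newSet} {
		for _, n := range names {
			if !seen[n] {
				seen[n] = true
				union = append(union, n)
			}
		}
	}
	sort.Strings(union) // stable output for logging and comparison
	return [][]string{oldSet, union, newSet}
}

func main() {
	s1 := []string{"cluster-a", "cluster-b"}
	s2 := []string{"cluster-b", "cluster-c"}
	for i, step := range rolloutSteps(s1, s2) {
		fmt.Printf("step %d: %v\n", i+1, step)
	}
}
```

The middle step contains cluster-a, cluster-b, and cluster-c, so at no point does the proxy lose a resource that in-flight config still references, and the rollout can go back to S1 from the union without a gap.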

Author

gnz00 commented Aug 22, 2019 via email

Contributor

kyessenov commented Sep 3, 2019

Incremental xDS is still being implemented, and it's probably usable. I haven't seen a production use of incremental xDS though, so please confirm with the Envoy developers whether it's OK to use.

ADS is a mechanism that enables transactional config updates, but it doesn't give them for free. All it does is serialize all updates on a single pipe, and it's up to the server to send config in a way that supports rollbacks. The union idea I mentioned is one such approach: CDS/LDS are total sets, so removing elements from the set during an update will cause brief downtime (404s). You therefore need an intermediate state that has both the old and the new elements.
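
Since ADS only serializes updates on one stream, the server still has to pick the order in which it pushes each resource type. Below is a minimal sketch, assuming the make-before-break order described in Envoy's xDS protocol documentation (CDS first, then EDS, then LDS, then RDS). The short type labels, the pushPriority table, and the orderForPush function are illustrative and not part of this repo.

```go
package main

import (
	"fmt"
	"sort"
)

// pushPriority is an assumed make-before-break order: clusters and their
// endpoints go out before the listeners and routes that reference them.
var pushPriority = map[string]int{"CDS": 0, "EDS": 1, "LDS": 2, "RDS": 3}

// priority returns the push position for a type; unknown types go last.
func priority(t string) int {
	if p, ok := pushPriority[t]; ok {
		return p
	}
	return len(pushPriority)
}

// orderForPush returns a copy of types sorted into the assumed push order.
func orderForPush(types []string) []string {
	ordered := append([]string(nil), types...)
	sort.SliceStable(ordered, func(i, j int) bool {
		return priority(ordered[i]) < priority(ordered[j])
	})
	return ordered
}

func main() {
	pending := []string{"RDS", "LDS", "EDS", "CDS"}
	fmt.Println(orderForPush(pending))
}
```

Ordering alone does not make removals safe; it only ensures that new references resolve. Removing old resources still needs the intermediate union state.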

Transactional config updates are not implemented in this repo. This is mostly due to a lack of hands and the perceived complexity of the solution on the management server, which would have to coordinate all xDSes. There have been some improvements on the Envoy side to provide "warming" of clusters and listeners, so that config is stalled until all dependent resources are provided (search for warming in xDS). This enables an eventually consistent model, but it also opens the possibility of locking up proxies that ask for deleted RDS/EDS resources.

@kyessenov
Contributor

Incremental xDS has finally been merged into Envoy, so it should be possible to push delta updates once that gets implemented here.

Author

gnz00 commented Oct 5, 2019 via email


github-actions bot commented Apr 6, 2021

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Apr 6, 2021
@github-actions

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.

@AmitKatyal-Sophos
Contributor

@gnz00 Could you please share the details of your control plane repo which handles the ordering?

Author

gnz00 commented Jun 7, 2022

> @gnz00 Could you please share the details of your control plane repo which handles the ordering?

My apologies, I no longer have access to the source. It wasn't very well abstracted from our service discovery mechanism. I believe with Delta support you might not need to use ADS any more - or you could implement the single channel mechanism as described above.

3 participants