Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Un-deprecate bind_to_port and use_original_dst. #5355

Closed
PiotrSikora opened this issue Dec 19, 2018 · 36 comments
Closed

Un-deprecate bind_to_port and use_original_dst. #5355

PiotrSikora opened this issue Dec 19, 2018 · 36 comments
Labels
no stalebot Disables stalebot from closing an issue tech debt
Milestone

Comments

@PiotrSikora
Copy link
Contributor

Opening separate issue for visibility.

It's noteworthy that migrating away from non-binding listeners to a single listener with multiple filter chains is going to be painful in highly dynamic systems, because, due to a missing FilterChain Discovery Service (FDS), each update will require draining of the listener. Perhaps it should be a blocker for the removal?

cc @mattklein123 @alyssawilk @htuch

@mattklein123 mattklein123 added tech debt no stalebot Disables stalebot from closing an issue labels Dec 20, 2018
@mattklein123 mattklein123 added this to the 1.10.0 milestone Dec 20, 2018
@htuch
Copy link
Member

htuch commented Dec 24, 2018

@PiotrSikora I'd like to hear someone state that the drain behavior in a real-world situation is going to make FDS necessary before making it a strict blocker. So far, we're just guessing I think? OTOH I totally agree we should do FDS sooner rather than later, this will be needed for on-demand LDS and solves the high churn story.

@PiotrSikora
Copy link
Contributor Author

@htuch yes, it's not a strict blocker (hence the question mark), but I'm a bit concerned about the reliability of constant draining and it's effect on long-lived connections.

This might be a moot point, though, considering that @ggreenway already started working on F(C)DS.

@rshriram
Copy link
Member

rshriram commented Jan 3, 2019

Its not just that though. If we want incremental updates, having a single large listener blob negates the benefit of updating a specific listener.

Also keep in mind all the long lived gRPC streams or persistent DB connections will end up being killed everytime there is a listener update. And listener updates are frequent in K8S

@htuch
Copy link
Member

htuch commented Jan 4, 2019

@rshriram the point of FDS is that the listener blob delivered via LDS is just the top-level configuration; it's relatively stable and never updates. Instead, individual filter chains update, none of which requires a drain.

@PiotrSikora
Copy link
Contributor Author

FYI, @mattklein123, it seems that I might have been a bit too optimistic when I said "late Q1". At this point, it looks that we won't start working on this before FDS is ready (partially to avoid rewriting this logic twice in a short period of time, but mostly because everybody is already committed to other work this quarter). Unfortunately, I don't have any firm dates to offer. @duderino will keep us posted.

@mattklein123 mattklein123 modified the milestones: 1.10.0, 1.11.0 Mar 5, 2019
@htuch
Copy link
Member

htuch commented Mar 22, 2019

FWIW, this also covers the use of use_original_dst, another deprecated field that was slated for upcoming default fatal-on-startup. We need to preserve both until we have Istio migrate to this.

@PiotrSikora @costinm can you provide an updated statement on what time frames we should be intending to drive this deprecation in FTR?

@PiotrSikora
Copy link
Contributor Author

@silentdai is working on this.

@htuch
Copy link
Member

htuch commented Mar 22, 2019

Yep, I'm just wondering if we can get an agreement on which release version we're all comfortable with deprecation taking place in?

@alexburnos
Copy link

We would be comfortable with use_original_dst and bind_to_port to be marked as fatal-by-default on July 1st 2019. Would it be possible to keep these fields exempt from deprecation process until that date?

@costinm
Copy link
Contributor

costinm commented Mar 26, 2019

Jul 2019 would match Istio plans - if 1.2 and 1.3 don't get delayed. Given vacations and history - I would be more comfortable with Dec ( or as late as we can get ).

@alyssawilk
Copy link
Contributor

fatal by default "in july" means it'd actually flip when we cut the September Envoy build, and the code would be safe to remove in 2020, when it was slated by removal in 2018. I think we can do better than that.

I think we should make it fatal-by-default either in this or the next Envoy release (likely the next to allow for tooling), make it super easy for the Istio's Envoy build to override that fatal-by-default (since we're in good communication with you about deprecation/removal timeline but that'd force any lingering users to move over and start the clock for removal) and then when you folks hit 1.2 we can remove the code as quickly as the Envoy policy allows. Does this sound plausible from your end?

alyssawilk added a commit to alyssawilk/envoy that referenced this issue Apr 8, 2019
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@mattklein123 mattklein123 modified the milestones: 1.11.0, 1.12.0 Jul 3, 2019
PiotrSikora added a commit to PiotrSikora/envoy that referenced this issue Jul 17, 2019
"use_original_dst" and "bind_to_port" are two complementary parts
of the "virtual listeners" feature, and they should be deprecated
together.

However, they are both currently exempted (see: envoyproxy#5355), so revert
the change from envoyproxy#7549.

Signed-off-by: Piotr Sikora <piotrsikora@google.com>
mattklein123 pushed a commit that referenced this issue Jul 18, 2019
"use_original_dst" and "bind_to_port" are two complementary parts
of the "virtual listeners" feature, and they should be deprecated
together.

However, they are both currently exempted (see: #5355), so revert
the change from #7549.

Signed-off-by: Piotr Sikora <piotrsikora@google.com>
TAOXUY pushed a commit to TAOXUY/envoy that referenced this issue Jul 22, 2019
"use_original_dst" and "bind_to_port" are two complementary parts
of the "virtual listeners" feature, and they should be deprecated
together.

However, they are both currently exempted (see: envoyproxy#5355), so revert
the change from envoyproxy#7549.

Signed-off-by: Piotr Sikora <piotrsikora@google.com>
@mattklein123 mattklein123 removed this from the 1.12.0 milestone Oct 10, 2019
@costinm
Copy link
Contributor

costinm commented Dec 10, 2020

I suspect (4) is a major problem - we really only have 2 listeners, one for iptables inbound and one for outbound.
Even if Envoy can still start the listener and ignore filter chains that are broken ( which I don't think happens now), the
error reporting and metrics will be off, meaning users will need to adjust alerts and debugging.

@duderino
Copy link

@howardjohn @PiotrSikora @costinm @htuch could one of you tell @tbarrella how the undeprecation should work? I take it that the 'hidden_envoy_deprecated_' fields should themselves go through a deprecation process?

@howardjohn
Copy link
Contributor

I don't think renaming a field breaks protobuf compatibility (just like adding hidden_envoy_deprecated_ didn't break it in the first place), so it should be fine to just rename it I would guess, without being familiar with Envoy api process.

@ggreenway
Copy link
Contributor

Renaming a field breaks loading the config via yaml. Documentation on envoy processes around this are here: https://github.com/envoyproxy/envoy/blob/master/api/API_VERSIONING.md#backwards-compatibility

@tbarrella
Copy link
Contributor

Thank you, yeah, maybe it was an easy fix but I noticed the field name showed up in Istio code (istio/istio#25159)

So then according to the policy the hidden field should only be removed in v4alpha. If it's important, maybe the non-hidden field could be added in v3 by removing the reserve field here and replacing it with this? This would be kind of confusing though and the newly-added non-hidden field would have a different ID

Similarly, the DeprecatedV1 shouldn't be removed till v4alpha, but are there any approval extra steps needed to add bind_to_port to v3?

@howardjohn
Copy link
Contributor

and the newly-added non-hidden field would have a different ID

I think if you do this you break wire compatibility. We don't want a new field, right? We just want the existing one to no longer be deprecated.

Renaming a field breaks loading the config via yaml

Does this still apply for a field that already requires an explicit runtime override to be able to even configure it? It was already broken in this manner, to change it back would just be to "un-break" it from my point of view

@htuch
Copy link
Member

htuch commented Dec 11, 2020

@howardjohn @costinm thanks for the explanation, super helpful.

I think we can just rename the hidden_ fields; any use of them outside of Envoy is illegal, and you can update their names in the Envoy source. Istio's current use of this is outside-of-contract. Please do this in the v3 API. For the v2 API, I'm not sure if we need to make any changes, since the fields still exist there, they are just marked deprecated.

@tbarrella
Copy link
Contributor

tbarrella commented Dec 11, 2020

Thank you. What about deprecated_v1.bind_to_port? Should that and a new bind_to_port coexist in v3, with both being valid, or with the old option being ignored? deprecated_v1.bind_to_port wasn't marked deprecated in v2 proto, so if I try to remove it in v3, proto_format.sh fix complains

Edit: It seems like bind_to_port should be added with the deprecated_v1 option remaining but marked as deprecated. If either bind_to_port or deprecated_v1.bind_to_port is set to false, that value can be used, since true is the default

@htuch
Copy link
Member

htuch commented Dec 12, 2020

@tbarrella I guess the question is what is the long term story here. Does #5355 (comment) suggest that we're better off long term retaining bind_to_port (i.e. we tried and realized this is really more hassle than its worth full stop) or that we're better off in the short-medium term (i.e. past v2 removal but not indefinitely) with bind_to_port but still plan on having Istio migrate?

@tbarrella
Copy link
Contributor

Hm, I don't really have context on this beyond what @PiotrSikora @howardjohn @costinm already said; they'd probably be able to answer better than me. My understanding though is that bind_to_port should be retained in the long term

htuch pushed a commit that referenced this issue Dec 15, 2020
See #5355 

Risk Level: Low
Testing: Listener unit tests, grep
Docs Changes: Generated documentation for the proto field

Signed-off-by: Taylor Barrella <tabarr@google.com>
@costinm
Copy link
Contributor

costinm commented Dec 15, 2020

We use bind_to_port on the outbound chain - and if we are allowed we'll likely keep it. We are exploring other modes
of interception besides iptables REDIRECT - like raw binding to the IP, which will look exactly like the current config but with
real binding to each IP:port. And it's cleaner/simpler.

On the inbound chain - we will make big changes when we move to BTS, i.e. everything will move to vhosts and routes.

@tbarrella
Copy link
Contributor

@htuch do you think it's fine to proceed with the plan at the end of #5355 (comment)?

@htuch
Copy link
Member

htuch commented Dec 16, 2020

Seems like a reasonable plan, but I'd like to be deliberative about this undeprecation if we live with it long term. Tagging folks who have done significant listener work and are effective owners. @lambdai @mattklein123 @PiotrSikora WDYT?

@tbarrella
Copy link
Contributor

I just learned that another factor is there's an attempt to totally preserve backward compatibility on the wire; it seems the other Istio folks have more context. For example, renaming deprecated_v1 but not changing the message structure or field numbers. I'm not sure how realistic of a goal this is

@lambdai
Copy link
Contributor

lambdai commented Dec 16, 2020

My two cents:
Pros:

  • Flexibility: since 2 level listener is theoretically more powerful in terms of splitting responsibility and listener update granular. This functionality is proved useful and it's supporting istio.

Cons:

  • Simplicity of code flow: Two level listener is error-prone. There is no well defined boundary and test coverage. One 0-day and several bugs were introduced partially because of the complexity of bind_to_port.
  • Simplicity of traffic matcher: Two level listener has two set of matchers, filter chain match the bind_to_port=true listener; find the best bind_to_port=false listener; and another round of filter chain match in the later listener. It's a spaghetti match flow if all of the 3 are used.
  • Scalability: Bind_to_port is designed for tens of listeners. Istio need to support more than this scale.

@mattklein123
Copy link
Member

I'm fine with keeping bind to port right now. I agree it adds some complexity, but it's mostly limited to a single place in the code (connection handler) and the code has already been there for a very long time. We can always deprecate it later if we feel that it becomes truly redundant.

In terms of the actual proto, I would have to look at it myself, but I think it's fine to just add a new field in v3 if that is what it takes.

@htuch
Copy link
Member

htuch commented Dec 17, 2020

Yep, new v3 field, +1 on going ahead with reintroducing.

@tbarrella
Copy link
Contributor

tbarrella commented Dec 17, 2020

Great, thank you all, I'm happy that alignment is clear now!

To summarize and add some context from discussing offline with @howardjohn @costinm,

  • bind_to_port will be added with the deprecated_v1 option remaining but marked as deprecated in v3
  • If either bind_to_port or deprecated_v1.bind_to_port is set to false, this will be interpreted as false, since true is the default

Originally there was a hope (among Istio developers) that wire compatibility could be preserved, based on an assumption that previous deprecation of this field could just be reverted. But it turns out that this field was always "deprecated" ever since it was migrated from json to proto (envoyproxy/data-plane-api#143). This is fine for Istio though as long as the old field isn't removed in v3, and it won't be, since that would be against Envoy's breaking change policy

htuch pushed a commit that referenced this issue Jan 7, 2021
See #5355 (comment) and above for context
Risk Level: Low
Testing: Listener manager unit tests, grep
Docs Changes: Generated documentation for the proto field
Release Notes:
#5355
Deprecated: envoy_v3_api_field_config.listener.v3.Listener.deprecated_v1 (which was already hidden) was deprecated in favor of the new field envoy_v3_api_field_config.listener.v3.Listener.bind_to_port

Signed-off-by: Taylor Barrella <tabarr@google.com>
@tbarrella
Copy link
Contributor

The PRs to undeprecate both parameters have been merged; it sounds like new issues will be created when old code paths need to be cleaned up in the future, so does that mean this issue can be closed now?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no stalebot Disables stalebot from closing an issue tech debt
Projects
None yet
Development

No branches or pull requests