Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

network: implement initial heuristic for binding alternate interface #1858

Merged
merged 18 commits into from
Oct 16, 2021

Conversation

goaway
Copy link
Contributor

@goaway goaway commented Oct 4, 2021

Description: Introduces an initial heuristic for attempting a connection using a socket with a bound interface. The heuristic is entirely contained within reportNetworkUsage(...), and as such is straightforward to iterate and experiment upon. In support of the heuristic, current network/configuration state is identified with a configuration_key that prevents stale information from influencing internal tracking. Current network state is stored and updated automatically. This allows static accessors to begin to report OS updates during Envoy's startup period, and ensures OS updates have a timely effect on new requests.

The initial heuristic is as follows:

For both WiFi and cellular network types, there is a default socket configuration that does not use a bound socket, and an alternative "override" configuration that will attempt to use an interface associated with the opposite network type (a cellular-bound socket when the preferred network type is WiFi, and a WiFi-bound socket when network type is cellular).

A "network fault" is defined as a request that terminates in error while having received no upstream bytes.
If a connection has never handled a request without a fault, it is allowed only one fault before the alternative will be tried.
If a connection has ever handled a request without a fault, it is allowed up to three consecutive faults before the alternative will be tried.

The above behavior is disabled by default, and may be enabled by settings enableInterfaceBinding(true) on the public EngineBuilders.

Risk Level: Moderate
Testing: Manual and new/updated coverage

Signed-off-by: Mike Schore mike.schore@gmail.com

Base automatically changed from ms/interface-binding-3 to main October 11, 2021 19:31
Signed-off-by: Mike Schore <mike.schore@gmail.com>
@goaway goaway marked this pull request as ready for review October 12, 2021 11:35
@goaway
Copy link
Contributor Author

goaway commented Oct 12, 2021

Marking ready for review, but with some caveats:

  • Additional testing is needed; either here or in a follow-up.
  • The AuxStreamInfo should probably switch to one of the built-in metadata mechanisms in StreamInfo, which I missed when looking for someplace to put these values initially.

@jpsim
Copy link
Contributor

jpsim commented Oct 12, 2021

Is there somewhere I could read up on what issue this change is attempting to address/improve?

Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
@mattklein123 mattklein123 self-assigned this Oct 13, 2021
@goaway
Copy link
Contributor Author

goaway commented Oct 13, 2021

cc @Reflejo

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice to see this coming together and excited to see this land. Flushing out my first pass of comments, thanks.

Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
@goaway
Copy link
Contributor Author

goaway commented Oct 15, 2021

@mattklein123 I resolved your comments as I addressed them (hopefully you don't mind), since I like to use the GH facility as a checklist of sorts during large updates. Feel free to re-open any of them though if you feel they need further discussion/work.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks overall looks great. Flushing some more comments from another pass.

return Upstream::RetryOptionsPredicate::UpdateOptionsReturn{absl::nullopt};
}

bool fault = !stream_info.firstUpstreamRxByteReceived().has_value();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps move this and the filter fault calculation into a shared static utility so it's a bit easier to comment above why we are doing various things?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added comments to the two locations where we carry out this check (one in the filter, and one in the predicate), but opted not to factor into a shared utility just yet. If this check evolves to become more complicated, and/or we end up calling it from a third place, I'd definitely switch to a shared utility.

Comment on lines 39 to 40
* toggle the boolean. For the purpose of this heuristic, a fault is a request that terminates
* having received 0 transmitted bytes, and a success is anything else.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: move this comment to wherever you put the shared calculation logic per my other comments.

library/common/network/configurator.h Outdated Show resolved Hide resolved
library/common/network/configurator.cc Outdated Show resolved Hide resolved
library/common/network/configurator.cc Outdated Show resolved Hide resolved
library/common/network/configurator.cc Outdated Show resolved Hide resolved
library/common/network/configurator.cc Show resolved Hide resolved
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
@goaway
Copy link
Contributor Author

goaway commented Oct 15, 2021

@mattklein123 thanks for the feedback! Updated again.

Signed-off-by: Mike Schore <mike.schore@gmail.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome LGTM to ship and iterate!

@buildbreaker
Copy link

Thank you @mattklein123

@buildbreaker buildbreaker merged commit 44586eb into main Oct 16, 2021
@buildbreaker buildbreaker deleted the ms/interface-binding-4 branch October 16, 2021 17:36
jpsim added a commit to jpsim/envoy-mobile that referenced this pull request Oct 21, 2021
* origin/main:
  [Apple] Guess string encoding when creating an NSString with UTF8 fails (envoyproxy#1891)
  Link android dev document in a doctree (envoyproxy#1892)
  bazel: Remove rules_jvm_external dep on JAVA_HOME (envoyproxy#1890)
  release: 0.4.3.20211020 (envoyproxy#1887)
  Add debug instructions and sample bazelproject (envoyproxy#1888)
  bazel: Use hermetic JDK 11 (envoyproxy#1863)
  envoy: bump upstream to c687308 (envoyproxy#1886)
  docs: how to test with local envoy (envoyproxy#1876)
  network: implement initial heuristic for binding alternate interface (envoyproxy#1858)
  Assign an int to each log level (envoyproxy#1885)
  envoy: bump upstream to a5b3af2 (envoyproxy#1884)
  android: stub out jni logging by default (envoyproxy#1879)
  CI: Add local JDK to asan/tsan builds (envoyproxy#1878)
  Make JniBridgeUtility public (envoyproxy#1880)
  swift: Fix Swift version in podspec (envoyproxy#1875)

Signed-off-by: JP Simard <jp@jpsim.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants