PmTLS and tproxy improvements with failover and L7 traffic mgmt for k8s (#17624)

* porting over changes from enterprise repo to oss

* applied feedback on service mesh for k8s overview

* fixed typo

* removed ent-only build script file

* Apply suggestions from code review

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: David Yu <dyu@hashicorp.com>
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>

---------

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
Co-authored-by: David Yu <dyu@hashicorp.com>
3 people authored Jun 10, 2023
1 parent ec347ef commit 5e84674
Showing 9 changed files with 795 additions and 196 deletions.
45 changes: 45 additions & 0 deletions website/content/docs/connect/failover/index.mdx
---
layout: docs
page_title: Failover configuration overview
description: Learn about failover strategies and service mesh features you can implement to route traffic if services become unhealthy or unreachable, including sameness groups, prepared queries, and service resolvers.
---

# Failover overview

Services in your mesh may become unhealthy or unreachable for many reasons, but you can mitigate some of the effects associated with infrastructure issues by configuring Consul to automatically route traffic to and from failover service instances. This topic provides an overview of the failover strategies you can implement with Consul.

## Service failover strategies in Consul

There are several methods for implementing failover strategies between datacenters in Consul. You can adopt one of the following strategies based on your deployment configuration and network requirements:

- Configure the `Failover` stanza in a service resolver configuration entry to explicitly define which services should fail over and the targeting logic they should follow.
- Make a prepared query for each service that you can use to automate geo-failover.
- Create a sameness group to identify partitions with identical namespaces and service names to establish default failover targets.

The following table compares these strategies in deployments with multiple datacenters to help you determine the best approach for your service:

| Failover Strategy | Supports WAN Federation | Supports Cluster Peering | Multi-Datacenter Failover Strength | Multi-Datacenter Usage Scenario |
| :---------------: | :---------------------: | :----------------------: | :--------------------------------- | :------------------------------ |
| `Failover` stanza | &#9989; | &#9989; | Enables more granular logic for failover targeting | Configuring failover for a single service or service subset, especially for testing or debugging purposes |
| Prepared query | &#9989; | &#9989; | Central policies that can automatically target the nearest datacenter | WAN-federated deployments where a primary datacenter is configured. Prepared queries are not replicated over peer connections. |
| Sameness groups | &#10060; | &#9989; | Group size changes without edits to existing member configurations | Cluster peering deployments with consistently named services and namespaces |

### Failover configurations for a service mesh with a single datacenter

You can implement a service resolver configuration entry and specify a pool of failover service instances that other services can exchange messages with when the primary service becomes unhealthy or unreachable. We recommend adopting this strategy as a minimum baseline when implementing Consul service mesh and layering additional failover strategies to build resilience into your application network.

Refer to the [`Failover` configuration](/consul/docs/connect/config-entries/service-resolver#failover) for examples of how to configure failover services in the service resolver configuration entry for both VM and Kubernetes deployments.
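
For illustration only, the following sketch shows a minimal service resolver with a `Failover` block written in HCL. The `api` and `api-backup` service names are placeholder assumptions, not values from this page:

```hcl
Kind = "service-resolver"
Name = "api" # hypothetical service name

# When no healthy instances of "api" remain, send traffic to the
# hypothetical "api-backup" service instead.
Failover = {
  "*" = {
    Service = "api-backup"
  }
}
```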

### Failover configuration for WAN-federated datacenters

If your network has multiple Consul datacenters that are WAN-federated, you can configure your applications to look for failover services with prepared queries. [Prepared queries](/consul/api-docs/) are configurations that enable you to define complex service discovery lookups. This strategy hinges on the secondary datacenter containing service instances that have the same name and reside in the same namespace as their counterparts in the primary datacenter.

Refer to the [Automate geo-failover with prepared queries tutorial](/consul/tutorials/developer-discovery/automate-geo-failover) for additional information.

### Failover configuration for peered clusters and partitions

In networks with multiple datacenters or partitions that share a peer connection, each datacenter or partition functions as an independent unit. As a result, Consul does not correlate services that have the same name, even if they are in the same namespace.

You can configure sameness groups for this type of network. Sameness groups allow you to define a group of admin partitions where identical services are deployed in identical namespaces. After you configure the sameness group, you can reference the `SamenessGroup` parameter in service resolver, exported service, and service intention configuration entries, enabling you to add or remove cluster peers from the group without editing the configuration of every existing member.

Refer to the [sameness groups usage page](/consul/docs/connect/cluster-peering/usage/sameness-groups) for more information.
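
As a hedged sketch, a sameness group configuration entry might look like the following. The group name, partition, and peer names are placeholder assumptions; setting `DefaultForFailover` makes group members default failover targets:

```hcl
Kind               = "sameness-group"
Name               = "products" # hypothetical group name
DefaultForFailover = true

# Members are listed in failover order: the local partition first,
# then a cluster peer with identically named services and namespaces.
Members = [
  { Partition = "default" },
  { Peer = "cluster-02" }
]
```
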
132 changes: 43 additions & 89 deletions website/content/docs/connect/l7-traffic/index.mdx
---
layout: docs
page_title: Service mesh traffic management overview
description: >-
  Consul can route, split, and resolve Layer 7 traffic in a service mesh to support workflows like canary testing and blue/green deployments. Learn about the three configuration entry kinds that define L7 traffic management behavior in Consul.
---

# Service mesh traffic management overview

This topic provides overview information about the application layer traffic management capabilities available in Consul service mesh. These capabilities are also referred to as *Layer 7* or *L7 traffic management*.

## Introduction

Consul service mesh allows you to divide application layer traffic between different subsets of service instances. You can leverage L7 traffic management capabilities to perform complex processes, such as configuring backup services for failover scenarios, canary and A/B testing, blue/green deployments, and soft multi-tenancy in which production, QA, and staging environments share compute resources. L7 traffic management with Consul service mesh allows you to designate groups of service instances in the Consul catalog smaller than all instances of a single service and configure when those subsets should receive traffic.

You cannot manage L7 traffic with the [built-in proxy](/consul/docs/connect/proxies/built-in),
[native proxies](/consul/docs/connect/native), or some [Envoy proxy escape hatches](/consul/docs/connect/proxies/envoy#escape-hatch-overrides).

## Discovery chain

Consul uses a series of stages to discover service mesh proxy upstreams. Each stage represents a different way of managing L7 traffic. Together, these stages are referred to as the _discovery chain_:

- routing
- splitting
- resolution

For information about integrating service mesh proxy upstream discovery using the discovery chain, refer to [Discovery Chain for Service Mesh Traffic Management](/consul/docs/connect/l7-traffic/discovery-chain).

The Consul UI shows discovery chain stages in the **Routing** tab of the **Services** page:

![screenshot of L7 traffic visualization in the UI](/img/l7-routing/full.png)

You can define how Consul manages each stage of the discovery chain in a Consul _configuration entry_. [Configuration entries](/consul/docs/connect/config-entries) modify the default behavior of the Consul service mesh.

When managing L7 traffic with cluster peering, there are additional configuration requirements to resolve peers in the discovery chain. Refer to [Cluster peering L7 traffic management](/consul/docs/connect/cluster-peering/usage/peering-traffic-management) for more information.

### Routing

The first stage of the discovery chain is the service router. Routers intercept traffic according to a set of L7 attributes, such as path prefixes and HTTP headers, and route the traffic to a different service or service subset.

Apply a [service router configuration entry](/consul/docs/connect/config-entries/service-router) to implement a router. Service router configuration entries can only reference service splitter or service resolver configuration entries.

![screenshot of service router in the UI](/img/l7-routing/Router.png)
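
For illustration only, the following sketch shows a hypothetical service router configuration entry in HCL. It sends requests for a `web` service to a separate `admin` service when the HTTP path begins with `/admin`; the service names and path prefix are placeholder assumptions:

```hcl
Kind = "service-router"
Name = "web" # hypothetical service name

Routes = [
  {
    # Match traffic using L7 criteria, in this case an HTTP path prefix.
    Match {
      HTTP {
        PathPrefix = "/admin"
      }
    }

    # Send matching requests to a different service.
    Destination {
      Service = "admin"
    }
  }
]
```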

### Splitting

The second stage of the discovery chain is the service splitter. Service splitters split incoming requests and route them to different services or service subsets. Splitters enable staged canary rollouts, versioned releases, and similar use cases.

Apply a [service splitter configuration entry](/consul/docs/connect/config-entries/service-splitter) to implement a splitter. Service splitter configuration entries can only reference other service splitter or service resolver configuration entries.

![screenshot of service splitter in the UI](/img/l7-routing/Splitter.png)

If multiple service splitters are chained, Consul flattens the splits so that they behave as a single service splitter. In the following example, `splitter[B]` references the service split by `splitter[A]`, so Consul flattens them into `splitter[effective_B]`:

```text
splitter[A]: A_v1=50%, A_v2=50%
splitter[B]: A=50%, B=50%
---------------------
splitter[effective_B]: A_v1=25%, A_v2=25%, B=50%
```
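
As a sketch of what the two chained splitters above might look like in HCL, the following entries use the placeholder service names `A` and `B` from the diagram and assume a service resolver defines the `v1` and `v2` subsets of `A`. Each configuration entry would normally live in its own file:

```hcl
# splitter[A]: divides traffic for service "A" between two of its subsets.
# Weights across the Splits must add up to 100.
Kind = "service-splitter"
Name = "A"

Splits = [
  {
    Weight        = 50
    ServiceSubset = "v1"
  },
  {
    Weight        = 50
    ServiceSubset = "v2"
  }
]
```

The second entry keeps half of the traffic on `B` and forwards the rest to `A`, where `splitter[A]` divides it again, producing the flattened result shown above:

```hcl
# splitter[B]: half of the traffic stays on "B", half goes to service "A".
Kind = "service-splitter"
Name = "B"

Splits = [
  {
    Weight = 50
  },
  {
    Weight  = 50
    Service = "A"
  }
]
```
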
### Resolution

The third stage of the discovery chain is the service resolver. Service resolvers specify which instances of a service satisfy discovery requests for the provided service name. Service resolvers enable several use cases, including:

- Designate failover targets when service instances become unhealthy or unreachable.
- Configure service subsets based on `Service.Meta.version` values.
- Route traffic to the latest version of a service.
- Route traffic to specific Consul datacenters.
- Create virtual services that route traffic to instances of the actual service in specific Consul datacenters.

Apply a [service resolver configuration entry](/consul/docs/connect/config-entries/service-resolver) to implement a resolver. Service resolver configuration entries can only reference other service resolvers.

![screenshot of service resolver in the UI](/img/l7-routing/Resolver.png)
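
As a sketch of the subset-related use cases above, the following hypothetical service resolver defines two subsets of a `web` service based on `Service.Meta.version` values, defaults to `v1`, and fails over to another datacenter. The service, subset, and datacenter names are placeholder assumptions:

```hcl
Kind          = "service-resolver"
Name          = "web" # hypothetical service name
DefaultSubset = "v1"

# Group instances into subsets by filtering on instance metadata.
Subsets = {
  "v1" = {
    Filter = "Service.Meta.version == v1"
  }
  "v2" = {
    Filter = "Service.Meta.version == v2"
  }
}

# If no healthy instances exist in the local datacenter, use instances in dc2.
Failover = {
  "*" = {
    Datacenters = ["dc2"]
  }
}
```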

If no resolver is configured for a service, Consul sends all traffic to healthy instances of the service that have the same name in the current datacenter or specified namespace, and then ends the discovery chain.

Service resolver configuration entries can also process network layer, also called Layer 4 (L4), traffic. As a result, you can implement service resolvers for services that communicate over `tcp` and other non-HTTP protocols.