Consul service mesh across multiple clusters with kafka #14125

Closed
codex70 opened this issue Aug 10, 2022 · 5 comments

codex70 commented Aug 10, 2022

Overview of the Issue

I am unable to connect to kafka from a second data center using a mesh gateway.

Reproduction Steps

  1. I install Consul using Helm in 2 separate Kubernetes clusters. The 2 clusters are federated, with a separate datacenter in each cluster.
  2. The service mesh gateway appears to be working, and testservice/testclient works as expected for the 'hello world' example.
  3. I install Kafka in the first cluster using Helm templates. The following pod annotations are used:
    consul.hashicorp.com/connect-service: "kafka",
    consul.hashicorp.com/connect-inject: "true",
    consul.hashicorp.com/connect-service-port: "9094",
    consul.hashicorp.com/transparent-proxy: "true",
    consul.hashicorp.com/enable-metrics: "false",
    consul.hashicorp.com/transparent-proxy-exclude-inbound-ports: "9094",
    consul.hashicorp.com/kubernetes-service: "kafka",
  4. Kafka is set up with a load balancer on a network accessible from both clusters.
  5. I install the same client in both the first and second clusters with the following annotations:
      consul.hashicorp.com/connect-inject: "true"
      consul.hashicorp.com/transparent-proxy: "true"
      consul.hashicorp.com/service-tags: "spring-boot"
      consul.hashicorp.com/service-metrics-path: '/actuator/prometheus'
      consul.hashicorp.com/connect-service-upstreams: 'kafka:9094:dc1'
  6. On both clients I attempt to connect using the following connection strings:
            "kafka": {
              "bootstrap-servers": "kafka.service.dc1.consul:9094",
              "producer": {
                "bootstrap-servers": "kafka.service.dc1.consul:9094"
              },
  7. On the first cluster, it works as expected.
  8. On the second cluster, I get the following error:
    Cancelled in-flight API_VERSIONS request with correlation id 0 due to node -1 being disconnected
  9. On the second cluster, if I replace the service name with the IP address of the service load balancer, it again works as expected.
  10. If I replace the service name in the connection string (for either cluster) with localhost, I again get timeout errors, though slightly different ones:
    Connection to node -1 (localhost/127.0.0.1:9094) could not be established. Broker may not be available.

Consul info for both Client and Server

Operating system and Environment details

Kubernetes clusters running in cloud environment

Log Fragments

codex70 commented Aug 16, 2022

OK, I finally figured out a solution after way too long playing with this. For anyone else wanting to deploy Kafka inside the service mesh, the solution is to add the following annotations:

podAnnotations: {
  consul.hashicorp.com/connect-service: "kafka,kafka-headless",
  consul.hashicorp.com/connect-inject: "true",
  consul.hashicorp.com/connect-service-port: "9094,9093",
  consul.hashicorp.com/transparent-proxy: "false",
  consul.hashicorp.com/enable-metrics: "false",
}

It is also necessary to add the following label to the external access service:

      consul.hashicorp.com/service-ignore : "true"

This isn't currently possible with the helm templates for kafka, but the modification is simple enough.
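
For anyone patching the chart by hand, here is a rough sketch of what one of the external access Services might look like once the label is in place. The Service name `kafka-0-external` is an assumption for illustration (check what your chart release actually generates), and the spec is left out on purpose. Note that `service-ignore` goes under `labels`, not `annotations`, which matches the values file further down this thread.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: kafka-0-external   # hypothetical name, depends on the chart release
  labels:
    # Tells Consul on Kubernetes to leave this Service out of the mesh.
    consul.hashicorp.com/service-ignore: "true"
# spec (type, ports, selector) omitted; it is generated by the chart
```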

Services then require the following annotations to connect to kafka:

      consul.hashicorp.com/connect-inject: "true"
      consul.hashicorp.com/transparent-proxy: "false"
      consul.hashicorp.com/connect-service-upstreams: "kafka:9094:dc1"

They then connect to Kafka via localhost:

            "kafka": {
              "bootstrap-servers": "localhost:9094",
              "producer": {
                "bootstrap-servers": "localhost:9094"
              }
            }
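
The reason localhost works here is the upstream annotation: with transparent proxy turned off, the sidecar opens a local listener for each explicit upstream. As far as I understand the annotation format, it breaks down as sketched below (the comments are my reading of it, not official wording):

```yaml
# Format: "<consul-service-name>:<local-port>:<datacenter (optional)>"
consul.hashicorp.com/connect-service-upstreams: "kafka:9094:dc1"
#   kafka -> the upstream service as registered in Consul
#   9094  -> the port the sidecar listens on, on localhost inside the client pod
#   dc1   -> the datacenter where the upstream is registered
```

So the port in `bootstrap-servers` has to match the middle field of the annotation, not necessarily the port Kafka listens on in its own pod.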

@david-yu
Contributor

Thank you for the insight, @codex70. I'll pass this along to some folks who are also looking to do the same thing.

codex70 commented Aug 26, 2022

@david-yu, just to let you know that the bitnami helm charts I'm working with to deploy kafka are open source, so I've made the necessary changes and this has now been added to the latest release.

@david-yu
Contributor

@codex70 do you perhaps have a gist we can follow to see how you're setting up Kafka with Consul K8s? I think something like this would be beneficial to share with the community.

codex70 commented Aug 29, 2022

Hi @david-yu,

The setup for Kafka uses the Bitnami Helm charts; you will need version 18.2.0 or later. I have this working with 3 nodes. Here's a copy of the relevant values file contents.

replicaCount: 3

# SEE DEFAULT VALUES FILE: https://github.com/bitnami/charts/blob/master/bitnami/kafka/values.yaml
podAnnotations: {
  consul.hashicorp.com/connect-service: "kafka,kafka-headless",
  consul.hashicorp.com/connect-inject: "true",
  consul.hashicorp.com/connect-service-port: "9094,9093",
  consul.hashicorp.com/transparent-proxy: "false",
  consul.hashicorp.com/enable-metrics: "false",
}

serviceAccount:
  create: true
rbac:
  create: true

externalAccess:
  enabled: true
  service:
    # NOTE THE ADDITION OF THIS LABEL WHICH STOPS CONSUL TRYING TO ADD THE EXTERNAL SERVICES TO THE SERVICE MESH.
    labels: {
      consul.hashicorp.com/service-ignore : "true"
    }
    type: NodePort
    nodePorts: ['30092', '30093', '30094']
    useHostIPs: true

You will also need to have a service account set up for kafka-headless and intentions for connecting systems. I created the following to help:

service-intentions.yaml

{{- range .Values.intentions }}
apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceIntentions
metadata:
  name: {{ .name }}
spec:
  destination:
    name: {{ .destination }}
  sources:
    {{- range .sources}}
    - {{- range $key, $value := . }}
          {{ $key }}: {{ $value }}
        {{- end }}
    {{- end }}
---
{{- end }}

serviceaccount.yaml

{{- range .Values.serviceAccounts }}
# Service account for each service listed in .Values.serviceAccounts (for ACL enforcement)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ .name }}
---
{{- end }}

with the following added to the values file:

intentions:
  - name: services-to-kafka
    destination: kafka
    sources:
      - name: test-service
        action: allow
      - name: prod-service
        action: allow

serviceAccounts:
  - name: kafka-headless
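
As a sanity check, this is roughly what the two templates should render to with the values above (for example via `helm template`). Whitespace is normalized and the comment line is dropped, so treat it as equivalent YAML rather than byte-for-byte output. Go templates iterate map keys in sorted order, which is why `action` comes before `name`.

```yaml
apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceIntentions
metadata:
  name: services-to-kafka
spec:
  destination:
    name: kafka
  sources:
    - action: allow
      name: test-service
    - action: allow
      name: prod-service
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kafka-headless
```

Each client that needs to reach Kafka gets its own entry under `sources`; the `name` there should be the Consul service name of the client.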

You also need to be careful to set up firewalls etc. wherever the cluster is deployed.

I now have this working across multiple federated clusters.

As an example, the connection from a spring boot application would look like the following:

          "spring":{
            "application": {
              "name": "test-service"
            },
            "kafka": {
              "bootstrap-servers": "localhost:9094",
              "producer": {
                "bootstrap-servers": "localhost:9094"
              },
              "listener": {
                "listenRequest": {
                  "topic": "simple.request.test-topic",
                  "enabled": "true"
                },
                "listenResponse": {
                  "topic": "simple.response.test-topic",
                  "enabled": "false"
                }
              }
            }
          },

The pod annotations for that deployment would look like:

      consul.hashicorp.com/connect-inject: "true"
      consul.hashicorp.com/transparent-proxy: "false"
      consul.hashicorp.com/service-tags: "spring-boot"
      consul.hashicorp.com/service-metrics-path: '/actuator/prometheus'
      consul.hashicorp.com/connect-service-upstreams: "kafka:9094:dc1"
      consul.hashicorp.com/connect-service: "test-service"
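
In case it is not obvious where these go: they are pod annotations, so they belong on the pod template rather than on the Deployment itself. Below is a minimal sketch, trimmed to the mesh-related annotations. The image, the `app` label, and the matching `serviceAccountName` are assumptions for illustration (with ACLs enabled, the service account name generally needs to match the Consul service name).

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test-service          # hypothetical label
  template:
    metadata:
      labels:
        app: test-service
      annotations:               # pod-level, not Deployment-level
        consul.hashicorp.com/connect-inject: "true"
        consul.hashicorp.com/transparent-proxy: "false"
        consul.hashicorp.com/connect-service: "test-service"
        consul.hashicorp.com/connect-service-upstreams: "kafka:9094:dc1"
    spec:
      serviceAccountName: test-service   # assumed to match the Consul service name
      containers:
        - name: test-service
          image: example.org/test-service:latest   # hypothetical image
```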

Hope this helps provide some pointers.
