Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix server_port_subscribers metric in Destination #10820

Merged
merged 1 commit into from
Apr 26, 2023

Conversation

alpeb
Copy link
Member

@alpeb alpeb commented Apr 25, 2023

Fixes #10764

GetProfile streams create a server_port_subscribers gauge that tracks the number of listeners interested in a given Server. Because of an oversight, the gauge was only being registered until the second listener was added. For just one listener the gauge was absent. But whenever the GetProfile stream ended, the gauge was deleted which resulted in this error if it wasn't registered to begin with:

level=warning msg="unable to delete server_port_subscribers metric with labels map[name:voting namespace:emojivoto port:4191]" addr=":8086" component=server

One can check that the gauge wasn't being created by installing viz and emojivoto, and checking the following returns empty:

$ linkerd diagnostics controller-metrics | grep server_port_subscribers

After this fix, one can see the metric getting populated:

$ linkerd diagnostics controller-metrics | grep server_port_subscribers

# HELP server_port_subscribers Number of subscribers to Server changes associated with a pod's port.
# TYPE server_port_subscribers gauge
server_port_subscribers{name="emoji",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="4191"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9990"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9995"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9996"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9997"} 1
server_port_subscribers{name="metrics",namespace="linkerd-viz",port="4191"} 1
server_port_subscribers{name="metrics",namespace="linkerd-viz",port="9995"} 1
server_port_subscribers{name="tap",namespace="linkerd-viz",port="4191"} 1
server_port_subscribers{name="tap",namespace="linkerd-viz",port="9995"} 1
server_port_subscribers{name="tap",namespace="linkerd-viz",port="9998"} 1
server_port_subscribers{name="vote",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="voting",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="web",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="web",namespace="linkerd-viz",port="4191"} 1
server_port_subscribers{name="web",namespace="linkerd-viz",port="9994"} 1

And when scaling down the voting deployment, one can see how the metric with name="voting" is removed.

Fixes #10764

`GetProfile' streams create a `server_port_subscribers` gauge that tracks the number of listeners interested in a given Server. Because of an oversight, the gauge was only being registered until the second listener was added. For just one listener the gauge was absent. But whenever the `GetProfile` stream ended, the gauge was deleted which resulted in this error if it wasn't registered to begin with:

```
level=warning msg="unable to delete server_port_subscribers metric with labels map[name:voting namespace:emojivoto port:4191]" addr=":8086" component=server
```

One can check that the gauge wasn't being created by installing viz and emojivoto, and checking the following returns empty:

```bash
$ linkerd diagnostics controller-metrics|ag server.port.subscribers
```

After this fix, one can see the metric getting populated:

```bash
$ linkerd diagnostics controller-metrics | grep server.port.subscribers

# HELP server_port_subscribers Number of subscribers to Server changes associated with a pod's port.
# TYPE server_port_subscribers gauge
server_port_subscribers{name="emoji",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="4191"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9990"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9995"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9996"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9997"} 1
server_port_subscribers{name="metrics",namespace="linkerd-viz",port="4191"} 1
server_port_subscribers{name="metrics",namespace="linkerd-viz",port="9995"} 1
server_port_subscribers{name="tap",namespace="linkerd-viz",port="4191"} 1
server_port_subscribers{name="tap",namespace="linkerd-viz",port="9995"} 1
server_port_subscribers{name="tap",namespace="linkerd-viz",port="9998"} 1
server_port_subscribers{name="vote",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="voting",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="web",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="web",namespace="linkerd-viz",port="4191"} 1
server_port_subscribers{name="web",namespace="linkerd-viz",port="9994"} 1
```

And when scaling down the voting deployment, one can see how the metric with `name="voting"` is removed.
@alpeb alpeb requested a review from a team as a code owner April 25, 2023 20:44
@alpeb alpeb merged commit 33adae2 into main Apr 26, 2023
@alpeb alpeb deleted the alpeb/noisy-server-port-subscribers branch April 26, 2023 21:48
risingspiral pushed a commit that referenced this pull request May 4, 2023
Fixes #10764

`GetProfile` streams create a `server_port_subscribers` gauge that tracks the number of listeners interested in a given Server. Because of an oversight, the gauge was only being registered until the second listener was added. For just one listener the gauge was absent. But whenever the `GetProfile` stream ended, the gauge was deleted which resulted in this error if it wasn't registered to begin with:

```
level=warning msg="unable to delete server_port_subscribers metric with labels map[name:voting namespace:emojivoto port:4191]" addr=":8086" component=server
```

One can check that the gauge wasn't being created by installing viz and emojivoto, and checking the following returns empty:

```bash
$ linkerd diagnostics controller-metrics | grep server_port_subscribers
```

After this fix, one can see the metric getting populated:

```bash
$ linkerd diagnostics controller-metrics | grep server_port_subscribers

# HELP server_port_subscribers Number of subscribers to Server changes associated with a pod's port.
# TYPE server_port_subscribers gauge
server_port_subscribers{name="emoji",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="4191"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9990"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9995"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9996"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9997"} 1
server_port_subscribers{name="metrics",namespace="linkerd-viz",port="4191"} 1
server_port_subscribers{name="metrics",namespace="linkerd-viz",port="9995"} 1
server_port_subscribers{name="tap",namespace="linkerd-viz",port="4191"} 1
server_port_subscribers{name="tap",namespace="linkerd-viz",port="9995"} 1
server_port_subscribers{name="tap",namespace="linkerd-viz",port="9998"} 1
server_port_subscribers{name="vote",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="voting",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="web",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="web",namespace="linkerd-viz",port="4191"} 1
server_port_subscribers{name="web",namespace="linkerd-viz",port="9994"} 1
```

And when scaling down the voting deployment, one can see how the metric with `name="voting"` is removed.
risingspiral pushed a commit that referenced this pull request May 5, 2023
Fixes #10764

`GetProfile` streams create a `server_port_subscribers` gauge that tracks the number of listeners interested in a given Server. Because of an oversight, the gauge was only being registered until the second listener was added. For just one listener the gauge was absent. But whenever the `GetProfile` stream ended, the gauge was deleted which resulted in this error if it wasn't registered to begin with:

```
level=warning msg="unable to delete server_port_subscribers metric with labels map[name:voting namespace:emojivoto port:4191]" addr=":8086" component=server
```

One can check that the gauge wasn't being created by installing viz and emojivoto, and checking the following returns empty:

```bash
$ linkerd diagnostics controller-metrics | grep server_port_subscribers
```

After this fix, one can see the metric getting populated:

```bash
$ linkerd diagnostics controller-metrics | grep server_port_subscribers

# HELP server_port_subscribers Number of subscribers to Server changes associated with a pod's port.
# TYPE server_port_subscribers gauge
server_port_subscribers{name="emoji",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="4191"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9990"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9995"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9996"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9997"} 1
server_port_subscribers{name="metrics",namespace="linkerd-viz",port="4191"} 1
server_port_subscribers{name="metrics",namespace="linkerd-viz",port="9995"} 1
server_port_subscribers{name="tap",namespace="linkerd-viz",port="4191"} 1
server_port_subscribers{name="tap",namespace="linkerd-viz",port="9995"} 1
server_port_subscribers{name="tap",namespace="linkerd-viz",port="9998"} 1
server_port_subscribers{name="vote",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="voting",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="web",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="web",namespace="linkerd-viz",port="4191"} 1
server_port_subscribers{name="web",namespace="linkerd-viz",port="9994"} 1
```

And when scaling down the voting deployment, one can see how the metric with `name="voting"` is removed.

Signed-off-by: Eric Anderson <eric@buoyant.io>
@jbucar
Copy link

jbucar commented Jun 7, 2023

Hello, we have this problem in a recent update with version 2.13.4. Is it possible that this bug persists?

@SansDK
Copy link
Contributor

SansDK commented Jul 12, 2023

We're also seeing this issue on version 2.13.5.

@deusxanima
Copy link
Contributor

Able to confirm that this appears to be getting reported in a few additional user environments.

@adleong
Copy link
Member

adleong commented Jul 25, 2023

Based on the reports here, we have re-opened #10764.

It would be really helpful if anyone who is experiencing this issue on version 2.13.5 could share destination controller logs in a comment on that issue. This will let us see the exact text of the error and help give us the context to reproduce and ultimately fix the error. Thanks! @jbucar @SansDK @deusxanima

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

unable to delete server_port_subscribers metric with labels
6 participants