Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nil pointer dereference in new Jetstream API when getting consumer #1422

Closed
evanofslack opened this issue Sep 27, 2023 · 5 comments · Fixed by #1426
Closed

Nil pointer dereference in new Jetstream API when getting consumer #1422

evanofslack opened this issue Sep 27, 2023 · 5 comments · Fixed by #1426
Assignees
Labels
defect Suspected defect such as a bug or regression

Comments

@evanofslack
Copy link
Contributor

What version were you using?

server: 2.9.15
client: 1.28.0

What environment was the server running in?

Server running on x86 from 2.9.15-alpine container

Is this defect reproducible?

Has occurred multiple times during testing. Working towards reproducing, havent gotten a minimal example yet.

Given the capability you are leveraging, describe your expectation?

There are multiple pods running the same code, and each pod will attempt to get a certain consumer by name and if the consumer doesnt exist, will attempt to create the consumer.

This bug tends to occur on pod startup, where it have a list of consumers to try and get/create. This bug occured typically when working with 500- 10,000 consumers.

The consumers are on a work queue, so the CreateOrUpdate command does not work, as it returns the error about filters needing to be unique on work queue. The consumers are ephemeral.

The expectation is that there are no crashes due to nil pointer dereferences.

Given the expectation, what is the defect you are observing?

panic: runtime error: invalid memory address or nil pointer dereference                                                           │
[signal SIGSEGV: segmentation violation code=0x1 addr=0x50 pc=0xbfcb19]                                                           │
                                                                                                                                  │
goroutine 6151 [running]:                                                                                                         │
github.com/nats-io/nats.go/jetstream.getConsumer({0x1dbca10?, 0xc00007cc80?}, 0xc001635140, {0x1953ca0, 0x2}, {0xc002ba48a0, 0xa})│
        /home/users/eslack/go/pkg/mod/github.com/nats-io/nats.go@v1.28.0/jetstream/consumer.go:178 +0x499                         │
github.com/nats-io/nats.go/jetstream.(*jetStream).Consumer(0xc00007f620?, {0x1dbca10, 0xc00007cc80}, {0x1953ca0, 0x2}, {0xc002ba48│
        /home/users/eslack/go/pkg/mod/github.com/nats-io/nats.go@v1.28.0/jetstream/jetstream.go:512 +0x8d                         
@evanofslack evanofslack added the defect Suspected defect such as a bug or regression label Sep 27, 2023
@piotrpio piotrpio self-assigned this Sep 27, 2023
@piotrpio
Copy link
Collaborator

Hey @evanofslack, thank you for reporting the issue. I was not able to reproduce yet, but it looks like we get a response with no error and no consumer config from the server, although I'm not 100% sure yet. It looks like it's a similar issue to this: #1258, where there is also panic when getting a consumer. I'll investigate and let you know.

@evanofslack
Copy link
Contributor Author

Hello @piotrpio thanks for looking into this.

I was able to recreate something similar to this. Please see the code in this gist:
https://gist.github.com/evanofslack/97a995458c3aac6ca337b3baa5bd9d43

@piotrpio
Copy link
Collaborator

piotrpio commented Oct 2, 2023

@evanofslack thank you, in the meantime we were able to find and fix this issue in nats-server: https://github.com/nats-io/nats-server/pull/4610/files

So that will be fixed in the upcoming server release. I will also add a fix in go client to safeguard from panics on older server versions.

@piotrpio
Copy link
Collaborator

piotrpio commented Oct 3, 2023

Both PRs are merged and will be part of upcoming releases.

@evanofslack
Copy link
Contributor Author

Thanks for triaging and fixing! @piotrpio

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
defect Suspected defect such as a bug or regression
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants