Explicit support for auth change during reconnect #1694
Comments
Good morning - I'd like to follow up here, as this issue is basically preventing several applications from adopting NATS in our organization. In the meantime, our canary application is doing a complicated reset procedure that establishes a new connection, but it's onerous and teams aren't going to adopt it. All we need is a way to update the password that the next connection attempt will use in the disconnectCB (which is again called on Auth Callout expiry). |
Awesome. A simple callback along the lines of DisconnectedCB (it can even be called from DisconnectedCB if there is no other signal that the AC expired) that allows resetting the password (our passwords are themselves expiring JWTs) would be perfect. |
Hey @tommyjcarpenter. While I don't think manually changing We could expose a callback option similar to func UserInfoHandler(cb AuthUserInfoHandler) Option {}
Would you be satisfied with such a solution? |
@piotrpio I would be thrilled with that, as long as it's called on each reconnect, which can happen either due to auth callout expiry or just a connection drop. Our internal JWTs expire every few minutes, so this CB could be needed even on a normal drop. Thank you so much, this will unblock a large part of our org from adopting this. Given how our organization's IAM stack works, with expiring passwords, and given we'd like to set the auth callout JWTs to expire (IDK, daily maybe), this is needed to bring it all together. |
Just curious: implementation wise, where would this AuthUserInfoHandler "plug into" that isn't a race in the same way that changing Because I noticed we don't have a race when the client is initially seeded with only one NATS server, but only in cases where they are given a NATS url like this took me a lot of debugging before actually finding that race code comment 😅 |
There is a race because you are potentially changing With a callback, the option is set once and executed in the same routine as the reconnect process (synchronously), so there is no risk of that. It'll be called somewhere here and the resulting user/password will be used in |
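To illustrate the point about the callback running synchronously, here is a minimal sketch. It does not use nats.go: `conn`, `credsFunc`, `connect`, and `reconnectLoop` are hypothetical names invented for this example. It only shows the shape of the idea, that a callback set once and invoked on the reconnecting goroutine yields fresh credentials for each attempt without any other goroutine writing shared option fields.

```go
package main

import "fmt"

// credsFunc has the shape of the callback discussed above: each call
// returns the user and password to use for the next connect attempt.
type credsFunc func() (user, pass string)

// conn is a hypothetical stand-in for a client connection.
// It is NOT the nats.go Conn type.
type conn struct {
	getCreds credsFunc // set once at construction, never mutated afterwards
	sent     []string  // passwords "sent" on each connect attempt
}

// connect simulates a single connect attempt. The callback runs on the
// same goroutine that builds the connect message, so nothing else has to
// write shared option fields while this goroutine reads them.
func (c *conn) connect() {
	_, pass := c.getCreds()
	c.sent = append(c.sent, pass)
}

// reconnectLoop simulates n consecutive (re)connect attempts.
func (c *conn) reconnectLoop(n int) {
	for i := 0; i < n; i++ {
		c.connect()
	}
}

func main() {
	gen := 0
	c := &conn{getCreds: func() (string, string) {
		gen++ // pretend a new short-lived JWT is minted per attempt
		return "nats-canary", fmt.Sprintf("jwt-%d", gen)
	}}
	c.reconnectLoop(3)
	fmt.Println(c.sent) // prints [jwt-1 jwt-2 jwt-3]
}
```

By contrast, mutating an options struct from a disconnect handler would write the same fields that the reconnect goroutine reads, which is the kind of race being discussed here.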
Awesome. Let me know if I can help with reviews or anything in any way. |
Great! When will a release be cut for this? |
We have a few more things to add for the next release, so I would say early next month is viable. |
@piotrpio im going to |
@piotrpio it worked beautifully 😭 thanks again. One thing I will note is that I didn't see a setter, so I'm doing it like this:

// userInfoCB is a callback function for the NATS client to get the Identity JWT on auth callout JWT expiry
// see https://github.com/nats-io/nats.go/issues/1694
func (c *Canary) userInfoCB() (string, string) {
	c.logger.Infof("Getting Identity JWT from %s", socketpath)
	iamCreds, err := c.iamRuntime.GetAccessToken(c.iamCtxt, &identity.GetAccessTokenRequest{})
	if err != nil {
		c.logger.Fatalf("Failed to get access token: %v", err)
	}
	return "nats-canary", iamCreds.Token
}

// AddNATSConnAC adds a NATS connection using the IAM Runtime
func (c *Canary) AddNATSConnAC() *Canary {
	c.logger.Infof("Creating new NATS connection")
	newNC, err := nats.Connect(
		natsURL,
		nats.UserCredentials(sentinelFile),
		nats.UserInfoHandler(c.userInfoCB),
	)
	if err != nil {
		c.metrics.Collectors[NATSConnErrCounter].(*prometheus.CounterVec).With(prometheus.Labels{"kind": c.kind}).Inc()
		c.logger.Fatalf("Failed to create NATS connection: %v", err)
	}
	c.logger.Infof("New NATS connection established")
	c.metrics.Collectors[NATSConnEstablished].(*prometheus.CounterVec).With(prometheus.Labels{"kind": c.kind}).Inc()
	c.NatsConn = newNC
	c.addJetStream() // use the new connection to create the JetStream context
	return c
}
Heyo - Is this going to be included in a release anytime soon? Thank you! |
Hey! Yes, we'll be releasing the client this week. |
Proposed change
I am experiencing "fuzzy" behavior regarding the following. Fuzzy as in, the code works when the NATS client is connected to a single URL, but it doesn't work when connected to multiple URLs.
We are using the NATS Auth Callout, but client side, the Password field needs to change each time the disconnect callback is called, which happens when the auth callout JWTs expire (e.g. hourly). Specifically, our auth callout backend is interpreting a "password" specific to our in-house IAM stack, and those also expire.
Because our auth callout tokens expire every hour, our disconnectedCB is called every hour. And in that, we want to change conn.Opts.Password so that it uses the new password (the client has to regenerate one at this time) on the reconnection attempt (see below, the actual auth call is made after the reconnected callback, so "hook wise", we already have a hook at the right spot).
My function currently looks like this:
I think it has something to do with this warning, which I just saw:
nats.go/nats.go, lines 537 to 538 in e3df53d
It works in the case where the client only was given one cluster URL but not multiple.
I eventually tracked down this call chain on reconnects:
nats.go/nats.go, lines 2781 to 2787 in e3df53d
nats.go/nats.go, line 2873 in e3df53d
nats.go/nats.go, line 2886 in e3df53d
nats.go/nats.go, line 2380 in e3df53d
nats.go/nats.go, line 2644 in e3df53d
nats.go/nats.go, lines 2551 to 2566 in e3df53d
That code makes it unclear when this Password change is going to be reflected. My fear/suspicion is that, in the multiple URL case, that User is populated, triggering the if, which is why it sends the old token, which is why it breaks. In the single URL case, that is not populated, thus triggering the else.
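The if/else suspected here can be pictured with a small standalone sketch. This is not copied from nats.go: `options` and `pickCreds` are hypothetical names, and the sketch only assumes the behavior described above, namely that user info attached to the current server URL takes precedence and the connection options are consulted only when the URL carries none.

```go
package main

import (
	"fmt"
	"net/url"
)

// options is a hypothetical stand-in for the client's connection options.
type options struct {
	User, Password string
}

// pickCreds mirrors the suspected branch: if the server URL carries user
// info, use it; otherwise fall back to the options.
func pickCreds(server string, o options) (user, pass string, err error) {
	u, err := url.Parse(server)
	if err != nil {
		return "", "", err
	}
	if u.User != nil {
		// The "if" branch: whatever is attached to the URL wins, even
		// when the options were updated more recently.
		pass, _ = u.User.Password()
		return u.User.Username(), pass, nil
	}
	// The "else" branch: take the values from the options.
	return o.User, o.Password, nil
}

func main() {
	opts := options{User: "nats-canary", Password: "new-token"}

	user, pass, _ := pickCreds("nats://nats-canary:old-token@host:4222", opts)
	fmt.Println(user, pass) // prints nats-canary old-token

	user, pass, _ = pickCreds("nats://host:4222", opts)
	fmt.Println(user, pass) // prints nats-canary new-token
}
```

If the client behaves this way, updating the options alone would have no effect whenever the selected server URL already has user info attached, which would match the symptom reported here.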
I am trying to see how I can explicitly force it to use a new password, either via setting some variable sure to be picked up, or via a callback. Maybe a flag that says "always override the auth from the connection options".
This block might be related. I was trying to find anywhere where nc.current.url is set:
nats.go/nats.go, lines 1821 to 1827 in e3df53d
Use case
We are using the NATS Auth Callout, but client side, the Password field needs to change each time the disconnect callback is called, which happens when the auth callout JWTs expire (e.g. hourly). Specifically, our auth callout backend is interpreting a "password" specific to our in-house IAM stack.
Contribution
No response