-
Notifications
You must be signed in to change notification settings - Fork 699
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Subject interning #824
Subject interning #824
Conversation
nats.go
Outdated
@@ -2692,7 +2692,10 @@ func (nc *Conn) processMsg(data []byte) { | |||
} | |||
|
|||
// Copy them into string | |||
subj := string(nc.ps.ma.subject) | |||
subj := sub.Subject | |||
if subj != string(nc.ps.ma.subject) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no allocations here in case of plain subject without wildcard used for subscription
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you do a subscription those subjects can contain wildcards, and we have utilities to test for that. When a NATS application receives a message they are always on literal subjects (no wildcards).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you point me where can I find them? Somewhere in the nats-server? Then it's possible to extend Subscription
struct with wildcard bool
or smth like that and change this subj != string(nc.ps.ma.subject)
to this sub.wildcard
and make it O(1) in all scenarios instead of O(n) in worst case scenario.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But for wildcard, are 2 copies made here? since we call string(nc.ps.ma.subject) twice? So yes, maybe evaluating if the subscription is wildcarded when created (so only once) and use that here would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kozlovic no, because of Go compiler optimisation take place here, just for reference: bytes.Equal works same way if you check internals
// Neither cmd/compile nor gccgo allocates for these string conversions.
return string(a) == string(b)
Thanks, will have a few others take a look. |
I believe if you only store and retrieve from the map when a message is received that would solve the problem. |
in case of high amount of different subjects (matching wildcard) we can overflow map and it will grow large, may be inappropriate, it needs to be sometimes cleaned and requires more logic, also to check for isWildcard looks much simpler because of we know beforehand type of subscription subject plain or wildcard and it cannot change. Later we can optimise interning for case of |
I was assuming you would use an LRU of some sort etc.. |
type Suscription struct {
// ...
// Whether wildcard used in Subject
wildcard bool
// ...
} to struct // wildcard will check a subject name for wildcardness.
func wildcard(subj string) bool {
return strings.ContainsAny(subj, "*>")
} it will be used sub := &Subscription{Subject: subj, Queue: queue, mcb: cb, conn: nc, jsi: js, wildcard: wildcard(subj)} and subj := sub.Subject
if sub.wildcard {
subj = string(nc.ps.ma.subject)
} so it will be much better than comparing string for each message processed. |
Each message you receive has a non-wildcard subject. I think a low contention LRU would be fine here when receiving a message. Note that with JetStream the subject will match the origin subject not the low level one, so even if you did a non-wildcard subscription you still need to check the subject. |
Ah right, with JetSteam the incoming message subject will likely NEVER match the subscription's subject, even if non-wildcard'ed. At this point (adding a map, etc..) I am really not sure if that will yield any benefit. We may trade one mem copy with CPU time that will make things worse. But if it can be evaluated that it does improve (for wildcard, non wildcard and JetStream), why not. |
I am not known with the way JetStream works yet, I'll check it |
@kozlovic it's not match subject passed to js.SubjectSync, etc but it may match internal created Subscription.Subject or not match depending on wildcard it is or not. |
No. The subject passed to js.Subscribe() is not the one going to be used by the internal subscription created in js.Subscribe() call. This will likely be a non-wc, but the point is that it could be a subscription on say "_INBOX.1", but the server will send to this subscription a message on subject "foo". This is because the server knows the subscription ID, but rewrites the subject. So you can't rely on the fact that the incoming message's subject will be the subscription's subject (even with no WC), at least for JetStream. |
Looks like I had a typo in previous build) yes you are right but delivery passed to subscribeLocked so wildcard function detect it anyway. |
Maybe it's time to start doing smth like this? instead of single line, looking forward for your suggestions. sub := &Subscription{
Subject: subj,
Queue: queue,
mcb: cb,
conn: nc,
jsi: js,
wildcard: wildcard(subj),
} |
But again, for JetStream, the subscription may be done on "_INBOX.1" but message may arrive on "foo". This has nothing to do with wildcards. At the very least, you should do this change only for |
Why is that happening, is JetStream some sort of broker that's why messages could be messed up? |
JetStream is the persistence layer added to core NATS. It rewrites subjects. So you can't really write a test for this outside of using JetStream, unless you write your own "hand crafted" server, which we have in some tests... |
@kozlovic sounds reasonable, added check for jsi != nil || wildcard |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, let's squash into single commit before merging
utilities @derekcollison mentioned inside nats-server // Determine if a subject has any wildcard tokens.
func subjectHasWildcard(subject string) bool {
return !subjectIsLiteral(subject)
}
// Determine if the subject has any wildcards. Fast version, does not check for
// valid subject. Used in caching layer.
func subjectIsLiteral(subject string) bool {
for i, c := range subject {
if c == pwc || c == fwc {
if (i == 0 || subject[i-1] == btsep) &&
(i+1 == len(subject) || subject[i+1] == btsep) {
return false
}
}
}
return true
} But they are unexported. |
We could port, or tokenize the subject, which makes the code simpler (the one in server is written for performance, not readability). But even your current approach is ok to me since the worse case is that a string does not really have real wildcard, just |
The recent PR #824 tried to optimize reducing memory copy by using the non JS, non wildcard subscription's subject as the message subject. However, in case where a NATS subscription is used for the delivery subject of a AddConsumer call, then this breaks and the message would be received with the subscription subject instead of the subject that we get from MSG parsing. This PR basically reverts changes from #824 and add a subject to make sure that we catch such issue in the future. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
@moredure For your information: we had to release version v1.12.3 that reverts changes from your PR. There was unfortunately issues with non JS subscriptions used to receive messages from a JetStream consumer. In that case, message's subject was rewritten but we would use the subscription's subject (say the inbox, instead of "foo"). I added a test that will now make sure that any change to try to optimize things still don't break this use case. |
Aha, see, I guess I needed to make this change before jetmagic happens) ok, I will investigate this thing more deeply later since this change fits my use case and not producing bugs with invalid subscription. subj := sub.Subject
if subj != string(nc.subject) {
subj = string(nc.subject)
} |
Yes, at least in the context of JetStream (for now). |
@kozlovic I don't know, to make another pull request with such a behaviour or is it better to postpone until better times?) |
Would have to profile because it seems to me that doing what you suggest may still result in a copy (simply for the comparison, or maybe compiler optimize this, or we use bytes.Equal(), etc..). If you want and can provide some bench with various subscription types (wildcard, non wildcard, and various subjects length) and see if performance actually increases, etc.. this would help. But I understand that all this takes time and you may not have the time since your modification works well for your use case. Thanks again for the contribution, regardless. |
No it's not, same thing with |
@derekcollison Please review. The price of this comparison near 2ns per operation but in case of plain subscriptions it reduces single allocation per operation and results in speedup. Looking forward for your comments. If somehow to know beforehand that subscription using wildcard then it's possible to make it no cost. So I suggest to merge this pull request first, than make some changes to subscription struct to contain isWildcard bool field and then switch to using it. so no need to compare strings just single if isWildcard and nothings else.