fix: duplicate message forwarding in filter service #2842
Conversation
You can find the image built from this PR at
Built from 2ad6e43
Looks great! Thanks so much!
I don't think an ever-growing lookup table is a good solution for this problem.
Please evaluate whether TimedCache or a similar approach can do it.
Please note that gossipsub also does something very similar to filter duplicates out of re-publishing.
- I don't think we need to track this per peer: a message is instantly pushed to all subscribed peers, so if it is a duplicate, it is a duplicate for all of them.
So in Florin's test there was no relay involved. He was sending messages via the REST API to be relayed, but the same node was also serving filter subscriptions. In this case filter gets the messages before relay, hence the duplicate push. So it may be a bit costly to maintain a dup-check database for this case alone.
Hi @NagyZoltanPeter, I want to clarify that when I mentioned 'relay message,' I meant 'send message,' not using Waku relay. Let's set aside dynamic subscription for now and focus on the original issue. Regarding your suggestion about not needing to store messages for every peer, I'm considering a shared msg_hash pool using TimedCache. I have a question: could you please advise on how long we should cache these messages? The default is set to just 10 milliseconds.
Ok, I see. Sorry, maybe I was not very clear. When a message comes in, filter collects the subscribed peers and sends them the message right away; there is no message cache involved here. Back to your question: gossipsub retains its seen-message lookup for 2 minutes. I don't think any such solution, if applied, should hold entries for longer. But it is still a question of requirements: what exactly do we want to prevent with this solution?
Thanks, @NagyZoltanPeter, now it's clearer to me. If we create and publish a message, and then publish the same message again, why should we prevent our subscribers from seeing it a second time? I assume there might be scenarios where we publish the same message with the same timestamp. If the payload is the same but the timestamp is different, our fix will not stop the second message from being published. However, let's ask Florin for further clarification. Regarding the timeout variable, it's a default variable, so we can adjust it as needed; that's why I'm asking what value we should use for our solution (Default Timeout). For the sake of implementation, I will proceed with the timed-cache logic to better my understanding. If, as you mentioned, this concern is not necessary, we will not merge the PR. Hi @fbarbu15, could you please help us with our confusion?
I see, thank you. In this case it is not the default that is used, but this one.
I think this kind of duplicate is possible if:
The other, more generic case is when the filter service pushes messages received from relay. All in all, a message-hash dup check with a timed cache is a good possible solution, precisely to avoid an uncontrolled message-hash collection: we expect nodes to run for a long time and handle an enormous number of messages, and an unbounded table would eventually use all available memory.
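To make the memory concern concrete, here is a minimal illustrative sketch (not from the PR) contrasting an unbounded seen-set with nim-libp2p's TimedCache, whose entries are evicted once the timeout elapses:

import std/sets
import chronos/timer
import libp2p/utils/timedcache

var seenForever: HashSet[string]  # grows for as long as the node runs
seenForever.incl("msgHash1")      # never reclaimed

var seenRecently = init(TimedCache[string], timeout = 2.minutes)
discard seenRecently.put("msgHash1")  # evicted ~2 minutes later, bounding memory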
Seems much, much better!
I added some short suggestions; please also consider adding unit tests to check the dup cases.
waku/waku_filter_v2/protocol.nim
Outdated
wf.messageCache.expire(Moment.now())
if wf.messageCache.contains(msgHash):
Suggested change:
- wf.messageCache.expire(Moment.now())
- if wf.messageCache.contains(msgHash):
+ if wf.messageCache.put(msgHash):
I think this is simpler and does exactly the same thing: put will call expire and will also refresh the timestamp.
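For context, a minimal self-contained sketch of the put-based dedup check (the FilterService type and shouldPush proc are illustrative names, not nwaku's actual ones; only the TimedCache calls mirror the PR):

import chronos/timer
import libp2p/utils/timedcache

type FilterService = object
  messageCache: TimedCache[string]

proc shouldPush(fs: var FilterService, msgHash: string): bool =
  # put() expires stale entries and returns true if the hash was already
  # cached, i.e. the message is a duplicate within the timeout window.
  not fs.messageCache.put(msgHash)

var fs = FilterService(messageCache: init(TimedCache[string], timeout = 2.minutes))
assert fs.shouldPush("0xabc")      # first sighting: push to subscribers
assert not fs.shouldPush("0xabc")  # duplicate within 2 minutes: skip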
waku/waku_filter_v2/protocol.nim
Outdated
## expiration refreshed that's why update cache every time, even if it has a value.
discard wf.messageCache.put(msgHash, Moment.now())
Suggested change:
- ## expiration refreshed that's why update cache every time, even if it has a value.
- discard wf.messageCache.put(msgHash, Moment.now())
This is not needed if my previous suggestion above is applied.
waku/waku_filter_v2/protocol.nim
Outdated
@@ -295,6 +307,7 @@ proc new*(
       subscriptionTimeout, maxFilterPeers, maxFilterCriteriaPerPeer
     ),
     peerManager: peerManager,
+    messageCache: init(TimedCache[string], timeout = 2.minutes),
I would personally add the cache timeout as an input argument to WakuFilter's new, because it enables configurability even if in most cases the default is chosen.
It also makes it easy to create unit tests that automatically check the feature's correctness.
In a unit test you can then apply a 1.seconds timeout, which is more than enough to check such dup cases.
WDYT?
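A minimal sketch of that suggestion (the messageCacheTTL parameter name and reduced FilterService type are illustrative; the real new takes many more arguments):

import chronos/timer
import libp2p/utils/timedcache

type FilterService = object
  messageCache: TimedCache[string]

proc newFilterService(messageCacheTTL: Duration = 2.minutes): FilterService =
  # Callers keep the 2-minute default; tests can pass a much shorter TTL.
  FilterService(messageCache: init(TimedCache[string], timeout = messageCacheTTL))

let prodFilter = newFilterService()                            # default: 2 minutes
let testFilter = newFilterService(messageCacheTTL = 1.seconds) # fast-expiring, for tests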
Yeah, I agree, this feels hard-coded. Let's set 2 minutes as the default and allow the Waku filter to modify it as needed. Thanks for mentioning unit tests; I'm happy to add a test case.
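A rough shape such a test could take (the suite and test names are illustrative; TimedCache's optional now argument keeps the test deterministic without real sleeps):

import unittest2
import chronos/timer
import libp2p/utils/timedcache

suite "filter message dedup":
  test "duplicate hashes are dropped until the cache entry expires":
    var cache = init(TimedCache[string], timeout = 1.seconds)
    let t0 = Moment.now()
    check not cache.put("hash1", t0)                 # first push: new
    check cache.put("hash1", t0 + 100.milliseconds)  # re-push: duplicate
    check not cache.put("hash1", t0 + 2.seconds)     # after expiry: new again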
Closes #2320
PR Description:
We identified duplicate message forwarding in the filter node service; this fix resolves it.