-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: return erring response if Lightpush content topic is empty #2083
Conversation
You can find the image built from this PR at
Built from c1515c6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thank you!
I see that two CI tests are failing...
The second error may be connected to the fact that in the test that I've added ( I'm not sure how to explain the first failing test. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I think the approach here of trying to have a default response is the right one. Changes requested is that the bug is about an unset field and RPC decoding failure (not content topic being an "empty string"). My suggestion is to first try to trigger this error condition in the unit test and then make sure that these failure cases also lead to a properly constructed response with error message. The same condition is triggered when the WakuMessage is encoded in any other non-supported way and can't be decoded, as reported in #2059. This PR should also close that issue.
waku/waku_lightpush/protocol.nim
Outdated
if contentTopic == "": | ||
response = PushResponse(is_success: false, info: some(emptyContentTopicFailure)) | ||
waku_lightpush_errors.inc(labelValues = [emptyContentTopicFailure]) | ||
error "empty content topic", error=emptyContentTopicFailure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite interpret the bug in #1641 in this way. The issue is not that the content topic is an "empty string" it's that that field is unset in the protocol buffer, resulting in a wrongly encoded RPC that fails to decode. The same issue states:
Send a light push rpc request that contains a missing protobuf field such as content topic
So it's not related to the content topic specifically, but any missing field that causes an rpc decode failure. Since we still return in lines 45 and 51 on invalid RPC (protobufs), afaict the original issue won't be fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, now I think i understand the issue properly!
If we want to return a PushResponse
after a decoding failure, what should its requestId
be set to? Normally, it is copied from the corresponding field in PushRequest
, but if the request cannot be decoded properly, we can't count on that. Can we just omit the requestId
field in that case? It is not wrapped in Option
in the definition of PushRPC
... Or shall we use some default value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmm. This is a good point. I would actually guess that in almost all cases decoding fail because of an inner field (either PushRequest
or WakuMessage
) failing to decode, rather than the outer PushResponse
. In other words, it may be possible to change to a two stage decoding process where the requestId
can be extracted before attempting to decode the encapsulated request.
However, for now this may be overkill and the best option is probably to initialise the requestId
as an empty string.
@fryorcraken @danisharora099 since the PushRPC and PushRequest fail to decode properly in this bug, it's difficult to extract a requestId
. Would a decent solution (from js-waku's perspective) be to return a well-formed response, with error code, but with the requestId
set to an empty string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fryorcraken @danisharora099 since the PushRPC and PushRequest fail to decode properly in this bug, it's difficult to extract a
requestId
. Would a decent solution (from js-waku's perspective) be to return a well-formed response, with error code, but with therequestId
set to an empty string?
This is a fair solution IMO.
We currently do not use the requestId
, we are able to assume the response is on the same multiplex stream than the request (or at least it translates this way in js-libp2p API).
tests/test_waku_lightpush.nim
Outdated
client = newTestWakuLightpushClient(clientSwitch) | ||
serverPeerId = serverSwitch.peerInfo.toRemotePeerInfo() | ||
message = fakeWakuMessage(contentTopic="") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bug is that an empty response is received, not on content topic being an empty string, but that the Waku Message and PushRPC is constructed without this field and decoding fails.
Ah, I think it's not up to the lightpush service node to interpret this. In other words, it's not about content topics, but about properly constructed protocol buffers and fields. The issue is that the message being published MUST be a well-formed WakuMessage (as in https://rfc.vac.dev/spec/14/). If for any reason this message field (or other field in the PushRequest) fails to decode properly, we should return a proper response. |
Thinking more about this, I feel that I don't grasp some fundamentals of what's happening here. What exactly does it mean to "decode" a message? In particular, how can the following line not fail, if there is no, say, nwaku/waku/waku_lightpush/protocol.nim Line 56 in e85f05b
After all, the left-hand-side
Now I understand it; see my next comment. |
Experimenting a bit more, I am not sure I understand how to reproduce the behavior in question. AFAICT, Nim assigns default values to object fields if they are not provided. I modified a playground example with a simplified version of
The result is:
My point is: how do I generate a |
The problem here is not related to any fields "missing" inside a Nim object - as you say, during construction of the object default values will be filled into each field regardless if initialised or not (sidenote is that For example, this line: nwaku/waku/waku_lightpush/protocol.nim Line 41 in e85f05b
attempts to decode the raw bytes from the wire according to the protobuf definition in https://rfc.vac.dev/spec/19/#payloads. This in turn attempts to decode raw bytes into WakuMessage using this codec: nwaku/waku/waku_core/message/codec.nim Line 33 in e85f05b
However, if WakuMessage does not contain a content topic, the decoding stops and returns an error nwaku/waku/waku_core/message/codec.nim Lines 46 to 49 in e85f05b
True enough, this is not immediately easy to reproduce, as creating a From the perspective of lightpush it's not really necessary to test that decoding fails for every type of spec violation (i.e. we do not have to test for missing content topic specifically - that should be tested with the rest of the |
Makes sense. It's not clear yet how to use these raw bytes. I can no longer use nwaku/tests/test_waku_lightpush.nim Line 104 in e85f05b
Similarly, nwaku/waku/waku_lightpush/client.nim Line 71 in e85f05b
Going down the call stack, it seems that here I can insert random bytes instead of nwaku/waku/waku_lightpush/client.nim Line 46 in e85f05b
I'm unsure though, what is the proper way to bring this logic into the test? Copy-pasting the whole |
Mmm. I see your point. I think this is a good indication that our code is not very unit testable (a sign that concerns aren't well separated). I wouldn't test this from the client perspective, but rather try to test the lightpush (server-side) protocol directly. One way to do this is to extract the code inside the handler https://github.com/waku-org/nwaku/blob/master/waku/waku_lightpush/protocol.nim#L39-L77 that returns a proc handle(conn: Connection, proto: string) {.async, gcsafe, closure.} =
let buffer = await conn.readLp(MaxRpcSize.int)
let response = handleRequest(buffer)
await conn.writeLp(response.encode().buffer) with WDYT of this option? |
Thank you @jm-clius for the tip, I agree with the approach.
It should also probably take a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logic is good but for the style here's some tips.
- Leave empty lines here and there. It helps readability.
- Use
.valueOr
or.isOkOr
to get the value in options or results. see -> chore(refactoring): apply isOkOr || valueOr approach #1969 - Structure your
if
s to minimize rightward drift.
What would be the best way to reconcile this with avoiding early returns? My idea was that as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Some minor comments and nitpicks below.
This avoids
|
(Judging by We don't only generate a |
Feel free to ignore it was just a suggestion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks great indeed! I simply add a recommendation to enrich a little bit the message in case or error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks!
35ecf22
to
3fb5a2b
Compare
This PR may contain changes to configuration options of one of the apps. If you are introducing a breaking change (i.e. the set of options in latest release would no longer be applicable) make sure the original option is preserved with a deprecation note for 2 following releases before it is actually removed. Please also make sure the label |
This PR may contain changes to database schema of one of the drivers. If you are introducing any changes to the schema, make sure the upgrade from the latest release to this change passes without any errors/issues. Please make sure the label |
I tried to retroactively sign my older commits (which GitHub required) but apparently messed something up. Now all my commits in this branch and a prior commit by @vpavlin are squashed together. Shall I squash-and-merge anyway? If so, should a commit message describe my changes on top of the older changes squashed here? The whole thing looks ugly, I must admit, and I'm sorry for that. |
The diff doesn't look right + afaik there is nothing to squash, i can see just 1 commit. Also it seems to be 30 commits behind master. So I would start with rebasing the PR :) |
3fb5a2b
to
cdfb72c
Compare
That was a great idea, thanks @alrevuelta ! Now at least I only see relevant changes in this PR. However, they are now seemingly "authored by Vaclav and committed by me", and this commit is not signed, blocking the merge again (which is why I started this local-squash business in the first place). Any ideas on how to go about this? |
mm as an easy fix I would say:
|
cdfb72c
to
4f0b7b6
Compare
Description
The Lightpush PFC prohibits empty content topics. The implementation did not reflect this. This PR fixes this: the server now checks if the content topic is empty, and if it is, sends an erring response to the client (no interaction with the P2P network occurs). A new test is added to test this behavior.
Note: this PR only considers the "empty content topic" erring case. There are other erring cases (e.g., decoding error), for which the server does not return a response. I think these are to be addressed in a separate issue / branch.
P.S. I wanted to put a link to the RFC that says that content topic must not be empty, but cannot find it (apparently, it's not 19/WAKU2-LIGHTPUSH). Can anyone point to the spec that specifies this restriction?
Changes
How to test
Issue
closes #1641