Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add protection in rest service to always publish with timestamp if user doesn't provide it #2261

Merged
merged 7 commits into from
Dec 6, 2023

Conversation

Ivansete-status
Copy link
Collaborator

Description

Before this PR, the relay REST service accepted "publish" events without timestamp info.
Therefore, the Postgres database ended up with the following message set:
image

That caused pagination issues within the Store protocol because the sender_time attribute (part of the Cursor object) was always set to 0.

Changes

Issue

This was noticed while dogfooding status.test in the context of status-im/infra-status#37

Copy link

github-actions bot commented Nov 30, 2023

You can find the image built from this PR at

quay.io/wakuorg/nwaku-pr:2261

Built from 6b768ba

@Ivansete-status Ivansete-status self-assigned this Nov 30, 2023
@Ivansete-status Ivansete-status marked this pull request as ready for review November 30, 2023 16:52
@jm-clius
Copy link
Contributor

Interesting. Can you expand a little bit more on why the 0 timestamps caused issues with pagination? Although the timestamp attribute is mandatory in the Waku Network (which is a certain restricted way of using Waku), the raw protocol does not require timestamps to be present: https://rfc.vac.dev/spec/14/#message-attributes
The only issue this caused for Store was how to do the sorting in the archive, but for this we introduced a storedAt timestamp which would always be non-zero. As far as I can see, 0 timestamps should not affect cursor creation or lookups? As long as the Index combination remains unique, the value of the fields constituting the cursor should not matter?

@gabrielmer
Copy link
Contributor

I think we've discussed before about automatically adding in REST the current timestamp if the user does not set it in the request.
If that's the case, shouldn't we add the timestamp for the user instead of responding with an error? or am I missing something and I'm mixing different things? 😅

@Ivansete-status
Copy link
Collaborator Author

Ivansete-status commented Dec 1, 2023

Interesting. Can you expand a little bit more on why the 0 timestamps caused issues with pagination? Although the timestamp attribute is mandatory in the Waku Network (which is a certain restricted way of using Waku), the raw protocol does not require timestamps to be present: https://rfc.vac.dev/spec/14/#message-attributes The only issue this caused for Store was how to do the sorting in the archive, but for this we introduced a storedAt timestamp which would always be non-zero. As far as I can see, 0 timestamps should not affect cursor creation or lookups? As long as the Index combination remains unique, the value of the fields constituting the cursor should not matter?

Thanks for the comment!
Let me elaborate more.
Let's imagine a Store request that needs a couple of pages and let's consider that all the messages are archived with sender_time == 0 as shown in the pic in the PR description.

In this case, the very first page will be returned with the following cursor definition (notice that senderTime is zero in the cursor .)

cursor: response.cursor.map(proc(cursor: ArchiveCursor): HistoryCursor = HistoryCursor(pubsubTopic: cursor.pubsubTopic, senderTime: cursor.senderTime, storeTime: cursor.storeTime, digest: cursor.digest)),

After that, a go-Store client will ask for the second page of messages from the following code. Notice that the senderTime will be zero in this case too:

https://github.com/waku-org/go-waku/blob/16d59f37d02a8375466c027f876acd05621bc9f6/waku/v2/protocol/store/waku_store_client.go#L457-L459

https://github.com/waku-org/go-waku/blob/16d59f37d02a8375466c027f876acd05621bc9f6/waku/v2/protocol/store/pb/store.pb.go#L196

https://github.com/waku-org/go-waku/blob/16d59f37d02a8375466c027f876acd05621bc9f6/waku/v2/protocol/store/pb/store.pb.go#L126

After that, the following if statement will happen, considering it a wrong request, in nwaku, and therefore the rest of the Store requests won't proceed. In other words, only the very first page is delivered to the store client.

if not ?pb.getField(3, senderTime):
return err(ProtobufError.missingRequiredField("sender_time"))

Set now timestamp to the WakuMessage if the client didn't send this
information in the request.

This could happen in a request like:

curl -X POST
"http://127.0.0.1:8646/relay/v1/messages/%2Fwaku%2F2%2Fdefault-waku%2Fproto"
-H "content-type: application/json"  -d
'{"payload":"'${payload}'","contentTopic":"my-ctopic"}'
@Ivansete-status Ivansete-status force-pushed the fix-timestamp-zero-protection branch from 552bc3d to a7213c4 Compare December 1, 2023 11:47
@Ivansete-status
Copy link
Collaborator Author

Thanks for the suggestion @gabrielmer !
I've applied your recommendation. I think it gets better now!

Copy link
Contributor

@SionoiS SionoiS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@@ -72,7 +72,7 @@ suite "Waku Archive - message handling":

## Then
check:
(waitFor driver.getMessagesCount()).tryGet() == 1
(waitFor driver.getMessagesCount()).tryGet() == 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would change the test name. also it should be deliver not driver 🤣

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would change the test name. also it should be deliver not driver 🤣

Good point! I changed the whole tests' names, which is a little bit beyond the scope of this PR.

@Ivansete-status Ivansete-status force-pushed the fix-timestamp-zero-protection branch from f541d4b to bc4620e Compare December 1, 2023 13:21
Copy link
Contributor

@gabrielmer gabrielmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thanks so much! 🤩

Copy link
Contributor

@jm-clius jm-clius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed response!

After that, the following if statement will happen, considering it a wrong request, in nwaku, and therefore the rest of the Store requests won't proceed. In other words, only the very first page is delivered to the store client.

Ah, I see now why this happens. However, I think the issue here is that nwaku enforces the existence of a senderTimestamp field in violation of the current RFC. We may indeed change to a timestamp-mandatory store implementation, but that would involve an RFC change and a more extensive data schema review (storedAt receiverTimestamp become obsolete), as I mention below. A good time to do that would be in the new, reviewed Store protocol that @ABresting is working on, where timestamp is likely to be mandatory.

I do agree with the REST API adding timestamps on the user's behalf though. For now to proceed, I'd be happy to approve two different PRs:

  1. REST API changes to add timestamp when publishing new messages (received messages should never be modified)
  2. Remove strict requirement for sender timestamp in cursor in nwaku (fixing compliance with RFC).

Comment on lines 70 to 73
var timestamp = msg.timestamp.get(0)

if timestamp == 0:
timestamp = getNanosecondTime(getTime().toUnixFloat())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit confused by this. Where is this used? Afaics this is to convert received filter messages (as a client) to Waku Messages, although I can't find where this is called in the implementation. What would be the purpose of modifying timestamps in received messages on the API?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this was thought to be a change aligned to the Store change. I can remove it if that doesn't make sense to you in this PR.

Comment on lines 48 to 49
if msg.timestamp == 0:
return ok()
return err(timestampIsZero)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I agree with the REST API adding timestamps on behalf of the application when publishing messages. I still don't think the Store should (for now) enforce sender timestamps. Doing so goes against the existing RFC. Furthermore, this obsoletes the entire receiverTimestamp, storedAt, etc. mechanism which is a larger cleanup exercise. We may indeed decide to follow this route (and perhaps should have from the beginning), but deprecating the RFC provision for optional timestamps is a bit larger, involves an RFC change and a more extensive review of the existing data scheme.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comment!
The https://rfc.vac.dev/spec/13/ RFC enforces the existence of senderTime in the Store request, although it is true that the RFC doesn't enforce any value. This can be seen because the senderTime is not optional in the protobuf definition:
image

On the other hand, in the archive.nim module we are enforcing the senderTime to be within a certain time window:

let
now = getNanosecondTime(getTime().toUnixFloat())
lowerBound = now - MaxMessageTimestampVariance
upperBound = now + MaxMessageTimestampVariance
if msg.timestamp < lowerBound:
return err(invalidMessageOld)
if upperBound < msg.timestamp:
return err(invalidMessageFuture)

... but it is weird that we accept msg.timestamp == 0 , isn't it?

@Ivansete-status Ivansete-status force-pushed the fix-timestamp-zero-protection branch from aebc912 to 9c37289 Compare December 4, 2023 19:02
@jm-clius jm-clius self-requested a review December 5, 2023 16:18
Copy link
Contributor

@jm-clius jm-clius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM now.

@Ivansete-status Ivansete-status changed the title fix: Disallow archiving messages that don't have timestamp (sender_time) info or is set to 0 fix: add protection in rest service to always publish with timestamp if user doesn't provide it Dec 6, 2023
This is needed because we have the following in waku_archive/archive.nim

method validate*(validator: DefaultMessageValidator, msg: WakuMessage):
ValidationResult =
  if msg.timestamp == 0:
    return ok()
@Ivansete-status Ivansete-status force-pushed the fix-timestamp-zero-protection branch from 4bf048a to 499f107 Compare December 6, 2023 11:41
@Ivansete-status
Copy link
Collaborator Author

After a chat with @jm-clius , we left the waku_archive/archive.nim with the following:

method validate*(validator: DefaultMessageValidator, msg: WakuMessage): ValidationResult =
  if msg.timestamp == 0:
    return ok()

Therefore, messages with timestamp zero are allowed. Kindly ask Hanno for further details as to why this is interesting as he has a better understanding of the system :D

@Ivansete-status Ivansete-status merged commit 42f1957 into master Dec 6, 2023
9 of 10 checks passed
@Ivansete-status Ivansete-status deleted the fix-timestamp-zero-protection branch December 6, 2023 13:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants