fog-view: Limit number of user events, and tell client this happened #3151
base: main
Conversation
Jason reported recently that the fog test client failed in test-net with a gRPC maximum response size error. Eventually it recovered, and I'm not sure we fully determined the root cause.

One defect that we noticed during the investigation is that, while the client limits the number of `ETxOutRecord`s it requests, nothing limits the number of user events that fog view returns -- if the client starts from scratch, fog view returns all the events in one response.

This patch is meant to fix that. The SQL query is changed so that there is a limit on how many user events it returns, and the results are ordered ascending by user event id. The limit is configurable as a server startup parameter. When the limit is reached, we flag to the client that there may be more user events. Then we update the `fog-view-protocol` polling module so that when this flag is set, it keeps asking for more seeds until the flag isn't set any more.

I believe that this is a backwards-compatible change, because old clients will ignore this flag, and for new clients talking to old servers, the flag will have a default value of false, which matches the status quo behavior of always returning all the results.
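To make the client-side change concrete, here is a minimal sketch of the amended polling loop. `FogViewConnection`, `query_events`, and the stand-in types are hypothetical names for illustration; only the `may_have_more_user_events` flag comes from this patch:

```rust
// Minimal stand-in types; the real code lives in fog-view-protocol.
struct UserEvent;

struct QueryResponse {
    user_events: Vec<UserEvent>,
    next_start_from_user_event_id: u64,
    may_have_more_user_events: bool,
}

trait FogViewConnection {
    fn query_events(&mut self, start_from_user_event_id: u64) -> Result<QueryResponse, String>;
}

/// Keep querying until the server stops saying "there may be more events".
fn fetch_all_user_events<C: FogViewConnection>(conn: &mut C) -> Result<Vec<UserEvent>, String> {
    let mut events = Vec::new();
    let mut start_from_user_event_id = 0;
    loop {
        let resp = conn.query_events(start_from_user_event_id)?;
        events.extend(resp.user_events);
        start_from_user_event_id = resp.next_start_from_user_event_id;
        // Old servers never set this flag, so it deserializes to false and the
        // loop makes a single pass, matching the status quo behavior.
        if !resp.may_have_more_user_events {
            break;
        }
    }
    Ok(events)
}
```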
I still need to add some new test coverage for this feature.
❌ Heads up, I tried building full-service using this branch and it failed.
```diff
@@ -145,6 +145,11 @@ message QueryResponse {
     /// This can be used by the client as a hint when choosing cryptonote mixin indices.
     /// This field doesn't have the same "cursor" semantics as the other fields.
     uint64 last_known_block_cumulative_txo_count = 9;
+
+    /// If true, this means that due to limits, we could not return all the requested
+    /// user events in one response. Clients cannot compute an accurate balance check
+    /// until they have received all relevant user events.
```
In the PR description you write: "I believe that this is a backwards compatible change, because old clients will ignore this flag".
But if it's true that "clients cannot compute an accurate balance check until they have received all relevant user events," then how can an (old) client that ignores this flag accurately compute a balance check?
Good question:
- Today, the number of user events in postgres is not so large that we can't return them all, so all the existing clients are fine, and will be fine with servers before and after this change.
- After this change, we can ship clients that will be okay even when that is no longer the case and the servers have to do some sort of pagination.
- The backwards-compatible part is: new clients that talk to old servers, in the time period before there are too many user events, will get all the events, and the flag will default to false when they deserialize (see the sketch after this list), so they will correctly decide that they got all the events.
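To make the deserialization point concrete, here is a minimal sketch using prost; the struct shapes and the field tag 10 are assumptions for illustration, not the actual generated code:

```rust
use prost::Message;

// Stand-in for the response an old server produces (no flag field).
#[derive(Clone, PartialEq, Message)]
struct QueryResponseOld {
    #[prost(uint64, tag = "9")]
    last_known_block_cumulative_txo_count: u64,
}

// Stand-in for the new response; the tag number 10 is a guess for illustration.
#[derive(Clone, PartialEq, Message)]
struct QueryResponseNew {
    #[prost(uint64, tag = "9")]
    last_known_block_cumulative_txo_count: u64,
    #[prost(bool, tag = "10")]
    may_have_more_user_events: bool,
}

fn main() {
    // An old server serializes a response that has no flag on the wire...
    let old = QueryResponseOld {
        last_known_block_cumulative_txo_count: 42,
    };
    let bytes = old.encode_to_vec();

    // ...and a new client deserializes it: in proto3, a bool absent from the
    // wire decodes to its default, false, which reads as "all requested user
    // events were returned".
    let new = QueryResponseNew::decode(bytes.as_slice()).unwrap();
    assert!(!new.may_have_more_user_events);
}
```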
```rust
let (user_events, next_start_from_user_event_id) = self
    .db
    .search_user_events(start_from_user_event_id, USER_EVENT_LIMIT)?;
let may_have_more_user_events = user_events.len() >= USER_EVENT_LIMIT;
```
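For context, a sketch of the query shape `search_user_events` is described as implementing (ascending by user event id, capped at the limit), assuming a diesel-backed postgres schema; the table, columns, and cursor semantics here are illustrative, not the real schema:

```rust
use diesel::prelude::*;

// Hypothetical schema for illustration only.
diesel::table! {
    user_events (id) {
        id -> Int8,
        event_data -> Bytea,
    }
}

/// Sketch: return up to `max_num_events` events starting from `start_from`,
/// in ascending id order, plus the id to resume from on the next query.
fn search_user_events_sketch(
    conn: &mut PgConnection,
    start_from: i64,
    max_num_events: usize,
) -> QueryResult<(Vec<(i64, Vec<u8>)>, i64)> {
    let rows: Vec<(i64, Vec<u8>)> = user_events::table
        .filter(user_events::id.ge(start_from))
        .order(user_events::id.asc())
        .limit(max_num_events as i64) // the line discussed in the review below
        .load(conn)?;
    // Resume after the last returned event (or stay put if nothing came back).
    let next = rows.last().map(|(id, _)| id + 1).unwrap_or(start_from);
    Ok((rows, next))
}
```

Because of the `limit`, at most `max_num_events` rows come back, so the `user_events.len() >= USER_EVENT_LIMIT` check fires exactly when the limit was filled; it can never see more rows than the limit.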
It appears that we're returning all the `user_events` in the `query_response` below. If there are 12_000 `user_events`, then we return them all to the client AND we set `may_have_more_user_events` to true.

However, the logic for `fn search_user_events` has this line: `.limit(max_num_events as i64)`, and `max_num_events == USER_EVENT_LIMIT`. How then can `user_events.len()` ever be greater than `USER_EVENT_LIMIT`?
I still haven't written tests; I'm going to work on that more soon, sorry.
Co-authored-by: Sam Dealy <33067698+samdealy@users.noreply.github.com>
This may also have the secondary effect of limiting how long the SQL query takes to process, which may be relevant if the number of events is large. (However, I don't expect more than a thousand events here unless fog-ingest was in a crash loop at some point.)