Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[http server] Batch requests #292

Merged
merged 20 commits into from
May 4, 2021
Merged

[http server] Batch requests #292

merged 20 commits into from
May 4, 2021

Conversation

dvdplm
Copy link
Contributor

@dvdplm dvdplm commented Apr 26, 2021

Implement http server batch requests.

Spec.

We diverge from the spec in a few places:

  • sending a batch of notifications should receive the empty string, "", in response. Instead we return {"jsonrpc":"2.0","result":"","id":null} for each notification
  • sending an invalid batch with numbers, e.g. [123] we should return an Invalid Request error (which we do) with the id set to null (which we don't; somehow serde parses the 123 into the id and I don't think it's worth the hassle to fix that)
  • sending a batch of several invalid requests, e.g. [1, 2, 3], should return one Invalid Request for each, but to us that's a Parse Error. Again I think that it's not really worth the hassle to fix.

One thing that annoys me with this PR is that if we decide to support notifications properly, the number of serde_json::from_slice calls will be even bigger than it is now (3 for the worst case), so the lack of support for untagged enums with RawValue becomes really problematic. We'd have to get pretty creative at that point. :/

@dvdplm dvdplm requested review from niklasad1 and insipx April 29, 2021 17:43
@dvdplm dvdplm self-assigned this Apr 29, 2021
Copy link
Contributor

@insipx insipx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does worry me that we don't return exactly to spec, but at the same time I don't think any of these things being slightly off will lead to system-critical errors so I'm pretty sure it's fine too

@@ -153,6 +157,24 @@ impl Server {
Ok::<_, HyperError>(service_fn(move |request| {
let methods = methods.clone();
let access_control = access_control.clone();

let execute =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to make this a standalone function? the start fn is already fairly long, i think it would be good to move some code out

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, and I agree, and I tried but couldn't make it work; I'd have to look up the method from the methods hash map in start() and pass it along so it didn't really shorten up the code much.

Lately I've started to shift my views on when it's right to refactor for briefness. The rule of thumb is still "terse is good", but for some cases I've started to think it's alright to keep the code long when it's "the main loop", whatever that means for each specific case, i.e. the most important thing that a given program does. Splitting things up too much can force the reader to jump around more than is ideal and even if it's long it is sometimes more readable to keep it all in one place. I know this is hand wavey (and for this case here I wanted to do exactly what you suggest), but yeah, just wanted to share those thoughts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's mainly because of the closure that is passed to hyper but yeah this changes made the code really quite hard to read, it wasn't great before either.

Maybe we split it the response handling to helper functions, basically you have to read the bytes here to take ownership over it which used to later to build a message to send to the background task.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I found this more readable but it's still quite long ^^

(maybe because I wrote it lol)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

K, I tried another approach, moving execute to be a method on Server so I can access self.root (i.e. the function pointers we call to run the request), but it doesn't work because self is moved into the closure when calling make_service_fn().
On the client side it's a bit easier because we don't have to call anything. Or I'm too limited to see how to do it! Pointers welcome!

Copy link
Contributor

@insipx insipx May 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could just add another parameter, &Methods (we'd have to import methods from jsonrpsee_util) to the execute function so execute becomes:

fn execute(methods: &Methods, id: Option<&RawValue>, tx: RpcSender, method_name: &str, params: Option<&RawValue>) {
	if let Some(method) = methods.get(method_name) {
		let params = RpcParams::new(params.map(|params| params.get()));
		// NOTE(niklasad1): connection ID is unused thus hardcoded to `0`.
		if let Err(err) = (method)(id, params, &tx, 0) {
			log::error!("execution of method call {} failed: {:?}, request id={:?}", method_name, err, id);
		}
	} else {
		send_error(id, tx, JsonRpcErrorCode::MethodNotFound.into());
	}
}

and then call it like execute(&methods, id, &tx, &method_name, params);

I guess the tradeoff is having a long function signature for execute, though.

I think I agree with your analysis of main_loops though. Often it can be hard to make them shorter and would result in more confusing code than just keeping it all together. In this case I think execute fn is self-explanatory enough in that it just executes the right function for the rpc call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@insipx I could have sworn I tried that; your version works too. I can go either way here, @niklasad1 thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW the performance of @insipx's version is identical.

http-server/src/server.rs Outdated Show resolved Hide resolved
http-server/src/server.rs Show resolved Hide resolved
benches/bench.rs Outdated Show resolved Hide resolved
@niklasad1
Copy link
Member

niklasad1 commented Apr 29, 2021

sending a batch of notifications should receive the empty string, "", in response. Instead we return {"jsonrpc":"2.0","result":"","id":null} for each notification

Technically, we don't support notifications so those are registered as ordinary method calls and treated as that. So we would need have register_notification method that handles them differently to work according to the spec but the response looks wrong anyway.

sending an invalid batch with numbers, e.g. [123] we should return an Invalid Request error (which we do) with the id set to null (which we don't; somehow serde parses the 123 into the id and I don't think it's worth the hassle to fix that)

Can you elaborate what you mean by that? Is it "id":[123]? That would be parsed a number if you call RpcParams::one on it would work yes. Maybe we should have typed Id similar to what we have in the client.

sending a batch of several invalid requests, e.g. [1, 2, 3], should return one Invalid Request for each, but to us that's a Parse Error. Again I think that it's not really worth the hassle to fix.

Right, that's sounds very hard to do properly without an nested enum. Maybe enough to motivation to switch back to huge output enum ^^


let req = r#"[
{"jsonrpc": "2.0", "method": "notif", "params": [1,2,4]},
{"jsonrpc": "2.0", "method": "notif", "params": [7]}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this probably because Id is regarded a None not sure, so a typed Id could fix this

Copy link
Member

@niklasad1 niklasad1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm favor of ditching serde_json::value::RawValue this sounds to hard to do properly without untagged enums. We could open a tracking issue to that serde issue and migrate when it's fixed but I guess it's not anytime soon.

//cc @maciejhirsz any ideas or opinions ^

dvdplm and others added 2 commits April 30, 2021 08:03
Co-authored-by: Andrew Plaza <aplaza@liquidthink.net>
Co-authored-by: Andrew Plaza <aplaza@liquidthink.net>
@dvdplm dvdplm marked this pull request as ready for review April 30, 2021 06:15
@dvdplm
Copy link
Contributor Author

dvdplm commented Apr 30, 2021

sending a batch of notifications should receive the empty string, "", in response. Instead we return {"jsonrpc":"2.0","result":"","id":null} for each notification

Technically, we don't support notifications so those are registered as ordinary method calls and treated as that. So we would need have register_notification method that handles them differently to work according to the spec but the response looks wrong anyway.

Yeah, I think it's fine to be non-spec compliant for notifications. We explicitly do not support them, so as long as we're not doing anything completely bonkers I think it's ok. I don't know what the purpose of notifications is but if I were to guess it's sort of a performance optimization? Either way: are you ok keeping the tests and comments around as support for any future dev that works on adding notification support?

sending an invalid batch with numbers, e.g. [123] we should return an Invalid Request error (which we do) with the id set to null (which we don't; somehow serde parses the 123 into the id and I don't think it's worth the hassle to fix that)

Can you elaborate what you mean by that? Is it "id":[123]?

No! It's … "id": 123}, so serde actually parses [123].

Maybe we should have typed Id similar to what we have in the client.

The id is typed in the server, Id::Num(…)/Id::Null/Id::Str(…) – or what do you mean?

Right, that's sounds very hard to do properly without an nested enum. Maybe enough to motivation to switch back to huge output enum ^^

I'd need to understand the tradeoffs better tbh. Not in this PR though!

@niklasad1
Copy link
Member

The id is typed in the server, Id::Num(…)/Id::Null/Id::Str(…) – or what do you mean?

I mean in the v2::response::JsonRpcResponse, either to change to have the concrete type Id or manually check it/parse

benches/bench.rs Outdated Show resolved Hide resolved
};
send_error(id, &tx, code.into());
}
rx.close();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this is performed to mark to that we are done sending any further messages?!
Such that we the receiver will receive eventually Ok(None) when all messages has been sent?

I think it deserves a comment because it was not straightforward to me at least

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly, I read this in the docs:

Closes the receiving half of a channel without dropping it.
This prevents any further messages from being sent on the channel while still enabling the receiver to drain messages that are buffered.

I don't know if it's necessary to close the channel in this case but thought maybe it's a "best practice" to do so and if we ever decide to execute batch requests on separate tasks (and threads) with deadlines it's good to make sure the channel can't be written to. I'll add a comment.

Copy link
Member

@niklasad1 niklasad1 Apr 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, I think so because otherwise you have to drop the sender to close the channel. Because we have control of over both the sender and receiver (the sender is just borrowed by the call closure)

Copy link
Member

@niklasad1 niklasad1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks alright, most of the issues that appeared in this PR is related #276

but I have some additional questions ^^

Add more comments
Factor out batch response collection
@dvdplm dvdplm requested a review from maciejhirsz April 30, 2021 15:39
Comment on lines 170 to 177
if let Err(err) = (method)(id, params, &tx, 0) {
log::error!(
"execution of method call {} failed: {:?}, request id={:?}",
method_name,
err,
id
);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@niklasad1 What does the connected client see in this case, i.e. when the method they're calling fails? Shouldn't we be sending back the error here in addition to logging it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that what you're doing in #295 perhaps?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, #295 should fix that but it doesn't log it FWIW

@dvdplm dvdplm requested a review from insipx May 3, 2021 12:47
@insipx
Copy link
Contributor

insipx commented May 3, 2021

I'm favor of ditching serde_json::value::RawValue this sounds to hard to do properly without untagged enums. We could open a tracking issue to that serde issue and migrate when it's fixed but I guess it's not anytime soon.

Yeah, I find it interesting that the serde attempt to solve this is sortof blocked on Associated type defaults being included in rust, but there's no knowing if that will be done soon. Maybe worth revisiting anyway since it's been ~2 years?

@dvdplm
Copy link
Contributor Author

dvdplm commented May 4, 2021

blocked on Associated type defaults being included in rust, but there's no knowing if that will be done soon. Maybe worth revisiting anyway since it's been ~2 years?

TIL this was merged in March, so maybe there's hope?

@dvdplm dvdplm merged commit 2cae10b into master May 4, 2021
@dvdplm dvdplm deleted the dp-batch-requests branch May 4, 2021 13:08
@dvdplm dvdplm mentioned this pull request May 5, 2021
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants