Implement request parsing #452
Conversation
So far I only looked at half the first commit and I feel like something will be wrong. Could you add a test for xproto's

```c
typedef struct xcb_change_save_set_request_t {
    uint8_t      major_opcode;
    uint8_t      mode;
    uint16_t     length;
    xcb_window_t window;
} xcb_change_save_set_request_t;
```

I feel like your code might mis-handle this, because with your abstraction, the … Am I wrong? (Also, clippy is complaining about one thing)
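For illustration, here is a hedged sketch of the kind of test being asked for. The `RequestHeader` fields, the `try_parse_request` signature, and the concrete values are assumptions modelled on the tests that appear later in this PR:

```rust
#[test]
fn test_change_save_set() {
    use x11rb::protocol::xproto::ChangeSaveSetRequest;
    // ChangeSaveSet is interesting because its `mode` field occupies the
    // header byte that extension requests use for their minor opcode.
    let header = RequestHeader {
        major_opcode: 6,     // ChangeSaveSet
        minor_opcode: 1,     // actually the `mode` field (1 = Delete)
        remaining_length: 1, // one 4-byte unit: the window field
    };
    let body = [0x78, 0x56, 0x34, 0x12]; // window id (little-endian here)
    let r = ChangeSaveSetRequest::try_parse_request(header, &body).unwrap();
    assert_eq!(r.window, 0x12345678);
    // r.mode should have come from header.minor_opcode, not from the body.
}
```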
Sorry for my many complaints. "Big ones" should be: Can we remove `RequestHeader` and the checking that the input slice is exhausted at the end of parsing?
```rust
} else {
    outln!(out, "let {} = remaining.to_vec();", rust_field_name);
    outln!(out, "let remaining = &remaining[remaining.len()..];");
    // Only force taking ownership for non-request types.
```
(unrelated to this PR) What would it take to make "normal struct parsing" also be able to borrow stuff from the input? The hand-written API moved away from this being possible, but how close is the code generator?
I'd have to look at some of the generated code but I don't think it would be particularly difficult at this point.
```rust
#[test]
fn test_randr_get_output_property() {
    use x11rb::protocol::randr::GetOutputPropertyRequest;
    let header = RequestHeader {
```
(Comment is at a random position.) Why do we need `RequestHeader`? I think that it might be possible to do without it. The parsing code could just parse the request "normally" without skipping any fields. Opcodes can then be compared against expected values. Things that are not needed can be sent to `let _ =`. `RequestHeader` does not bring anything to the parsing code, does it?
Edit: Well... it hides `BigRequests` support somewhat. How bad would it be to put `BigRequests` support into the generated code "everywhere"? We already do that for request serialisation... kind of but not really (it uses a request size of zero and expects something else to fix things up and add a second request size).
For parsing, this would however be relatively straightforward:

```rust
let (short_length, mut remaining) = u16::try_parse(remaining)?;
let length = if short_length == 0 {
    // BigRequests
    let (long_length, new_remaining) = u32::try_parse(remaining)?;
    remaining = new_remaining;
    long_length
} else {
    short_length.into()
};
```
Well... okay, that is surprisingly long. Perhaps a helper function in `x11_utils` for parsing the request length field?
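One possible shape for such a helper; the name and its placement in `x11_utils` are suggestions, while `TryParse` and `ParseError` are the existing x11rb types:

```rust
use x11rb::errors::ParseError;
use x11rb::x11_utils::TryParse;

/// Parse an X11 request length field, transparently handling the
/// BigRequests encoding (a 16-bit length of 0 followed by a 32-bit length).
fn parse_request_length(remaining: &[u8]) -> Result<(u32, &[u8]), ParseError> {
    let (short_length, remaining) = u16::try_parse(remaining)?;
    if short_length == 0 {
        // BigRequests: the real length follows as a u32.
        u32::try_parse(remaining)
    } else {
        Ok((short_length.into(), remaining))
    }
}
```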
The main reason I did it this way is that parsing requests is not stateless. You need to know the state of the connection with the server (in particular, whether or not `BigRequests` is enabled and what extensions are enabled at which major opcodes) in order to correctly parse requests. We could decide that we don't care about tracking whether `BigRequests` is enabled and simply always accept that format. However, even if we did that, someone outside of a `FooRequest::try_parse` implementation is always going to have to look at the first two bytes of the request to determine which parsing implementation to feed the buffer to (and for any opcodes above 127 they need to have been parsing the connection all along to know which extension opcodes exist). So `RequestHeader` is currently all the stuff that requires external state to process.

I'm not necessarily opposed to changing it, but that is the motivation for the current design.
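For reference, a sketch of what such a header might group together, i.e. everything that needs connection state to interpret; this is an assumption based on the discussion and the tests below, not necessarily the PR's exact definition:

```rust
pub struct RequestHeader {
    /// Core opcode, or the extension's runtime-assigned major opcode.
    pub major_opcode: u8,
    /// The request within an extension; for core requests this byte
    /// instead carries a request field (e.g. ChangeSaveSet's `mode`).
    pub minor_opcode: u8,
    /// Length of the remaining body in 4-byte units, with any
    /// BigRequests handling already applied.
    pub remaining_length: u32,
}
```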
This is kind of similar to `x11rb::protocol::Event` and `Error`. That code also needs to look up extensions based on error code / response type / major opcode. I guess we should also do something similar for request parsing (but that can be done in another PR). This code uses some ad-hoc functions at the end of the file to extract the necessary info from the `&[u8]`.

A possible API for `x11rb::protocol::Request` would then be `fn parse(request: &[u8], ext_info_provider: &dyn ExtInfoProvider) -> Result<Self, ParseError>` (this is just the existing API for `Event`/`Error`).

Your version would be `-> Result<(RequestHeader, Self), ParseError>`, right? However... does the request header provide anything useful here? The request is already identified by the variant of the enum `Self` that is being returned. The request length is known to the caller as the `.len()` of the provided slice. The difference between major/minor opcode does not really matter, I think.

What kind of API do you have in mind for this? Or did you plan to hand-write the necessary bits outside of x11rb?
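As a hedged sketch of what that Event/Error-style lookup could look like for requests (the function is hypothetical and `get_from_major_opcode` is modelled on the existing `ExtInfoProvider` trait; the actual dispatch is elided):

```rust
use x11rb::x11_utils::ExtInfoProvider;

/// Identify which parser a raw request buffer should be fed to.
/// Returns None when the buffer is too short or the opcode is unknown.
fn identify_request(
    request: &[u8],
    ext_info_provider: &dyn ExtInfoProvider,
) -> Option<String> {
    let major_opcode = *request.first()?;
    if major_opcode >= 128 {
        // Extension request: map the runtime-assigned major opcode back
        // to the extension, then dispatch on the minor opcode (byte 2).
        let (ext_name, _info) = ext_info_provider.get_from_major_opcode(major_opcode)?;
        let minor_opcode = *request.get(1)?;
        Some(format!("{} request, minor opcode {}", ext_name, minor_opcode))
    } else {
        // Core request: the major opcode alone identifies the request.
        Some(format!("core request, opcode {}", major_opcode))
    }
}
```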
> in particular whether or not BigRequests is enabled

Well... a big request is an illegal request when `BigRequests` is disabled, since a length of 0 would mean that there is not even enough space for the length field... So, personally, I don't mind assuming that `BigRequests` is always enabled. Technically, this could also check the maximum request length, but this seems to be the wrong place to do that (since the request was already received at this point and is in memory; such handling should be done before everything is read).
I also now looked at the X11 server and actually... there are some requests where it requires an exact length and some where "trailing garbage" is okay. There are several macros for this in `include/dix.h` (the number after each macro is the result of `git grep MACRO | wc -l`):

- `REQUEST_SIZE_MATCH`: 860
- `REQUEST_AT_LEAST_SIZE`: 346
- `REQUEST_AT_LEAST_EXTRA_SIZE`: 3
- `REQUEST_FIXED_SIZE`: 107

For some reason, `ProcNoOperation` uses `REQUEST_AT_LEAST_SIZE` and does not do an exact match. Seems like: it's complicated.
Shorter way to parse the length field:

```rust
let (length, remaining) = match u16::try_parse(remaining)? {
    // BigRequests?
    (0, remaining) => u32::try_parse(remaining)?,
    (n, remaining) => (n.into(), remaining),
};
```
Something else I realized is that, regardless of our decision around `BigRequests`, whoever is driving the parsing has to parse the length themselves anyway, to know whether they have received enough bytes before calling the parsing method. Since `ParseError` doesn't distinguish between various kinds of errors, "insufficient bytes" is otherwise indistinguishable from other parsing errors.
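A hypothetical sketch of that driver-side completeness check; the byte order is assumed to be the client's native order, and error reporting is reduced to `Option`:

```rust
/// Returns Some(total_request_bytes) once `buffer` holds a complete
/// request, or None if more bytes are still needed.
fn request_is_complete(buffer: &[u8]) -> Option<usize> {
    // Every request starts with a 4-byte fixed part; bytes 2..4 hold the
    // request length in 4-byte units.
    if buffer.len() < 4 {
        return None;
    }
    let short_len = u16::from_ne_bytes([buffer[2], buffer[3]]) as usize;
    let total = if short_len == 0 {
        // BigRequests: the real length is a u32 inserted after byte 4,
        // and it counts the extended length field itself.
        if buffer.len() < 8 {
            return None;
        }
        u32::from_ne_bytes([buffer[4], buffer[5], buffer[6], buffer[7]]) as usize * 4
    } else {
        short_len * 4
    };
    if buffer.len() >= total {
        Some(total)
    } else {
        None
    }
}
```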
Yup. That's also how the existing code in e.g. `RustConnection` parses errors/events/replies.

Once upon a time, the only possible parse error was "not enough data". Since then, a lot has happened. I wouldn't mind improving `ParseError`, but so far I do not really see a good use case for that. Our parsing errors would most likely cause memory bugs in anything with XCB, so apparently all of this "never happens" anyway...
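If it ever were improved, one hypothetical shape, with variant names made up for illustration, might be:

```rust
/// A stand-in for a more fine-grained ParseError; these variants are
/// illustrative, not x11rb's actual ones.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ParseError {
    /// The input ended before the structure was fully parsed.
    InsufficientData,
    /// A parsed value could not be converted (e.g. an invalid enum value).
    ConversionFailed,
    /// A length or size field contradicts the data that is present.
    InvalidValue,
}
```

With something like this, a driver could retry on `InsufficientData` and treat everything else as a hard error.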
```rust
impl From<std::convert::Infallible> for ParseError {
    fn from(_: std::convert::Infallible) -> Self {
        unreachable!()
    }
}
```
Where is it actually used?
Good question. A quick test build says "this is unused".
It's used in xinput. It could probably be eliminated with some more special casing in the code generator.
I forgot to actually enable the necessary features in my test build. 🤦
New try says:
```text
error[E0271]: type mismatch resolving `<u8 as std::convert::TryFrom<u8>>::Error == errors::ParseError`
    --> src/protocol/xinput.rs:8798:29
     |
8798 |     let format = format.try_into()?;
     |                         ^^^^^^^^ expected enum `std::convert::Infallible`, found enum `errors::ParseError`

error[E0271]: type mismatch resolving `<u8 as std::convert::TryFrom<u8>>::Error == errors::ParseError`
     --> src/protocol/xinput.rs:13546:29
      |
13546 |     let format = format.try_into()?;
      |                         ^^^^^^^^ expected enum `std::convert::Infallible`, found enum `errors::ParseError`

error: aborting due to 2 previous errors
```
This is inside `ChangeDevicePropertyRequest` and `XIChangePropertyRequest`. In both cases, a `format` `u8` is converted to `u8`.
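For context, a minimal self-contained illustration of why the `From<Infallible>` impl is needed at all; the `ParseError` here is a local stand-in, not x11rb's type:

```rust
use std::convert::{Infallible, TryInto};

#[derive(Debug)]
struct ParseError;

// The impl under discussion: `?` on a Result<_, Infallible> needs a way
// to convert the (impossible) error into the function's error type.
impl From<Infallible> for ParseError {
    fn from(_: Infallible) -> Self {
        unreachable!() // Infallible has no values
    }
}

fn parse_format(format: u8) -> Result<u8, ParseError> {
    // The blanket impl gives `u8: TryFrom<u8>` with Error = Infallible,
    // so without the From impl above this `?` does not compile.
    let format: u8 = format.try_into()?;
    Ok(format)
}
```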
Yeah, we could probably try harder to avoid that `u8` -> `u8` conversion if we wanted to.
I believe that this `try_into` comes from `emit_field_post_parse`, which would normally convert it into an enum. But here it is "converted" into a `u8` because it is a deducible field.

In `emit_struct_type`, this is avoided with:

```rust
if !field
    .name()
    .map(|field_name| deducible_fields.contains_key(field_name))
    .unwrap_or(false)
{
    self.emit_field_post_parse(field, out);
}
```
Ok, I think the biggest remaining piece is around request length and validation. I'm not familiar with the edge cases in X here. The X spec says …

Obviously, if you provide fewer bytes than necessary, parsing will fail when we run out of bytes. If you provide more bytes than necessary, what happens? E.g. if I send a …
A request cannot be directly parsed from a stream of bytes because it requires other context: the extension opcodes and whether or not the BigRequests extension is enabled. We'll parse that out first, and then call code that's stateless to parse the entire request.
This fixes all remaining warnings.
These are taken from some local X traffic.
…will panic if it's short. Thanks to validate_request_header this is largely a theoretical concern, but if we change that someday it'd be good not to crash.
```rust
add_ne!(body, 0x00000000u32);
body.push(0x01);
body.extend(&[0x00, 0x00, 0x00]); // Final padding.
let r = CreateWindowRequest::try_parse_request(header, &body).unwrap();
```
Does anyone have access to a big-endian machine, or a good idea for how we could test this on CI? The best idea I have so far would be `qemu`, and I bet that would be quite slow.

Alternatively, I am fine with just assuming this works. That approach has also worked so far. :-)
TL;DR: It's complicated and we cannot know in general how individual requests behave.

GetInputFocus is one of the requests where the Xorg server uses `REQUEST_SIZE_MATCH`. The macro checks that the request size matches the expected value exactly, and if not, it returns a `BadLength` error. The calling code will however still consume the full 26*4 bytes that you claim your request has. I do not have a good code place to point at here, but it's this function, which gets the request length out of the buffer of data that was already read: https://gitlab.freedesktop.org/xorg/xserver/-/blob/master/dix/dispatch.c#L465-500

As an example of a request that allows trailing data (as already mentioned above): https://gitlab.freedesktop.org/xorg/xserver/-/blob/master/dix/dispatch.c#L3405-3412
Ok, that's rather unfortunate that X is not consistent here.
…est_header and check_exhausted functions.
Ok, I think this is ready for another pass. The main issue you brought up that I haven't changed at all is …

The opcodes and the request length all have to be extracted before calling a given … I do plan to generate a …
👍 from me
Excellent, thanks!
I'd suggest reviewing the individual commits rather than the full diff that GitHub wants you to look at. The division is fairly logical, although the "Implement parsing for request fields" changeset ends up being a giant blob. There are tests for a few simple requests showing that this works at least some of the time.