Implement request parsing #452

khuey · 2020-05-27T01:10:58Z

I'd suggest reviewing the individual commits rather than the full diff that Github wants you to look at. The division is fairly logical, although the "Implement parsing for request fields" changeset ends up being a giant blob. There are tests for a few simple requests showing that this works at least some times.

psychon · 2020-05-27T06:08:02Z

So far I only looked at half the first commit and I feel like something will be wrong. Could you add a test for xproto's ChangeSaveSet (or alternatively just tell me that it will work). This request has two fields: mode: u8 and window: Window. On the wire, this ends up as:

typedef struct xcb_change_save_set_request_t {
    uint8_t      major_opcode;
    uint8_t      mode;
    uint16_t     length;
    xcb_window_t window;
} xcb_change_save_set_request_t;

I feel like your code might mis-handle this, because with your abstraction, the mode ends up as part of the header.

Am I wrong?

(Also, clippy is complaining about one thing)

src/x11_utils.rs

psychon

Sorry for my many complaints.

"Big ones" should be: Can we remove RequestHeader and the checking that the input slice is exhausted at the end of parsing?

src/x11_utils.rs

generator/src/generator/namespace.rs

psychon · 2020-05-27T06:26:04Z

generator/src/generator/namespace.rs

                    } else {
-                        outln!(out, "let {} = remaining.to_vec();", rust_field_name);
-                        outln!(out, "let remaining = &remaining[remaining.len()..];");
+                        // Only force taking ownership for non-request types.


(unrelated to this PR) What would it take to make "normal struct parsing" also be able to borrow stuff from the input? The hand-written API moved away from this being possible, but how close is the code generator?

I'd have to look at some of the generated code but I don't think it would be particularly difficult at this point.

tests/request_parsing_tests.rs

psychon · 2020-05-27T06:50:27Z

tests/request_parsing_tests.rs

+#[test]
+fn test_randr_get_output_property() {
+    use x11rb::protocol::randr::GetOutputPropertyRequest;
+    let header = RequestHeader {


(Comment is at a random position) Why do we need RequestHeader?

I think that it might be possible to do without. The parsing code could just parse the request "normally" without skipping any fields. Opcodes can then be compared against expected values. Things that are not needed can be sent to let _ =.

RequestHeader does not bring anything to the parsing code, does it?

Edit: Well... it hides BigRequests support somewhat. How bad would it be to put BigRequests support into the generated code "everywhere"? We already do that for request serialisation... kind of but not really (it uses a request size of zero and expects something else to fix things up and add a second request size).
For parsing, this would however be relatively straight forward:

let (short_length, mut remaining) = u16::try_parse(remaining)? let length = if short_length == 0 { // BigRequests let (long_length, new_remaining) = u32::try_parse(remaining)?; remaining = new_remaining; long_length } else { short_length.into() }

Well... okay, that is surprisingly long. Perhaps a helper function in x11_utils for parsing the request length field?

The main reason I did it this way is that parsing requests is not stateless. You need to know the state of the connection with the server (in particular whether or not BigRequests is enabled and what extensions are enabled at what major opcodes) in order to correctly parse requests. We could choose to decide that we don't care about tracking whether BigRequests is enabled and simply always accept that format. However, even if we did that, someone outside of a FooRequest::try_parse implementation is always going to have to look at the first two bytes of the request to determine which parsing implementation to feed the buffer to (and for any opcodes above 127 they need to have been parsing the connection all along to know which extension opcodes exist). So RequestHeader is currently all the stuff that requires external state to process.

I'm not necessarily opposed to changing it but that is the motivation for the current design.

This is kind of similar to x11rb::protocol::Event and Error. That code also needs to look up extensions based on error code / response type / major opcode. I guess we should also do something similar for request parsing (but that can be done in another PR).

This code uses some ad-hoc functions at the end of the file to extract the necessary info from the &[u8].

A possible API for x11rb::protocol::Request would then be fn parse(request: &[u8], ext_info_provider: &dyn ExtInfoProvider) -> Result<Self, ParseError> (this is just the existing API for Event/Error).
Your version would be -> Result<(RequestHeader, Self), ParseError>, right? However... does the request header provide anything useful here? The request is already identified by the variant of the enum Self that is being returned. The request length is known to the caller as .len() of the provided slice. The difference between major/minor opcode does not really matter, I think.

What kind of API do you have in mind for this? Or did you plan to hand-write the necessary bits outside of x11rb?

in particular whether or not BigRequests is enabled

Well... a big request is an illegal request when BigRequests is disabled, since a length of 0 would mean that there is not even enough space for the length field... So, personally I don't mind to assume that BigRequests is always enabled. Technically, this could also check the maximum request length, but this seems to be the wrong code to do that (since the request was already received at this point and is in-memory, such handling should be done before everything is read).

I also now looked at the X11 server and actually... there are some requests where it requires an exact length and sometimes where "trailing garbage" is okay, There are several macros for this in include/dix.h:

REQUEST_SIZE_MATCH, 860

REQUEST_AT_LEAST_SIZE, 346

REQUEST_AT_LEAST_EXTRA_SIZE, 3

REQUEST_FIXED_SIZE, 107

For some reason, ProcNoOperation uses REQUEST_AT_LEAST_SIZE and does not do an exact match. The number after the macro in the list above is the result of git grep MACRO | wc -l. Seems like: It's complicated

Shorter way to parse length fields:

let (length, remaining) = match u16::try_parse(remaining)? { // BigRequests? (0, remaining) => u32::try_parse(remaining), (n, remaining) => (n.into(), remaining), }

Something else I realized is that regardless of our decision around BigRequests whoever is driving parsing has to parse the length themselves anyways to know if they have received enough bytes before calling the parsing method. Since ParseError doesn't distinguish between various types of errors "insufficient bytes" is otherwise indistinguishable from other types of parsing errors.

Yup. That's also how the existing code in e.g. RustConnection parses errors/events/replies.

Once upon a time, the only possible parse error was "not enough data". Since then, a lot has happened. I wouldn't mind improving ParseError, but so far I do not really see a good use case for that. Our parsing errors most likely would cause memory bugs in anything with XCB, so apparently all of this "never happens" anyway...

src/protocol/composite.rs

tests/request_parsing_tests.rs

eduardosm · 2020-05-27T10:02:32Z

src/errors.rs

+impl From<std::convert::Infallible> for ParseError {
+    fn from(_: std::convert::Infallible) -> Self {
+        unreachable!()
+    }
+}


Where is it actually used?

Good question. A quick test build says "this is unused".

It's used in xinput. It could probably be eliminated with some more special casing in the code generator.

I forgot to actually enable the necessary features in my test build. 🤦
New try says:

error[E0271]: type mismatch resolving `<u8 as std::convert::TryFrom<u8>>::Error == errors::ParseError` --> src/protocol/xinput.rs:8798:29 | 8798 | let format = format.try_into()?; | ^^^^^^^^ expected enum `std::convert::Infallible`, found enum `errors::ParseError` error[E0271]: type mismatch resolving `<u8 as std::convert::TryFrom<u8>>::Error == errors::ParseError` --> src/protocol/xinput.rs:13546:29 | 13546 | let format = format.try_into()?; | ^^^^^^^^ expected enum `std::convert::Infallible`, found enum `errors::ParseError` error: aborting due to 2 previous errors

This is inside ChangeDevicePropertyRequest and XIChangePropertyRequest In both cases, a format u8 is converted to u8.

Yeah we could probably try harder to avoid that u8 -> u8 conversion if we wanted to.

I believe that this try_into comes from emit_field_post_parse, which would normally convert it into an enum. But here it is "converted" into an u8 because it is a deducible field.

In emit_struct_type, this is avoided with:

if !field .name() .map(|field_name| deducible_fields.contains_key(field_name)) .unwrap_or(false) { self.emit_field_post_parse(field, out); }

generator/src/generator/namespace.rs

khuey · 2020-05-27T18:28:54Z

Ok, I think the biggest remaining piece is around request length and validation. I'm not familiar with the edge cases in X here. The X spec says

The length field defines the total length of the request, including the header. The length field in a request must equal the minimum length required to contain the request. If the specified length is smaller or larger than the required length, an error is generated

Obviously if you provide fewer bytes than necessary parsing will fair when we run out of bytes. If you provide more bytes than necessary what happens? e.g. if I send a GetInputFocus (which has no body) with length == 26 and I send 100 bytes of padding with it will the X server accept that? Per the language of the spec it shouldn't, but I haven't actually tested it. Do you know the answer?

A request cannot be directly parsed from a stream of bytes because it requires other context: the extension opcodes and whether or not the BigRequests extension is enabled. We'll parse that out first, and then call code that's stateless to parse the entire request.

This fixes all remaining warnings.

These are taken from some local X traffic.

…will panic if it's short. Thanks to validate_request_header this is largely a theoretical concern, but if we change that someday it'd be good not to crash.

psychon · 2020-05-27T19:27:45Z

tests/request_parsing_tests.rs

+    add_ne!(body, 0x00000000u32);
+    body.push(0x01);
+    body.extend(&[0x00, 0x00, 0x00]); // Final padding.
+    let r = CreateWindowRequest::try_parse_request(header, &body).unwrap();


Does anyone have access to a big endian machine or a good idea on how we could test this on CI? The best idea that I have so far would be qemu and I bet that would be quite slow.

Alternatively, I am fine with just assuming this works. That approach also worked so far. :-)

psychon · 2020-05-27T19:43:41Z

e.g. if I send a GetInputFocus (which has no body) with length == 26 and I send 100 bytes of padding with it will the X server accept that? Per the language of the spec it shouldn't, but I haven't actually tested it. Do you know the answer?

TL;DR: It's complicated and we cannot know in general how individual requests behave.

GetInputFocus is one of the requests where the Xorg server uses REQUEST_SIZE_MATCH to check that sizeof(some struct) matches the size of the request.

https://gitlab.freedesktop.org/xorg/xserver/-/blob/4195e8035645007be313ade79032b8d561ceec6c/dix/events.c#L4847

The macro checks that the request size matches exactly the expected value and if not, it will return a BadLength error:

https://gitlab.freedesktop.org/xorg/xserver/-/blob/4195e8035645007be313ade79032b8d561ceec6c/include/dix.h#L69-71

The calling code will however still consume the full 26*4 bytes that you claim your request has. I do not have a code place to point at in here, but its this function:

https://gitlab.freedesktop.org/xorg/xserver/-/blob/4195e8035645007be313ade79032b8d561ceec6c/os/io.c#L227

This functions gets the request length out of the buffer of data that was already read.
The caller of this one then follows some function pointer to actually handle the request:

https://gitlab.freedesktop.org/xorg/xserver/-/blob/master/dix/dispatch.c#L465-500

As an example of a request that allows trailing data (as already mentioned above): ProcNoOperation uses REQUEST_AT_LEAST_SIZE, so allows extra data:

https://gitlab.freedesktop.org/xorg/xserver/-/blob/master/dix/dispatch.c#L3405-3412

khuey · 2020-05-27T19:51:57Z

Ok, that's rather unfortunate that X is not consistent here.

…est_header and check_exhausted functions.

khuey · 2020-05-27T22:51:46Z

Ok I think this is ready for another pass.

The main issue you brought up that I haven't changed at all is RequestHeader. The intended usage of this code is (pseudo code)

while (have_some_bytes) {
    let header = parse_request_header(bytes);
    if length_indicated_by_header < number_of_bytes_we_have {
        wait_for_more_bytes();
    }
    let request = match (major_opcode_from_header, minor_opcode_from_header) {
      ... => invoke the relevant parsers
    };
    do_something(request);
}

The opcodes and the request length all have to be extracted before calling a given FooRequest::try_parse_request. The major and minor opcodes are needed to choose the correct FooRequest, and the request length is needed to make sure that we have received enough bytes from the socket or whatever before calling into try_parse_request. Because "not enough data" is indistinguishable from other parsing problems with the current error reporting setup, we have to preclude that before attempting to parse to know that any errors we receive are "real". Given that somebody outside has to look at these fields anyways, I think the current design where they're parsed externally into the RequestHeader struct and the try_parse_request function received a copy of the struct plus the remaining input bytes makes sense.

I do plan to generate a Request enum that contains all the different request variants (matching those for Error and Event) and a function on that that takes the input and an ExtInfoProvider and parses out the header, selects the appropriate request variant, and invokes its try_parse_request function. But not in this PR.

psychon

👍 from me

khuey · 2020-05-28T13:43:44Z

Excellent, thanks!

psychon reviewed May 27, 2020

View reviewed changes

src/x11_utils.rs Show resolved Hide resolved

psychon requested changes May 27, 2020

View reviewed changes

psychon mentioned this pull request May 27, 2020

Implement randomised round-trip testing #453

Open

eduardosm reviewed May 27, 2020

View reviewed changes

tests/request_parsing_tests.rs Outdated Show resolved Hide resolved

eduardosm reviewed May 27, 2020

View reviewed changes

generator/src/generator/namespace.rs Outdated Show resolved Hide resolved

khuey force-pushed the request-parsing branch from 365949e to d7e0473 Compare May 27, 2020 18:22

khuey force-pushed the request-parsing branch from d7e0473 to fb71aba Compare May 27, 2020 18:33

khuey added 17 commits May 27, 2020 11:34

Filter out request fields already parsed into the header.

63308d9

Fix xproto::Gravity special casing to work with imports.

b065d62

Generate try_parse methods for Aux structs.

8540c4e

Add a "conversion" from Infallible to ParseError.

733a784

Implement parsing for request fields.

71b4237

Dispense with odd_length, which we don't care about at all.

20081e3

Check padding remaining after request parsing.

4007467

This fixes all remaining warnings.

Add a few tests for request parsing.

4bce7d6

These are taken from some local X traffic.

When checking the final padding don't index the slice directly which …

8892230

…will panic if it's short. Thanks to validate_request_header this is largely a theoretical concern, but if we change that someday it'd be good not to crash.

Make clippy happy.

93e38a8

Fix request parsing tests to do endian conversion.

6a7a2f5

Fix odd_length handling.

191e4f7

Silence clippy warning.

d4ecc46

Fix bad opcode test to be entirely correct except for a bad opcode.

52f0f6f

Consolidate list parsing generation code.

1e6fc09

Save a line on bool parsings.

6ae0b81

khuey force-pushed the request-parsing branch from fb71aba to 6ae0b81 Compare May 27, 2020 18:35

psychon reviewed May 27, 2020

View reviewed changes

khuey added 3 commits May 27, 2020 13:53

Stop doing request length validation and get rid of the validate_requ…

4dd8f67

…est_header and check_exhausted functions.

More clippy fixes.

e7e1f12

Tweak comment as requested.

26a6d34

khuey force-pushed the request-parsing branch from eaadd09 to 26a6d34 Compare May 27, 2020 21:29

Add a helper function for parsing request headers.

3906f0f

khuey requested a review from psychon May 27, 2020 22:52

psychon requested a review from eduardosm May 28, 2020 07:10

psychon reviewed May 28, 2020

View reviewed changes

eduardosm approved these changes May 28, 2020

View reviewed changes

mergify bot merged commit bd46246 into psychon:master May 28, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement request parsing #452

Implement request parsing #452

khuey commented May 27, 2020

psychon commented May 27, 2020

psychon left a comment

psychon May 27, 2020

khuey May 27, 2020

psychon May 27, 2020 •

edited

Loading

khuey May 27, 2020

psychon May 27, 2020

psychon May 27, 2020

psychon May 27, 2020

khuey May 27, 2020

psychon May 27, 2020

eduardosm May 27, 2020

psychon May 27, 2020

khuey May 27, 2020

psychon May 27, 2020

khuey May 27, 2020

eduardosm May 27, 2020

khuey commented May 27, 2020

psychon May 27, 2020

psychon commented May 27, 2020

khuey commented May 27, 2020

khuey commented May 27, 2020 •

edited

Loading

psychon left a comment

khuey commented May 28, 2020

Implement request parsing #452

Implement request parsing #452

Conversation

khuey commented May 27, 2020

psychon commented May 27, 2020

psychon left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

psychon May 27, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

khuey commented May 27, 2020

Choose a reason for hiding this comment

psychon commented May 27, 2020

khuey commented May 27, 2020

khuey commented May 27, 2020 • edited Loading

psychon left a comment

Choose a reason for hiding this comment

khuey commented May 28, 2020

psychon May 27, 2020 •

edited

Loading

khuey commented May 27, 2020 •

edited

Loading