Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Rust] Support mutation #164

Open
Zk2u opened this issue Jan 14, 2023 · 6 comments
Open

[Rust] Support mutation #164

Zk2u opened this issue Jan 14, 2023 · 6 comments

Comments

@Zk2u
Copy link
Contributor

Zk2u commented Jan 14, 2023

Support mutating a deserialised object as seen in google/flatbuffers#5772

Maybe add setters to [x]Ref to set fields?

@TethysSvensson
Copy link
Collaborator

It wouldn't be quite as simple. All deserialized objects currently contain a &[u8] slice inside them. To support mutability we would have to change that to either &mut [u8] or &[Cell<u8>] -- but if we do that, then users would no longer be able to deserialize immutable slices.

I think a better approach would be to add additional [x]Mut types, which support mutability. We can then make a decision about whether a &mut [u8] or &[Cell<u8>] would be better. I &mut [u8] is potentially faster, but is also less ergonomic.

One compromise could be to make the reference types generic so they could work with multiple slice types.

@TethysSvensson
Copy link
Collaborator

Another option would be to improve the Builder to support loading in pre-serialized objects and then mutate at them. I'm thinking we could extend it with three features that combined gives you what you are asking for (and more):

  • Allow deserializing a partial message inside a non-finished Builder. When you use this API you would not get an [x]Ref<'a> out, but rather an Offset<[x]> without a lifetime.
  • Allow poking at the primitives in the builder using the same API. This would work, since you already have a &mut Builder.
  • Allow memcpying an existing message into a builder while assert that it has a certain type.

All of these functions should suitably marked as being be sound (as in no segfaults), but not safe (as in it will allow invalid messages and/or wrong interpretation of messages).

@TethysSvensson
Copy link
Collaborator

As a work-around, perhaps something like this could be of use?

table Object {
  counter: uint32;
}
use object::{Object, ObjectRef};
use planus::ReadAsRoot;

#[path = "object_generated.rs"]
mod object;

fn mutate_object(input: &[u8]) -> planus::Result<Vec<u8>> {
    let object_ref: ObjectRef<'_> = ObjectRef::read_as_root(input)?;
    let mut object: Object = object_ref.try_into()?;
    object.counter += 1;

    let mut builder = planus::Builder::new();
    let finished = builder.finish(object, None);
    Ok(finished.to_vec())
}

Note however [x]Ref<'_> can have a lot of object reuse, while there is no such reuse for [x]. This means that the generated output might be bigger than the input. For some schemas it might even be exponentially bigger if the input is constructed maliciously. This would also mean exponential runtime and exponential memory use.

@Zk2u
Copy link
Contributor Author

Zk2u commented Jan 18, 2023

As a work-around, perhaps something like this could be of use?

Yep, this is what I'm currently doing. It's not as optimal as a mutation API, but it works currently.

@Zk2u
Copy link
Contributor Author

Zk2u commented Jan 19, 2023

All of these functions should suitably marked as being be sound (as in no segfaults), but not safe (as in it will allow invalid messages and/or wrong interpretation of messages).

  • Allow deserializing a partial message inside a non-finished Builder. When you use this API you would not get an [x]Ref<'a> out, but rather an Offset<[x]> without a lifetime.
  • Allow poking at the primitives in the builder using the same API. This would work, since you already have a &mut Builder.
  • Allow memcpying an existing message into a builder while assert that it has a certain type.
    Is there any way we can have an option to catch errors during this? This would be O(n) to check through, but in some cases that's acceptable.

Perhaps a better way to mutate data in a client-server architecture (where buffers from the client are untrusted) is to send over the field modifications from the client instead. This would make updates to large buffers a lot faster as compared to a O(n) validation of the new buffer. Mutations would then only happen in a trusted environment where safety isn't so much of an issue. Would need to write this "diff" format as another FBS schema. You could then load part or the whole buffer that's being modified and mutate the fields without changing or deserialising anything else if not needed.

Could mutation work with non-scalar values if you load the whole message? Then inserting the new value into the offset and shifting the rest of the message to the right (perhaps updating any offsets as needed?).

Also, if none of this makes sense, please forgive me. I don't know much about the internals of FBs yet. :)

@Zk2u
Copy link
Contributor Author

Zk2u commented Jan 23, 2023

The above is very much usecase specific for us. Hopefully this is a better reply...

Another option would be to improve the Builder to support loading in pre-serialized objects and then mutate at them. I'm thinking we could extend it with three features that combined gives you what you are asking for...

Extending the builder would probably mean smaller code size too, as I don't think you'd need extra [x]Mut types in the generated code. What would the interface to this look like using the builder to mutate the types?

I'm guessing it would be a bit clumsier to use than something like the syntax used for creating messages currently but we want it to be developer friendly too.

All of these functions should suitably marked as being be sound (as in no segfaults), but not safe (as in it will allow invalid messages and/or wrong interpretation of messages).

Validating the message in an O(n) way when loading would mean using [x]Mut types as the generic builder code doesn't know about the specific message schemas.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants