Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding serde support(?) #1

Closed
saona-raimundo opened this issue Jul 26, 2021 · 17 comments
Closed

Adding serde support(?) #1

saona-raimundo opened this issue Jul 26, 2021 · 17 comments
Assignees

Comments

@saona-raimundo
Copy link

Hi! Thank you for the crate!

I was wondering if you could add serde support, maybe through a feature flag.

I am using your crate and I am missing serde support for FlatEx.
I noticed that you use smallvec. It has serde support through a feature flag, so it should be easy.

Thanks beforehand!

@bertiqwerty
Copy link
Owner

bertiqwerty commented Aug 1, 2021

Hi, interesting idea. I have not yet used serde before and I am open to implementing serde support. Just for my understanding, what is the benefit of serializing an instance of FlatEx over serializing the expression-string and parsing it after deserialization again? Just that you do not need to parse again or is there more?

@saona-raimundo
Copy link
Author

what is the benefit of serializing an instance of FlatEx over serializing the expression-string and parsing it after deserialization again? Just that you do not need to parse again or is there more?

As far as I understand, there is nothing more.

In my case, as a workaround, I did exactly as you said: Together with the FlatEx, I saved the original string and implemented serialization saving that string. Upon deserialization, I would parse again.

It is not my case, but I would assume that, in distributed computing, when you pass serialized information around, re-parsing each time becomes a waste of time. This brings me to the trade-off you might want to think about.

The implementation of serialization of FlatEx might have the following two extremes of time vs space efficiency (I do not know the details, so you might know better if it applies):

  • Serializing the original string and parsing on deserialization is space-efficient(?), but not time efficient.
  • Serializing the struct directly might use a bit more space (?), but it is time-efficient.

@bertiqwerty
Copy link
Owner

Thanks for your feedback. I will try it out.

@bertiqwerty bertiqwerty added this to the v0.8.2 milestone Aug 2, 2021
@bertiqwerty bertiqwerty self-assigned this Aug 2, 2021
@bertiqwerty
Copy link
Owner

I experimented a little with serde and stumbled across the problem that function pointers cannot be serialized.
the trait bound `fn(T) -> T: Deserialize<'_>` is not satisfied
But the operators such in FlatEx are all function pointers. One probably needs to serialize the function pointers. There is a crate called serde_closure that might be helpful. I need to think about this. If you have a good idea how to implement this let me know or even feel free to create a pull request.

@bertiqwerty bertiqwerty removed this from the v0.8.2 milestone Aug 2, 2021
@bertiqwerty
Copy link
Owner

bertiqwerty commented Aug 2, 2021

Implementing custom serialization/deserialization for FlatEx without serializing closures can be done conceptually more or less easily as long as no custom operators are used but only the default operators.

@saona-raimundo
Copy link
Author

Thanks for looking into it!!
I just now noticed that you are using function pointers fn(T) -> T for operators. This brings up another question totally different:

Do you want to support closures that move values in the operators?

If yes, then one can not use function pointers anymore, I found this discussion particularly helpful.

To be more concrete, do you want to support the following? (based on exmex v0.8.2)

use exmex::{parse, BinOp, Operator};

let v = 2;

let ops = [Operator {
    repr: "*v",
    bin_op: Some(BinOp {
        apply: move |a: i32, b: i32| a * b * v,
        prio: 1,
    }),
    unary_op: None,
}];
let to_be_parsed = "1 *v 1";
let expr = parse::<i32>(to_be_parsed, &ops)?;
assert_eq!(expr.eval(&[1])?, 2);

In any way, when people want to store functions or closures, what I see is something like Vc<Rc<dyn Fn(T) -> T>>. This would accept closures that move values. But this is not a solution (I think), because serializing a reference Rc does not save the function(?). That should be one of the reasons serde_closure exists.

Changing all function pointers for serde_closure::structs::Fn feels a bit excessive (and I do not the performance drawbacks it would have).

By the way, can one reconstruct the original expression from FlatEx? If so, serializing the original expression and parsing upon deserialization does not feel like a bad idea anymore (and you could add this unparse method on the way :) )

@bertiqwerty
Copy link
Owner

Thanks for your feedback. Interesting question. I like the simplicity of function pointers. If someone can make a strong case for closures that capture stuff from their surroundings, I will re-consider them.

Regarding serialization, I currently think in the direction of unparsing and re-parsing. FlatEx currently does not keep track of the used operator and variable representations in the string that was parsed. It only keeps function pointers and variable indices. I could use arbitrary representations. On the other hand, knowing the representation would be helpful for debugging.

@saona-raimundo
Copy link
Author

If someone can make a strong case for closures that capture stuff from their surroundings, I will re-consider them.

Sure, sounds fair :)

I currently think in the direction of unparsing and re-parsing.

Just to be sure, how is the re-parsing done? I am thinking of the operators needed to parse the expression, as they are not in the expression, how is one going to parse it?

If it is not possible, one might go for the following:

  • Implement Serialization for default operators.
  • Through trait bounds, implement serialization for FlatEx
  • Then, for user-supplied operators, they will have to implement serialization to serialize FlatEx... which I think is fair

@bertiqwerty
Copy link
Owner

bertiqwerty commented Aug 3, 2021

Yes. For re-parsing, one needs the operators.

  • One could simply store a reference to the input string and reparse it on deserialization with default operators and default literals.
    Not sure, how much value this would add.
  • One could also serialize default operators by their string-representations. On the other hand, parsing a not too long string just takes something like 10 μs. Not sure how beneficial this could be.

All operators, custom or default, have the same type. Not sure how one could implement different serializations.

@saona-raimundo
Copy link
Author

Okay! I think we came to the conclusion that, to add serde support, one would need to change all function pointers for serde_closure::structs::Fn.

With this, one could serialize the parsed expression together with operators (i.e. FlatEx), and simply restore up deserialization (no need to re-parse).

@bertiqwerty
Copy link
Owner

Do you have a use-case were re-parsing is too slow? Or can you think of one?

@saona-raimundo
Copy link
Author

I do not think it is a "common" case, but this is the closest to something useful I can think of. In short, "distributed recursive parsing", or "distributed tree construction".

Say you have a really big expression to parse. Then, you would like to do the following:

  • Find quickly a binary operator with high priority (while not parsing the whole input)
  • Send each sub-inputs to two workers to parse (you send only the substring(+operators))
  • Repeat until workers receive an expression without binary operators
  • The last workers parse the expression and send it back to their parent (here, FlatEx would be serialized)
  • The parent puts together the expressions from their childs (two deserializations and one serialization)
  • The root gives back the whole expression parsed

@bertiqwerty
Copy link
Owner

Interesting use case.

For the moment I have implemented just unparse and used it to implement Display, since I need this for debugging while developing new features anyway.

@saona-raimundo
Copy link
Author

Nice! Thank you!

@bertiqwerty bertiqwerty added this to the v0.9.0 milestone Aug 10, 2021
@bertiqwerty bertiqwerty removed this from the v0.9.0 milestone Aug 18, 2021
@bertiqwerty
Copy link
Owner

bertiqwerty commented Aug 27, 2021

Basic parse/unparse-serde-support added with Commit fc35d50.

With the possibility to create partial derivatives which are again expression, expressions can be created differently than just parsing a string. This changed my mind to implement at least basic parse/unparse serde support for default operators.

@bertiqwerty
Copy link
Owner

Basic support in Release v0.9.2 (Commit 167f948).

@saona-raimundo
Copy link
Author

Thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants