This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Feedback - Pest isn't elegant #489

Closed
ColonelThirtyTwo opened this issue Jan 28, 2021 · 2 comments

Comments

@ColonelThirtyTwo

This isn't a bug per se, more of my thoughts on pest after trying to use it. I apologize if it seems like a rant.

TL;DR: instead of generating data structures for a proper AST, pest gives you a very generic "hierarchy of pairs" interface that is neither convenient to work with nor type safe.

Pest IMO is not elegant to use. While specifying the language via the pest grammar is fine, the architecture of the returned result - the Pairs and Rules - is overly simplistic and requires a lot of boilerplate, because the "constraints" specified in the grammar are not reflected in its generic API.

For example, consider the rule expr_plus = {"+" ~ integer ~ integer}. By the grammar definition, any text matching this rule is going to produce exactly two sub-pairs, both with the integer rule. However, that information isn't available on the Rust side. Instead, the code gets a Pair<Rule>, which can contain zero or more inner pairs that you now have to pull out of an iterator manually.

You may say that's not too bad - just get the iterator and call .next().unwrap() twice... but now you are making an unchecked assumption that expr_plus will always produce two integer pairs. And unchecked assumptions mean bugs down the line. Consider what happens if I change the rule to expr_plus = {"+" ~ (integer)*} or expr_plus = {"+" ~ int_or_float ~ int_or_float}: in both cases the existing Rust code that handles the rule still compiles, because the interface has not changed, but the unchecked assumptions are now violated, either because expr_plus now has more or fewer inner pairs than the code expects or because the pairs are no longer only integers.
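To make the failure mode concrete, here is a minimal sketch that uses only the standard library. The Rule enum and Pair struct below are simplified stand-ins I made up for illustration; they are not pest's real types, and the function names are hypothetical too. The point is only that both functions take the same untyped tree, so nothing in their signatures changes when the grammar does.

```rust
// Simplified stand-ins for a generated `Rule` enum and a pair tree.
// These are hypothetical types for illustration, NOT pest's actual API.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Rule {
    ExprPlus,
    Integer,
    Float,
}

#[derive(Debug, Clone)]
pub struct Pair {
    pub rule: Rule,
    pub text: String,
    pub inner: Vec<Pair>,
}

impl Pair {
    /// Builds a pair with no children.
    pub fn leaf(rule: Rule, text: &str) -> Pair {
        Pair { rule, text: text.to_string(), inner: Vec::new() }
    }
}

/// Unchecked version: silently assumes exactly two integer children.
/// It keeps compiling if the grammar changes, and panics at runtime instead.
pub fn eval_unchecked(pair: &Pair) -> i64 {
    let mut inner = pair.inner.iter();
    let a = inner.next().unwrap().text.parse::<i64>().unwrap();
    let b = inner.next().unwrap().text.parse::<i64>().unwrap();
    a + b
}

/// Checked version: the same assumptions, re-validated by hand at runtime.
/// This is the boilerplate that the grammar already implies.
pub fn eval_checked(pair: &Pair) -> Result<i64, String> {
    if pair.rule != Rule::ExprPlus {
        return Err(format!("expected ExprPlus, got {:?}", pair.rule));
    }
    match pair.inner.as_slice() {
        [x, y] if x.rule == Rule::Integer && y.rule == Rule::Integer => {
            let a = x.text.parse::<i64>().map_err(|e| format!("bad integer: {}", e))?;
            let b = y.text.parse::<i64>().map_err(|e| format!("bad integer: {}", e))?;
            Ok(a + b)
        }
        other => Err(format!(
            "expected exactly two Integer pairs, got {} inner pairs",
            other.len()
        )),
    }
}
```

Either way, the mismatch between grammar and code only shows up when the code runs: the unchecked version panics, and the checked version returns an error that had to be written by hand.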

In my view, a truly "elegant" parser would emit Rust data structures based on the rules given in the grammar. For instance, ExprPlus = {"+" ~ first:Integer ~ second:Integer} would generate a struct ExprPlus { pub first: Integer, pub second: Integer }. Not only is this more convenient to work with, as the Rust code can simply access the fields (or better yet, destructure the struct), but changing the grammar would also change the generated Rust structures, causing appropriate compiler warnings and errors based on the new grammar.
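As a rough sketch of what I mean, the code below hand-writes the kind of output such a generator might produce. Everything here is hypothetical: Integer, ExprPlus, parse_expr_plus and eval are names I invented, and the tiny whitespace-splitting parser is only a stand-in for generated parsing code. It is not how pest or any existing crate works.

```rust
// Hand-written stand-ins for what a typed generator might emit for
// `ExprPlus = {"+" ~ first:Integer ~ second:Integer}` (hypothetical output).
#[derive(Debug, Clone, PartialEq)]
pub struct Integer(pub i64);

#[derive(Debug, Clone, PartialEq)]
pub struct ExprPlus {
    pub first: Integer,
    pub second: Integer,
}

/// Tiny hand-rolled parser standing in for the generated one.
/// All shape checks happen here, once, instead of in every consumer.
pub fn parse_expr_plus(input: &str) -> Result<ExprPlus, String> {
    let rest = input
        .trim_start()
        .strip_prefix('+')
        .ok_or("expected '+'")?;
    let mut tokens = rest.split_whitespace();

    // Reads the next token and parses it as an integer field.
    let mut next_int = |name: &str| -> Result<Integer, String> {
        let tok = tokens
            .next()
            .ok_or_else(|| format!("missing `{}`", name))?;
        tok.parse::<i64>()
            .map(Integer)
            .map_err(|e| format!("bad `{}`: {}", name, e))
    };
    let first = next_int("first")?;
    let second = next_int("second")?;

    if tokens.next().is_some() {
        return Err("unexpected trailing input".to_string());
    }
    Ok(ExprPlus { first, second })
}

/// Consumer code: no iterators, no unwraps, just destructuring.
pub fn eval(expr: &ExprPlus) -> i64 {
    let ExprPlus { first: Integer(a), second: Integer(b) } = expr;
    a + b
}
```

If the grammar changed so that ExprPlus held a list of integers, or an int-or-float type, the struct definition would change with it and eval would stop compiling until it was updated, which is exactly the feedback I want from the compiler.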

I hope that my feedback motivates the devs to change things. I do like specifying my grammars in a specific PEG DSL, unlike in say nom, which is excessively verbose and requires handling comments and whitespace manually. But the pain of using the Pairs output that pest provides is too great for me to use this project.

@ColonelThirtyTwo
Author

After writing this, I discovered the pest_consume crate, which is... a step in the right direction at least.

@dragostis
Contributor

@ColonelThirtyTwo, thank you for taking the time to write this. This is exactly what I have in mind for pest3. Unfortunately, I've been quite busy lately and haven't been able to put more time into making it a reality. However, I have a clear path forward if anyone is interested in contributing.
