Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adding support for shortest choice #28

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

leiyangyou
Copy link

Hi,

Hopefully this is useful for someone (A ShortestChoice variant of OrderedChoice)

@igordejanovic
Copy link
Member

Hi. Thanks for the contribution.
Can you give an example of why this kind of expression would be useful? What is your use-case?

if result is not None:
result = [result]
result_str = "".join([x.flat_str() for x in flatten(result)])
if (not shortest_result) or (len(shortest_result_str) > len(result_str)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be done more optimal. Flattening and converting to string of all resulting subtrees is slow.
It is enough to use position and track in which branch you have smaller advance.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will make a patch for this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make a unit test also if you can. I am trying to keep test coverage as high as possible. Thanks.

@leiyangyou
Copy link
Author

I had to use this while building a pretty complex grammar for tagging parts of a school name
(a PEG parser may not have been the best tool for this use case)

It's an equivalent of the non greedy version of regexp operations

say if you have grammar x and y
x matches 'ab'
y matches 'a'

given input 'ab', ShortestChoice([x, y]) will match 'a' while OrderedChoice([x, y]) will match 'ab'

In my case, I did not know upfront which of x y matches a shorter input, therefore something like ShortestChoice has to be used.

@igordejanovic
Copy link
Member

I see. I think that a similar variant LongestChoice would be event more useful and very similar in implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants