Add support for regex quantifiers #121

Merged: 33 commits, Mar 31, 2023

Conversation

Collaborator

@eliotwrobson eliotwrobson commented Dec 21, 2022

Pull request for quantifiers, resolves #109

@eliotwrobson eliotwrobson changed the base branch from main to develop December 21, 2022 04:47
@coveralls

coveralls commented Dec 21, 2022


Coverage: 100.0%. Remained the same when pulling 2e96524 on eliotwrobson:quantifiers into f5ed05a on caleb531:develop.

@eliotwrobson
Collaborator Author

@caleb531 Not quite done yet, but an initial version of the quantifier syntax is working. The main obstacle I'm running into right now has to do with the way that the default input symbols are retrieved. The issue is that to create the factory function for the wildcard, the lexer needs to have all of the non-reserved symbols that appear in a given regex.

However, with the default method of retrieving the input symbols, numbers used in the quantifiers will get picked up even though they don't appear as literals in the given regex. Because of the difference between where the factory function is defined and called, this creates a potential sharp edge that may cause issues for some people. Do you have any thoughts on this? There are a couple of ways to get around that but I'm not quite sure which one would be best.
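To make the sharp edge concrete, here is a minimal sketch of the two ways the input symbols could be collected. It does not use the library's lexer: the `RESERVED` set and both function names are hypothetical, and the `{2,3}` quantifier notation is assumed for illustration.

```python
# Hypothetical reserved characters; the library's real set may differ.
RESERVED = set("()|*+?.{},")


def naive_symbols(regex):
    """Collect every non-reserved character, including quantifier digits."""
    return {ch for ch in regex if ch not in RESERVED}


def literal_symbols(regex):
    """Collect non-reserved characters, skipping anything inside {...}."""
    symbols = set()
    in_quantifier = False
    for ch in regex:
        if ch == "{":
            in_quantifier = True
        elif ch == "}":
            in_quantifier = False
        elif not in_quantifier and ch not in RESERVED:
            symbols.add(ch)
    return symbols


print(sorted(naive_symbols("ab{2,3}")))    # ['2', '3', 'a', 'b']
print(sorted(literal_symbols("ab{2,3}")))  # ['a', 'b']
```

With the naive scan, a wildcard built from these symbols would also match `2` and `3`, even though neither appears as a literal in the regex.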

automata/regex/parser.py (review thread: outdated, resolved)
automata/regex/regex.py (review thread: outdated, resolved)
@eliotwrobson
Collaborator Author

@caleb531 Looking at this again, how do you feel about changing the default input symbols for creating a regex to something very general, like all alphas + numbers? I think it removes the awkwardness of having the default regex-to-NFA conversion be defined only over the symbols that appear as literals. If not, I think there's a workaround for the wildcard creation issue that keeps the same default but makes the creation code a little messier.
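A rough sketch of what that default could look like, assuming "all alphas + numbers" means ASCII letters and digits. The names `DEFAULT_SYMBOLS`, `resolve_input_symbols`, and `make_wildcard_factory` are hypothetical and are not taken from the library's code.

```python
import string

# Assumed reading of "all alphas + numbers": ASCII letters and digits.
DEFAULT_SYMBOLS = frozenset(string.ascii_letters + string.digits)


def resolve_input_symbols(input_symbols=None):
    """Return the caller's alphabet, or the general default if none is given."""
    if input_symbols is None:
        return DEFAULT_SYMBOLS
    return frozenset(input_symbols)


def make_wildcard_factory(input_symbols=None):
    """Build a zero-argument factory returning the symbols a wildcard matches."""
    alphabet = resolve_input_symbols(input_symbols)
    return lambda: alphabet


default_wildcard = make_wildcard_factory()
small_wildcard = make_wildcard_factory({"a", "b"})
print(len(default_wildcard()))   # 62
print(sorted(small_wildcard()))  # ['a', 'b']
```

Because the alphabet is fixed before the factory is created, the wildcard no longer depends on which characters happen to appear in the regex, and callers who want a smaller alphabet can still pass one explicitly.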

@eliotwrobson
Collaborator Author

eliotwrobson commented Mar 29, 2023

@caleb531 Going to go ahead and mark this as ready for review now that I've had the chance to clean it up. The only controversial thing here is changing the default alphabet used for parsing regexes, but I think this actually makes things easier for someone who is just playing around with regexes. Performance and memory usage shouldn't be an issue, since people can still set the input symbols manually.

Haven't updated docs completely, will do once this gets closer to being merged.

@eliotwrobson eliotwrobson marked this pull request as ready for review March 29, 2023 06:42
@caleb531
Owner

> Haven't updated docs completely, will do once this gets closer to being merged.

@eliotwrobson Just reviewed this PR again—everything looks good. Did you still have any documentation updates to make before I merge?

@eliotwrobson
Collaborator Author

@caleb531 nope! I added the docs updates to the PR already (just noting the quantifier inclusion and changing the default alphabet), so I think this is good to go. If we feel like there's more docs needed, that can be done before this goes on main.

@caleb531
Owner

@eliotwrobson Perfect! Will approve and merge now, then. Thank you for your work on this functionality!

@caleb531 caleb531 changed the title from "Quantifiers" to "Add support for regex quantifiers" Mar 31, 2023
@caleb531 caleb531 merged commit 22815c2 into caleb531:develop Mar 31, 2023
3 participants