Replies: 2 comments
-
If you can translate the regex with backreferences to an actual regular expression, then, yes, of course, you could generate a DFA. But if you're talking about a regex with over a million different branches, then good luck. It probably won't go over well. And note that this crate has a limit on the total number of states in a DFA that it can generate (based on how many states can be represented in ~32 bits). That limit might seem constraining, but in practice, a DFA that big is going to take a very, very long time to generate and will generally be completely unwieldy to actually use.
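
To make that translation concrete, here is a minimal sketch (plain Rust, standard library only) of expanding a pattern of the shape `.*(.)(.)\2\1.*` into a backreference-free alternation when every match has a fixed length. The function `expand_palindrome` and its parameters are hypothetical names used for illustration, not part of any crate, and the alphabet is assumed to contain no regex metacharacters.

```rust
/// Expand a pattern with `groups` single-character capture groups followed by
/// their mirrored backreferences (e.g. `.*(.)(.)\2\1.*` for `groups == 2`)
/// into plain alternation branches, assuming every match is exactly `len`
/// characters long and the captured characters come from `alphabet`.
/// Assumption: `alphabet` contains no regex metacharacters.
fn expand_palindrome(alphabet: &[char], groups: usize, len: usize) -> Vec<String> {
    let width = 2 * groups;
    if len < width {
        return Vec::new();
    }

    // Enumerate every possible assignment of characters to the capture groups.
    let mut halves: Vec<String> = vec![String::new()];
    for _ in 0..groups {
        let mut next = Vec::with_capacity(halves.len() * alphabet.len());
        for half in &halves {
            for &c in alphabet {
                let mut s = half.clone();
                s.push(c);
                next.push(s);
            }
        }
        halves = next;
    }

    // For each assignment, slide the mirrored block across every offset and
    // pad the remainder with `.` so each branch matches exactly `len` chars.
    let mut branches = Vec::new();
    for half in &halves {
        let mirrored: String = half.chars().rev().collect();
        for offset in 0..=(len - width) {
            branches.push(format!(
                "{}{}{}{}",
                ".".repeat(offset),
                half,
                mirrored,
                ".".repeat(len - width - offset)
            ));
        }
    }
    branches
}

fn main() {
    let branches = expand_palindrome(&['A', 'B'], 2, 5);
    println!("{} branches", branches.len());
    println!("({})", branches.join("|"));
}
```

The branch count is `(len - 2 * groups + 1) * alphabet.len()^groups`, so it grows exponentially in the number of groups. The toy call above yields 8 branches; a 26-letter alphabet with four groups is where the count runs into the millions.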
Well, what would the API you want actually look like? I claim that it would basically amount to what's on the |
-
If you're just doing this "for fun," then one very quick way to do this would be to just fork |
-
Hello! I understand that under most circumstances, a regex using backreferences cannot be recognized by a DFA since the language isn't regular. However, does this change if the strings recognized by that language are of a predetermined length?
My question is inspired by regex crosswords and a brief section on a blog post about solving them. The MIT crossword regex `.*(.)(.)(.)(.)\4\3\2\1.*` would normally be non-regular due to the backreferences, but given that the string is known to be of length 12, the regex can be expanded to `(....AAAAAAAA|...AAAAAAAA.|..AAAAAAAA..|.AAAAAAAA...|AAAAAAAA....|....AAABBAAA|...AAABBAAA.|etc.)`. There are a lot of possibilities here (`5 * 26^4`, or 2,284,880 by my math, since the 8-character block can start at 5 different offsets) but a finite number, so could a DFA be constructed?
Given this hypothetical DFA, possibly even a `regex_automata::dfa::dense::DFA`, I then wanted to apply the ideas within another blog post, which performs DFA intersections, to create a linear DFA that recognizes the entire crossword. However, I don't see any APIs for `DFA` that allow retrieving the states directly, only APIs for walking the DFA given character inputs. Do they exist?
Another idea would be to take the DFA, perform a topological sort to find the set of possible characters at a given position in the input string, and perform constraint solving and searching/backtracking to solve the regex crossword. This would also require access to the DFA's states. Is this possible?
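
To illustrate the per-position idea, here is a rough sketch over a hand-rolled DFA with explicit states. The `Dfa` struct, `allowed_chars`, and `example_dfa` are hypothetical names invented for this example and are not `regex_automata` APIs. Instead of a literal topological sort, it uses layered forward and backward reachability, which gives the same per-position character sets when every accepted string has a known length.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// A hypothetical hand-rolled DFA whose states can be inspected directly.
struct Dfa {
    start: usize,
    accepting: Vec<bool>,
    /// transitions[state][ch] = next state; a missing entry means "dead".
    transitions: Vec<BTreeMap<char, usize>>,
}

/// For each position in a string of exactly `len` characters, collect the
/// characters that occur at that position in at least one accepted string.
fn allowed_chars(dfa: &Dfa, len: usize) -> Vec<BTreeSet<char>> {
    let n = dfa.accepting.len();

    // forward[i][s]: state s is reachable from the start after i characters.
    let mut forward = vec![vec![false; n]; len + 1];
    forward[0][dfa.start] = true;
    for i in 0..len {
        for s in 0..n {
            if forward[i][s] {
                for &t in dfa.transitions[s].values() {
                    forward[i + 1][t] = true;
                }
            }
        }
    }

    // backward[i][s]: an accepting state is reachable from s in exactly
    // len - i more characters.
    let mut backward = vec![vec![false; n]; len + 1];
    backward[len] = dfa.accepting.clone();
    for i in (0..len).rev() {
        for s in 0..n {
            let ok = dfa.transitions[s].values().any(|&t| backward[i + 1][t]);
            backward[i][s] = ok;
        }
    }

    // A character is allowed at position i if some live state at layer i
    // steps on it into a state that can still reach acceptance.
    let mut result = vec![BTreeSet::new(); len];
    for i in 0..len {
        for s in 0..n {
            if !forward[i][s] {
                continue;
            }
            for (&c, &t) in &dfa.transitions[s] {
                if backward[i + 1][t] {
                    result[i].insert(c);
                }
            }
        }
    }
    result
}

fn edges(pairs: &[(char, usize)]) -> BTreeMap<char, usize> {
    pairs.iter().copied().collect()
}

/// Toy DFA accepting ABA, ABB, BBA and BBB; state 4 is a dead end.
fn example_dfa() -> Dfa {
    Dfa {
        start: 0,
        accepting: vec![false, false, false, false, false, true],
        transitions: vec![
            edges(&[('A', 1), ('B', 2)]),
            edges(&[('B', 3)]),
            edges(&[('A', 4), ('B', 3)]),
            edges(&[('A', 5), ('B', 5)]),
            edges(&[]),
            edges(&[]),
        ],
    }
}

fn main() {
    for (i, set) in allowed_chars(&example_dfa(), 3).iter().enumerate() {
        println!("position {}: {:?}", i, set);
    }
}
```

In the toy DFA, the `A` transition out of state 2 leads to a dead end, so `A` is pruned at position 1 even though a transition for it exists. Those per-position sets are the kind of candidates a constraint solver or backtracking search would consume.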
Sorry for what amounted to a bit of a ramble. The `regex` crate and friends are excellent, and I wanted to evaluate them before I dove into building a custom DFA solution.