how to iterate over matches where the next match must start immediately following the previous match? #888
The docs at https://docs.rs/regex/1.6.0/regex/#example-iterating-over-capture-groups show nicely how to iterate over a longer string. Is there an anchor similar to ^ or \A that requires each match to start exactly where the previous iteration's match ended? ^ and \A seem to make the match fail when used with captures_iter.
No, there is no such anchor. The reason is that such a thing is not really a property of the regex or its match semantics. A regex itself doesn't have any concept of iteration. Iteration is a protocol built on top of the regex pattern. So in order to do something like that, you can't use captures_iter. You'll have to roll your own iteration logic. For example, it's very easy to do if you can assume that your regex never produces any empty matches (if it can, then the logic below won't terminate):

    use regex::Regex;

    fn main() {
        let mut haystack = "abcdefghi jklmnopqr";
        // The leading '^' forces every match to begin at the start of the
        // (remaining) haystack.
        let re = Regex::new(r"^\w{3}").unwrap();
        while let Some(caps) = re.captures(haystack) {
            dbg!(&caps);
            // This unwrap is OK because '0' corresponds to the overall match,
            // and we clearly have a match. Note though, that this logic only works
            // if the regex can never produce a zero-width match. Clearly, '\w{3}'
            // can never produce an empty match.
            haystack = &haystack[caps.get(0).unwrap().end()..];
        }
    }

Playground: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=3ce26d6073a5b1637eb5086c7fa7252c