Word borders don't seem to be working with split #204

Closed
ghost opened this issue Apr 18, 2016 · 2 comments

ghost commented Apr 18, 2016

I expect this code:

use regex::Regex;

fn main() {
    // Split on every word boundary; `\b` is a zero-width match.
    let re = Regex::new(r"\b").unwrap();
    let v: Vec<&str> = re.split("Should this (work?)").collect();
    println!("{:?}", v);
}

To output this:

["", "Should", " ", "this", " (", "work", "?)"]

But it outputs this:

["", "Should", " this", " ", "(work", "?", ")"]
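For reference, the expected output can be reproduced without the regex crate. The following is a minimal sketch, assuming a simplified notion of a word character (alphanumerics plus underscore); the helper names `is_word_char`, `word_boundaries`, and `split_on_word_boundaries` are made up for illustration and are not part of any library.

```rust
/// Simplified word-character test (an assumption for this sketch):
/// alphanumerics plus underscore.
fn is_word_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

/// Byte offsets at which exactly one neighbor is a word character.
fn word_boundaries(text: &str) -> Vec<usize> {
    let mut offsets = Vec::new();
    // The position before the first character counts as non-word.
    let mut prev_is_word = false;
    for (i, c) in text.char_indices() {
        let cur_is_word = is_word_char(c);
        if cur_is_word != prev_is_word {
            offsets.push(i);
        }
        prev_is_word = cur_is_word;
    }
    // The end of the text is a boundary only if the last char is a word char.
    if prev_is_word {
        offsets.push(text.len());
    }
    offsets
}

/// Cut the text at every boundary, the way splitting on a zero-width
/// match is expected to behave.
fn split_on_word_boundaries(text: &str) -> Vec<&str> {
    let mut pieces = Vec::new();
    let mut last = 0;
    for b in word_boundaries(text) {
        pieces.push(&text[last..b]);
        last = b;
    }
    pieces.push(&text[last..]);
    pieces
}
```

Calling `split_on_word_boundaries("Should this (work?)")` cuts at byte offsets 0, 6, 7, 11, 13 and 17, which gives the seven pieces listed under "To output this" above.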

Meta

rustc 1.8.0 (db2939409 2016-04-11)
binary: rustc
commit-hash: db2939409db26ab4904372c82492cd3488e4c44e
commit-date: 2016-04-11
host: x86_64-apple-darwin
release: 1.8.0

ghost changed the title from "Word borders don't seem to be working" to "Word borders don't seem to be working with split" on Apr 18, 2016
BurntSushi added the bug label on Apr 23, 2016

BurntSushi (Member) commented

This is indeed a bug in how \b is being matched. It is related to the recent change that lets the DFA speculatively match \b, and the DFA is getting it wrong. Working on it now.

BurntSushi (Member) commented

Yup. I was totally mishandling how word boundaries were being computed for start states. A fix is in PR #211.
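To illustrate why the start of a search matters for \b, here is a hypothetical sketch. It is not the regex crate's DFA code and not necessarily the exact mistake fixed in PR #211; it only shows one way a boundary check can go wrong when a search begins in the middle of the text, as it does after each match in split. Both function names are invented for this example, and word characters are limited to ASCII for simplicity.

```rust
/// ASCII-only word-byte test (a simplification for this sketch).
fn is_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Whether a word boundary holds at byte offset `at`, looking at the
/// bytes on both sides regardless of where the search started.
fn is_word_boundary(text: &[u8], at: usize) -> bool {
    let before = at > 0 && is_word_byte(text[at - 1]);
    let after = at < text.len() && is_word_byte(text[at]);
    before != after
}

/// Deliberately wrong variant: pretends nothing precedes `search_start`,
/// so the byte just before the start position is never inspected.
fn is_word_boundary_ignoring_context(text: &[u8], at: usize, search_start: usize) -> bool {
    let before = at > search_start && is_word_byte(text[at - 1]);
    let after = at < text.len() && is_word_byte(text[at]);
    before != after
}
```

With the text `Should this` and a search starting at offset 3 (inside "Should"), `is_word_boundary` reports no boundary at offset 3, while the context-ignoring variant reports one, because it treats the start position as if it were the beginning of the text.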
