-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support non-utf-8 file names #210
Comments
Yup, this is a silly bug. Note that if you run pub fn parse() -> Result<Args> {
// Get all of the arguments, being careful to require valid UTF-8.
let mut argv = vec![];
for arg in env::args_os() {
match arg.into_string() {
Ok(s) => argv.push(s),
Err(s) => {
errored!("Argument '{}' is not valid UTF-8. \
Use hex escape sequences to match arbitrary \
bytes in a pattern (e.g., \\xFF).",
s.to_string_lossy());
}
}
}
let mut raw: RawArgs =
Docopt::new(USAGE)
.and_then(|d| d.argv(argv).version(Some(version())).decode())
.unwrap_or_else(|e| e.exit()); The specific reason why I did this was to catch invalid UTF-8 in patterns, but of course, this catches invalid UTF-8 everywhere. Finally, Docopt has this:
... and the argv parser is built around Sigh. I should probably fix this inside of Docopt, but I also want to switch to clap. See #136. |
Yes, clap makes it possible to do this correctly, it's nice that way. |
The reason I got to this point was that I'm trying to make a way to list matches in rg in latest-modified-first order. First trying to just feed it files in that order. |
Oh interesting. You'll also need to pass |
There were two important reasons for the switch: 1. Performance. Docopt does poorly when the argv becomes large, which is a reasonable common use case for search tools. (e.g., use with xargs) 2. Better failure modes. Clap knows a lot more about how a particular argv might be invalid, and can therefore provide much clearer error messages. While both were important, (1) made it urgent. Note that since Clap requires at least Rust 1.11, this will in turn increase the minimum Rust version supported by ripgrep from Rust 1.9 to Rust 1.11. It is therefore a breaking change, so the soonest release of ripgrep with Clap will have to be 0.3. There is also at least one subtle breaking change in real usage. Previous to this commit, this used to work: rg -e -foo Where this would cause ripgrep to search for the string `-foo`. Clap currently has problems supporting this use case (see: clap-rs/clap#742), but it can be worked around by using this instead: rg -e [-]foo or even rg [-]foo and this still works: rg -- -foo This commit also adds Bash, Fish and PowerShell completion files to the release, fixes a bug that prevented ripgrep from working on file paths containing invalid UTF-8 and shows short descriptions in the output of `-h` but longer descriptions in the output of `--help`. Fixes #136, #189, #210, #230
File names on linux are just bags of bytes. Regular shell tools work with them no problem:
The main annoyance is that it poisons the whole directory with ripgrep:
Doing the right thing should be easy - an argument is always an OsStr when it enters the rust program, which is losslessly converted to a
Path
.The text was updated successfully, but these errors were encountered: