ripgrep 14.1.0 exit silently when stdin is not connected (GNU parallel without --tty) #2806
Comments
This is the logic that detects whether stdin is readable or not: Lines 152 to 201 in b6ef99e
This was the most recent change to the stdin detection as far as I know: 020c5453a5880257c87949030550d24c596f5c8d Dealing with stdin is fundamentally a heuristic. And many programs do process control in a way that leaves stdin looking like there's something to read on it. This happened with neovim in #1892 and culminated in a change to neovim itself in neovim/neovim#14812.
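To make the "heuristic" concrete, here is a minimal shell sketch of this kind of check. It is not ripgrep's code (that lives in the lines linked above). It assumes a heuristic that treats stdin as searchable only when it is not a terminal and is a regular file, FIFO, or socket. The function names `stdin_kind` and `stdin_looks_readable` are made up for illustration, and the sketch relies on `/dev/stdin` resolving to file descriptor 0, as it does on Linux.

```shell
#!/bin/sh
# Hypothetical helper: report what kind of object stdin is.
# Assumes /dev/stdin resolves to file descriptor 0 (true on Linux).
stdin_kind() {
    if [ -t 0 ]; then
        echo tty
    elif [ -p /dev/stdin ]; then
        echo fifo
    elif [ -S /dev/stdin ]; then
        echo socket
    elif [ -f /dev/stdin ]; then
        echo file
    else
        echo other
    fi
}

# Assumed heuristic (an illustration, not ripgrep's exact rule):
# only a non-terminal regular file, FIFO, or socket counts as "readable".
stdin_looks_readable() {
    case "$(stdin_kind)" in
        file|fifo|socket) return 0 ;;
        *) return 1 ;;
    esac
}
```

Under this sketch, a parent process that hands its child a pipe or socket pair makes stdin "look readable" even if nothing will ever be written to it, which is the situation the neovim issue describes.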
It just can't. This is the downside of making |
I don't understand where that |
Yes, that must've been a copy/paste error on my part. I had originally invoked without
But in this case, I suspect that parallel doesn't connect a stdin at all. In this case, why not stop reading from stdin?
At the very least, I don't believe it should exit with -1 without stating what's wrong to the user. It took me too long to find that the workaround is to pass
I'm not sure why it tries to do this (twice?) and then exit(1) afterwards. I can't get Would it not be possible to just fall back to "don't read from stdin" if one cannot read from stdin? |
Stdin heuristic detection is complicated and opaque enough that it's worth having easy access to the complete story that leads ripgrep to decide whether to search stdin or not. Ref #2806
With #2807, I added some extra details to
The relevant lines for the
Specifically, notice that
The above logging demonstrates otherwise unfortunately.
ripgrep cannot tell the difference between "ripgrep searched something the user did not intend and thus should produce an error" and "ripgrep searched something the user intended and could not find a match."
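The ambiguity can be reproduced with plain grep, used here as a stand-in because it follows the same "exit 1 when nothing matched" convention. In this small sketch, one search is intentional and finds nothing, and the other reads an empty stdin nobody meant to search. The variable names are illustrative only.

```shell
#!/bin/sh
# Case 1: the user deliberately searches real input that contains no match.
intended_status=0
printf 'alpha\nbeta\n' | grep -q gamma || intended_status=$?

# Case 2: the tool is handed an empty stdin that the user never meant to search.
unintended_status=0
grep -q gamma < /dev/null || unintended_status=$?

# Both statuses are 1: the exit code alone cannot separate the two cases.
echo "intended: $intended_status, unintended: $unintended_status"
```

From inside the searching process, both cases look like "read the input, found no match", so there is no obvious point at which to raise an error.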
Because There is no "crashing" here. I'm not sure why you reported it that way... Let's please flip the script here. Instead of saying "the current behavior is not ideal" (which I would absolutely agree with), please instead suggest concrete changes. And please do so after reading the linked neovim issue. |
First off, thanks for engaging on this. I appreciate it. I apologize if I came off as curt; it was not my intention. Replying:

On Neovim: I read the Neovim issue and see that they added a stdin-closing option to the job API for that. I think I experienced a similar issue on another CLI that had to do with Neovim. The issue in that case was that the application did not expect to be passed a socket pair (libuv, which Neovim uses, creates the pipes connecting a subprocess with socketpair(2), not pipe(2)). AFAIK using socket pairs for stdio is the default on some OSes (the BSDs? I can't recall). Back then I investigated whether we could just switch to pipe(2), which would have made the application I was dealing with work. I forget why, but eventually I decided this had a low chance of success: libuv switched to socketpair(2) for a reason, and if I recall correctly the commit message isn't very clear about it, so Chesterton's fence applies.

Anyway, back to the issue. Perhaps we should ask: what is the use of passing an empty file? Can ripgrep ever do anything useful if it gets passed an empty file? If the answer is no, perhaps it could just act as if stdin is not connected when reading from it succeeds but produces 0 bytes.

I also changed the issue title (s/crash/exit/), as it's clear now that this behaviour is intentional.
There is no way that will ever happen. Passing an empty file isn't necessarily something someone does intentionally. An empty file is still a valid haystack to search. Doing I appreciate that you're trying to make a concrete suggestion here, but I'm pretty sure there is nothing to be done here because the user's intent cannot be discerned. It is a failing of the Unix CLI conventions (of which, stdin detection is part of it). You have two choices. First is to make the calling program stop advertising stdin as a readable file descriptor. Second is to disable ripgrep's heuristic stdin detection by passing The only real way to avoid something like this is to build a program that doesn't change its behavior based on what stdin is. But then common usage scenarios get more verbose. ripgrep does the intuitive thing correctly 99% of the time in exchange for doing the unintuitive thing with a bad failure mode in certain cases. And usually that's because of the parent process doing something they probably shouldn't be doing... Like advertising a readable stdin. |
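The two choices can be sketched as a small decision function. This is a hypothetical model, not ripgrep's implementation: `choose_haystack` is an invented name, the file/FIFO/socket test is an assumed heuristic, and redirecting from `/dev/null` is just one assumed way a caller might avoid advertising a readable stdin. It relies on `/dev/stdin` resolving to file descriptor 0, as on Linux.

```shell
#!/bin/sh
# Hypothetical decision helper: prints which haystack a search tool would pick.
# Usage: choose_haystack [PATH...]
choose_haystack() {
    if [ "$#" -gt 0 ]; then
        # Explicit paths win; stdin is never consulted, so no guessing happens.
        echo "paths: $*"
    elif [ ! -t 0 ] && { [ -p /dev/stdin ] || [ -S /dev/stdin ] || [ -f /dev/stdin ]; }; then
        # No paths given and stdin looks readable: guess that it is the haystack.
        echo "stdin"
    else
        # No paths and nothing readable on stdin: fall back to the current directory.
        echo "paths: ./"
    fi
}

# Guessing kicks in: a pipe on stdin is taken as the haystack.
echo x | choose_haystack
# Choice 1 (caller side): do not advertise a readable stdin.
choose_haystack < /dev/null
# Choice 2 (invocation side): name the haystack explicitly.
echo x | choose_haystack ./
```

In this model only the first call picks stdin; the other two pick the current directory.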
You're right, I hadn't thought of that use case, mainly because I tend to use normal grep for this. I reach for ripgrep when I need to sensibly search "everything". It doesn't make sense to break existing users of this, even if it were somehow decided that ignoring empty files were reasonable.
Hopefully I can get around to writing this up for the GNU parallel author. But something tells me that changing its behaviour to supplying a closed stdin would trigger other types of unpalatable failures in other programs. Can't please everyone. Thanks for your thoughts. |
Yeah, I very specifically designed ripgrep to be able to drop-in for grep in most cases. Obviously it has some different behaviors and flags, but in most cases, you should be able to replace
I think this is likely, yes, although I don't know what it is. If you do go through with this and get a concrete answer to this, I'd be most appreciative if you could share it here. Because it would complete the "nothing can be done" picture here. neovim, for example, I believe added an option to not pass a readable stdin, but I don't believe it changed its default behavior. So there is likely something here. Whether it's just another form of legacy or something more "legitimate" though, I dunno. |
EDIT: wrong |
What? No. Just tell ripgrep what to search and it will behave like grep. The only reason any of this is an issue is because ripgrep tries to guess the user's intent when a file or directory are not specified. But you can remove ripgrep's guess work by treating it like grep and providing an explicit file or directory to search. For example, |
whoops, sorry, I was confused. I was sillily using
f=$(cat files.txt); parallel rg -io {1} $f :::: strings.txt
In a codebase, I wanted to find which files queried a database table, and whether or not they mentioned any fields from a list of fields to be deprecated from the table.
time ( parallel rg -io {2} {1} :::: <(rg -il 'SCHEMA\.TABLE\w+' .) dep-fields.txt )
time ( f=$(rg -il 'SCHEMA\.TABLE\w+' .); parallel rg -io {1} $f :::: dep-fields.txt )
time ( parallel grep -iHno {2} {1} :::: <(grep -irl -E 'SCHEMA\.TABLE\w+' .) dep-fields.txt )
time ( f=$(grep -irl -E 'SCHEMA\.TABLE\w+' .); parallel grep -iHno {1} $f :::: dep-fields.txt )
|
Yeah you got it! |
What version of ripgrep are you using?
How did you install ripgrep?
I ran
What operating system are you using ripgrep on?
Debian Testing
Describe your bug.
When run inside of GNU parallel, ripgrep produces nothing. Running the same command outside of parallel (copying the output of its -v) does work. Passing the --tty flag to GNU parallel avoids the issue. I'm reasonably certain this used to work.
What are the steps to reproduce the behavior?
What is the actual behavior?
See above, output is included.
I expected the same output in parallel as without parallel. AFAIK this used to work. I'm unsure when it broke. The problem appears to be stdin detection. Turning it off with the ./ trick (appending to the end) makes things work. An strace of a bad run with -j 1:
This is what the last part looks like in a "good" run (outside of parallel):
What is the expected behavior?
I expected ripgrep to just ignore stdin, continue functioning, and print to stdout. As far as I can recall, this used to work (ripgrep inside parallel without any changes). It's possible that parallel changed its implementation; perhaps it used to have a dummy stdin connected. Still, I believe it should work. The workaround of adding ./ as the last argument does work, but the silence of the error makes it difficult to track down what the problem is. Ideally it should work automatically.