-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
find a way to make psub --fifo
safe from deadlock
#1040
Comments
Nice diagnosis and test case! |
Fix for CVE-2014-2906. Closes a race condition in funced which would allow execution of arbitrary code; closes a race condition in psub which would allow alternation of the data stream. Note that `psub -f` does not work (#1040); a fix should be committed separately for ease of maintenance. Closes #1437
Fix for CVE-2014-2906. Closes a race condition in funced which would allow execution of arbitrary code; closes a race condition in psub which would allow alternation of the data stream. Note that `psub -f` does not work (#1040); a fix should be committed separately for ease of maintenance.
Fix for CVE-2014-2906. Closes a race condition in funced which would allow execution of arbitrary code; closes a race condition in psub which would allow alternation of the data stream. Note that `psub -f` does not work (#1040); a fix should be committed separately for ease of maintenance.
I've pushed a fix to a topic branch (19217f3 on psub_fix), but I don't want to merge to master until the buffering issue is worked out. |
Any progress on this? |
That change was a bit too eager as the mkfifo route doesn't currently work. See fish-shell#1040 and fish-shell#2052. This reverts commit a17b9fd.
@faho Yes. And no. But, yes. But before we get to the good stuff, lets review:
command <(process) ... The standard output of and process >(command) ... The standard input of This syntax is by no means standard, and it certainly is not POSIX. It is supported by The command (process | psub) (purportedly) capitalizes on the fact that no special syntax is needed to perform "process substitution"; the ordinary syntax of command substitution (
Basically, I propose we use a ramdisk. It combines to the "durability", ability to handle large files and non-streamable data of "file substitution" (?) (zsh @ridiculousfish @zanchey @anyone @anyone @bueller... Initial thoughts? function psub --description "Process substitution, revisited."
set -l filename
set -l funcname
set -l halfmem
set -l sectors
set -l ramdisk
set -l mountpoint
set -l psubdir
set -l use_file 0
while set -q argv[1]
switch $argv[1]
case -h --help
__fish_print_help psub
return 0
case -f --file
set use_file 1
end
set -e argv[1]
end
if not status --is-command-substitution
echo psub: Not inside of command substitution >&2
return 1
end
if test -z "$TMPDIR"
set TMPDIR /tmp
end
# Implemention for other systems is left as an exercise for the reader.
if test (uname) != Darwin
set use_file 1 # ... mount -t tmpfs ...
end
if test $use_file -eq 1
while not set psubdir (mktemp -d $TMPDIR/.psub.XXXXXXXXXX); end
chmod 0300 $psubdir
while not set filename (mktemp $psubdir/temp.XXXXXXXXXX); end
chmod 0100 $psubdir
chmod 0200 $filename
else
# Basically: detect memory and use 1/2 of it (the default with tmpfs
# on other platforms) as a ramdisk. The memory is allocated as needed.
# `hdid -nomount ram://SECTORS` (of 512 bytes) on Darwin.
set halfmem (math (sysctl -n hw.memsize) / 2)
set sectors (math $halfmem / 512)
set ramdisk (hdid -nomount ram://$sectors | tr -d [:space:])
chmod 0600 $ramdisk
# $ramdisk is now something like '/dev/disk2'; it would be nice if we
# could just use the raw device file as $filename, but if we do that
# there's no EOF. So we format, mount, and use a tempfile on our
# ramdisk. UDF or is probably as good as anything. We probably just
# don't want any filesystem that's journaled to reduce overhead.
while not set mountpoint (mktemp -d /$TMPDIR/.psub.XXXXXXXXXX); end
chmod 0300 $mountpoint
newfs_udf $ramdisk >/dev/null 2>&1
mount_udf -o nobrowse $ramdisk $mountpoint
while not set psubdir (mktemp -d $mountpoint/.psub.XXXXXXXXXX); end
chmod 0300 $psubdir
while not set filename (mktemp $psubdir/temp.XXXXXXXXXX); end
chmod 0100 $mountpoint
chmod 0100 $psubdir
chmod 0200 $filename
end
# Write stdin to tempfile
cat > $filename
chmod 0400 $filename
# Write filename to stdout
echo $filename
# Find unique function name
while true
set funcname __fish_psub_(random)
if not functions $funcname >/dev/null 2>&1
break
end
end
# Make sure we unmount and detatch when caller exits.
function $funcname --on-job-exit caller --inherit-variable filename --inherit-variable funcname --inherit-variable use_file --inherit-variable ramdisk --inherit-variable mountpoint --inherit-variable psubdir
chmod 0700 $psubdir $filename
if test $use_file -eq 0
umount -f $mountpoint >/dev/null 2>&1
which hdiutil >/dev/null 2>&1
and hdiutil detach $ramdisk >/dev/null 2>&1
chmod 0700 $mountpoint
end
command rm -rf $mountpoint $psubdir $filename
functions -e $funcname
end
end |
I don't think so - it could be improved but I don't see much that is wrong. A bit imprecise and awkward maybe but assuming psub were bug-free it would mostly be sort-of correct. Anyway, this isn't really important to the matter at hand - we should improve psub documentation, but mostly we should just improve psub.
Or try to
Nope - "mount: only root can use "--types" option". (There is probably a way, but that one's not it)
This doesn't seem great - you're creating a ramdisk with one tool and then only detach it if another tool exists? Anyway, I don't see the merits of this approach for my system - /tmp (where current psub stores its fifo or file) is already a tmpfs. As my cursory googling shows, it's the default on archlinux (my distro), Fedora, Plus, fifos have an advantage that you don't have here - they can be filled in the background, which means the reading side can start earlier (which is presumably why zsh offers both fifos and files). Check the current psub source (try not to step on the bugs) - the fifo path does so, while the file path does not. Also, this increases the code complexity, especially the amount of OS-specific code. Can't say I'm a fan. |
It is important if you one wants to know what we're trying to accomplish with this. Line-by-line:
Every sentence.
You're quoting right where I dropped off there, so I'm not sure we actually disagree here. But my point here is threefold:
No, while I probably don't need this guard line any more as I've since wrapped it in an ~> asdfasdf >/dev/null 2>&1
fish: Unknown command 'asdfasdf'
Perhaps you missed my joke:
That is to say, it's likely already implemented. As in, when your init scripts ran
There's nothing preventing one from reading a regular file while its still being written.
There is ample precedent for this. There is an immense ammount of OS-specific code in fish.
Well... sorry, I guess? We can't all run Arch Linux. And I must say, pretty rude IMO, considering I only did any of this in light of the fact that you specifically asked for "progress" on this issue. I guess I interpreted that to mean more meaningful/fundamental improvements, since the forking problem is a much larger issue, well beyond the scope of this here.. If all you're looking for is a once-off workaround for 'THIS FISH DON'T FORK!', and you just want |
Granted, but minor.
Sure that zsh does it that way?
"assuming psub were bug-free".
We don't - I was expanding on your point.
Ah okay. In that case, shouldn't you do that with anything? Or isn't the error output here kinda important? You're leaking tmpfss (if I understand correctly).
I was trying to say that it might be more complicated for other systems, though now I see that we probably could use /run on linux and just do the ramdisk setup on OSX/BSD.
Maybe we should consider running the
This seems a bit more complicated than most OS-specific paths.
For the record, I'm a bit annoyed by those, mostly since most of them are only used by the
If I came of as rude, I'm sorry about that. It was never my intention. I was merely trying to express my technical opinion of your code. Maybe I was too blunt - might be my inherent german-ness (germanity?) or my mastery of the english language. Anyway, I appreciate your willingness to help here, I just don't agree with your proposal.
I was more asking about @zanchey's topic branch and the work on the buffering issue. The buffering also bites us in other respects - look for bugs about functions running in the background, so it should be fixed anyway, which would also fix psub (well, that and the missing "$"). In that light, your ramdisk idea comes across as optimization work, and for that I didn't like the added complexity - the added forks (via e.g. |
Apology accepted.
mpb% echo <(echo) <(echo)
/dev/fd/11 /dev/fd/12
No, the error isn't important. And no, we're not leaking tmpfs's. Mac OS X (Darwin, technically) has a rather bizarre mechanism for userland disks.
Well, here you run essentially the opposite risk of using a buffer; if the reading process consumes at a faster rate than the outputting process it will "starve" and terminate.
So, I think you might be conflating issues with pipe buffer(s) with the issue of "to fork or not to fork" within a pipeline (ephemeral file descriptors). Pipe buffers are handled in-kernel, and are particularly relevant to FIFOs and real file descriptors. There is a separate issue, of where, when, how, and with what, to fork and/or create a new thread, within a pipelined chain of commands. See the lengthy discussion in #1228; in that vein, I still think |
Ummh...the error you'd be suppressing is "Unknown command 'hdiutil'" - this is about the
In that case are you leaking entries in /dev or are they reused?
So there is something stopping us from backgrounding writing to a regular file.
Currently, fish suffers worse, because it actually hangs. (Which IIUC is because of #238 - we never get to the reading before finishing the writing so if we can't finish the writing because the buffer is full...) Okay, let's look at the ramdisk stuff again: The setup on OSX is really rather complicated, while on linux we could use /run (with a fallback to /tmp). The advantage of this approach is that the data never hits the disk (unless of course it swaps) but is still seekable. The disadvantage is that even readers who can deal with waiting for data (like presumably So it is just straight up better than using on-disk files (discounting the code complexity), but not strictly better than fifos - which is how we'd again end up with offering two solutions (and letting the user decide between them since we can't, like zsh does).
I'm afraid I don't completely understand - how would that help with psub here? Wouldn't that still be tied to the buffering limitations?
Be careful what you wish for when it comes to bash and what things it does to /dev - or you might end up implementing /dev/tcp. |
As a systems administrator, I am terrified by the use of memory-backed filesystems as written. |
The topic branch has bitrotted, and wasn't a particularly novel fix so I've removed it while the rest of the issue is worked out. |
Could we not have two commands for now? A So adding the missing sigil gives us |
@jkabrg: See #2052 - the path that would be reached when adding that "$" is basically broken, so adding the sigil again would make it worse, not better. |
It's pretty clear that this is not going to be fixed as part of the 2.3.0 release milestone so I'm punting this back to next-2.x. |
I'm removing the "next-2.x" label because this has been open for three years. There is no reason to think this will become a priority to fix anytime soon. |
Hello there, I have encountered a situation where fish hangs upon process substitution, like in |
@urxvtcd: Yes, that's this issue. The workaround is to use something that can follow multiple files at the same time (like multitail). Or use multiple terminal windows with one tail each, or use e.g. tmux. Alternatively, and I recommend against this, you can make your own fifos (with Note that, if you background the |
While fixing #4222 I updated the |
psub --fifo
safe from deadlock
Fix for CVE-2014-2906. Closes a race condition in funced which would allow execution of arbitrary code; closes a race condition in psub which would allow alternation of the data stream. Note that `psub -f` does not work (fish-shell#1040); a fix should be committed separately for ease of maintenance.
Did 6c22c88 fix this? |
I don't think |
Whoops, I think the title of that commit message was reversed. |
It did not - This sort of looks like fish is waiting for the |
It seems to me that there is a fundamental semantic difference between pipe-based process substitution (ie. echo (cat 'some_"large"_file' | psub --fifo &) That looks very scary (to me, at least), but I think #238 might actually cause this to work consistently for me so far. Another thing that works is the following (This example is contrived and admittedly completely ridiculous): echo (cat 'some_"large"_file' | psub --fifo | xargs cat) Oof that hurts, but I think this illustrates quite clearly that the problem is not with the pipe, nor with psub, but with command substitution. While it may seem similar to pipe-based process substitution on the surface, their semantics are different. So different that trying to combine them together may be hard if not impossible (or a really really dirty hack).
Given this I would prefer to have a separate construct for pipe-based process substitution, so that I am aware that what I am about to do will spawn a background job, that it will be taken care of for me by fish and that what I get back will be a (named) pipe file name that will also be taken care of for me by fish. And also that I will be properly aware of all the gotchas (Of which I am sure there are plenty). I love fish for its adherence to a simple clean syntax, but if a construct has complex semantics like pipe-based process substitution, it had better stand out. This is clearly not the case right now, given how common command substitution is. |
In case nobody has made that case yet
cannot work unless The The only way to fix is to not wait for the producer. If there's a fifo, it's meant to be an IPC system, with commands running concurrently, not one after the other. Another problem with the current approach is that it creates an extra process ( The interface would probably have to change, like in:
With which you could do away with named pipe and use the far cleaner That And you could have a You'd still need to implement a |
ksh had process substitution already before ksh86. ksh86 is when it was first documented. In that version, the reading ones
I don't know when that one was removed. |
That's not true, when zsh and Byron Rakitzis's rc introduced theirs (in 1.0 in 1990 for zsh, 0.9 in 1991 for rc, a few years before bash, but after ksh's and Tom Duff's original rc), they were using named pipes only. It's only later that they switched to unnamed pipes and /dev/fd on systems that supported them (but still fell back to named pipes on others). That was in 2.6beta17 (1996) for zsh and 1.2 for rc (later in 1991) The ability to use named pipes when /dev/fd is not available was added to ksh93 in ksh93u+ in 2011. (in any case, that's no about numbered file descriptors but about being able to refer to file descriptors with /dev/fd/x). |
@stephane-chazelas Thanks for that, I was finding that things like You could do Edit: Although I guess it's similar to the syntax for moreutils pee (which implements input process substitution) |
I see a few paths forward for this:
I don't really love any of these though. However, I think most of these could also offer a path forward for #1786 as well.
|
There are actually a couple of bugs here.
The easy one is here —
use_fifo
is missing a sigil and is therefore a string comparison, causingpsub
to always act aspsub -f
.Unfortunately, it's not as simple as fixing that typo as doing so will cause a non-interruptable hang under certain circumstances. I believe it occurs when the pipe buffer is exceeded? But I'm not sure how to actually determine the pipe buffer in fish. Maybe forking another process is needed? Or... something.
Anyway, here's a (hopefully) cross-platform test case (about 1.5MiB and requires Java, uses openssl to decode base64). It's a standalone wrapper for rhino. There's a big hunk of base64 in the middle of it, but the script is just:
java -jar (echo 'BIGHUNKOFBASE64' | openssl base64 -d | psub)
The text was updated successfully, but these errors were encountered: