-
Notifications
You must be signed in to change notification settings - Fork 257
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support wildcards in reverse mode --exclude #367
Conversation
"Store exclusion patterns as plaintext" is done.
Edit: Never mind, that test needs to be changed when using the gitignore matcher. |
Thanks for working on this! gitignore format seems like a good choice. The only thing I'm a little concerned about is that this could break existing users of "--exclude". |
There is indeed a change in functionality, previously '--exclude some_file.txt' only excluded some_file.txt in the root of the filesystem, now it will exclude a file with that name anywhere. However, by adding a '/' before the paths should work as before: '--exclude /some_file.txt' only excludes some_file.txt in the root. So it should be a simple migration. |
I have removed the duplicate calls to If you think this is really bad, feel free to skip this commit. The support for --exclude-from should be independent of this change (as a matter of fact, should be independent of all the other changes.) |
I think we should add a new flag, |
It can be done, but my plan is also to add an option to read from a file, and the user would be able to combine this new option with the existing ones is he wants to. It might be confusing if --exclude-from supports one syntax and the command line ones support two. There could be two --exclude-from variants, but then the number of options might get out of hand :-) And But anyway, in the end it's your call. If you still prefer to keep |
The Idea:
|
Since exclusions will support wildcards, it's not possible to store the list of exclusions as a list of encrypted paths (calculated during initialization). Instead, the plain paths are stored, and the path to be checked is decrypted at the time of isExcluded check and compared with the stored plaintext paths.
isExcluded needs to decryptPath, so it was made to return the results of this function to that the Fuse callbacks don't need to call decryptPath again.
-exclude keeps the original behaviour by matching from the root of the mounted filesystem. The new option -exclude-wildcard (or -ew) matches anywhere (unless tha pattern begins with '/').
It allows the user to specify a file containg exclusion patters. The patterns are "wildcard" ones, that is, matching anywhere.
I believe the implementation is ready. |
// from decryptPath for convenience. | ||
func (rfs *ReverseFS) isExcludedCipher(relPath string) (bool, string, error) { | ||
if rfs.isTranslatedConfig(relPath) || rfs.isDirIV(relPath) { | ||
return false, "", nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since isDirIV
only checks the base name and not the full path, an attacker might be able to check for the existence of a directory by accessing DIRNAME/gocryptfs.diriv
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assume the user has specified that directory DIRNAME
should be excluded. With the old exclusion logic, all files and subdirectories (e.g., DIRNAME/gocryptfs.diriv
) were excluded as well.
With the new code, isExcludedCipher("DIRNAME/gocryptfs.diriv")
will return excluded=false
since the check is skipped for gocryptfs.diriv
files (see implementation of isDirIV
). I didn't find any easy way to exploit that, but the new behavior does not seem correct to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't exploit it either. Putting some debugging calls, I see that if I run stat pHIb3vvfW_V4E9G1svEJlw/gocryptfs.diriv
where pHIb3vvfW_V4E9G1svEJlw
is the encrypted name of an excluded directory, I see that GetAttr
(and then isExcludedCiper
) are called first for the parent directory (which returns false).
If there are several directories in the path, GetAttr
is called sequentially for each directory. I'm not sure where this behavior comes from.
That said, there might be a way to skip this parent directory checking. So I suppose that in isExcludedCipher
, if the file is a special file (DirIV or NameFile) I should check if the containing directory is excluded. For TranslatedConfig I think this is unnecessary as it only appears in the root directory, and for regular files the actual pattern matching handles the exclusion correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure where this behavior comes from.
The kernel is responsible for the GetAttr
call, however, I think its a bad idea to rely on it as a security feature. There might be ways to prevent it from happening (the kernel caches such information under specific circumstances, and does not issue another request before some time has passed), and it also wouldn't be called when gocryptfs is embedded into a third party app that uses the FS interface directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For DirIV files, I've added a check whether the parent directory is excluded. For this case, the previous behavior should be back.
internal/fusefrontend_reverse/rfs.go
Outdated
@@ -304,10 +266,10 @@ func (rfs *ReverseFS) openDirPlaintextnames(relPath string, entries []fuse.DirEn | |||
|
|||
// OpenDir - FUSE readdir call | |||
func (rfs *ReverseFS) OpenDir(cipherPath string, context *fuse.Context) ([]fuse.DirEntry, fuse.Status) { | |||
if rfs.isExcluded(cipherPath) { | |||
excluded, relPath, err := rfs.isExcludedCipher(cipherPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this patch applied, special paths (translated configs, dir IVs and name files) will be translated to incorrect paths. Opening any of the special paths should fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To provide an example what could be going wrong here, assume cipherPath
ends with .name
. In practice this shouldn't happen because name-files are regular files and not directories, and FUSE will check the file type before calling any operations - however, for now assume that the user can somehow trick FUSE into issuing such an operation. With the old code decrypting a filename would have failed.
With the updated code, isExcludedCipher
will silently remove the .name
-extension, and behave like the user opened some different path. A FUSE request to open the translated config or a gocryptfs.diriv
file will now also behave different, and try to open the root Cipherdir
(since isExcludedCipher
will translate those paths to relPath = ""
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems even harder to exploit, but I believe that the check if the parent directory is excluded should also prevent this case and also the Readlink
one below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think checking the parent directory is sufficient here. Opening a .name
file with OpenDir
should never work, no matter if the parent directory is excluded or not.
Previously rfs.decryptPath
failed for such paths. Maybe we should now add explicit checks and disallow operations that do not make any sense? Maybe isExcludedCipher
could even return the file type (regular file, name file, diriv file, ...)? @rfjakob: Any opinions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's not necessary to return the file type; just a boolean indicating if it's a virtual file should be enough to add a check in OpenDir
and Readlink
for those kinds of files and stop access to them. Or do you see the need for different behavior for each kind of special file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For OpenDir
and Readlink
it would be sufficient, but in some functions the file type is required anyway - see, e.g., the Open()
FUSE call. I am not sure what a suitable design would be, though.
internal/fusefrontend_reverse/rfs.go
Outdated
return entries | ||
} | ||
filtered = make([]fuse.DirEntry, 0, len(entries)) | ||
for _, entry := range entries { | ||
// filepath.Join handles the case of cipherPath="" correctly: | ||
// Join("", "foo") -> "foo". This does not: cipherPath + "/" + name" | ||
p := filepath.Join(cDir, entry.Name) | ||
if rfs.isExcluded(p) { | ||
if excluded, _, _ := rfs.isExcludedCipher(p); excluded { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit inefficient: The paths are first encrypted and then decrypted just to check if they are excluded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be fixed.
internal/fusefrontend_reverse/rfs.go
Outdated
@@ -417,10 +380,14 @@ func (rfs *ReverseFS) StatFs(relPath string) *fuse.StatfsOut { | |||
|
|||
// Readlink - FUSE call | |||
func (rfs *ReverseFS) Readlink(relPath string, context *fuse.Context) (string, fuse.Status) { | |||
if rfs.isExcluded(relPath) { | |||
excluded, pPath, err := rfs.isExcludedCipher(relPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The behavior here is also changed for special files (translated config, dirIV and name files).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty much the same issue as above. Previously decryptPath
(in openBackingDir
) would have failed for files ending with .name
or other special files (e.g., gocryptfs.diriv
). Such files should never be symlinks.
With the new code, however, the .name
extension is silently dropped. If the user would somehow trick FUSE into issuing a Readlink
operation on a name file, it would now return the result as if the command would have been applied on the file without the .name
extension.
Files are excluded before their names are encrypted, to avoid having the names decrypted again when checking for exclusions.
As for the other items in the review, I don't quite understand the problems. Could you perhaps describe them in some more detail, if possible showing an example? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added some more detailed explanations, please let me know if there is still anything unclear.
// from decryptPath for convenience. | ||
func (rfs *ReverseFS) isExcludedCipher(relPath string) (bool, string, error) { | ||
if rfs.isTranslatedConfig(relPath) || rfs.isDirIV(relPath) { | ||
return false, "", nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assume the user has specified that directory DIRNAME
should be excluded. With the old exclusion logic, all files and subdirectories (e.g., DIRNAME/gocryptfs.diriv
) were excluded as well.
With the new code, isExcludedCipher("DIRNAME/gocryptfs.diriv")
will return excluded=false
since the check is skipped for gocryptfs.diriv
files (see implementation of isDirIV
). I didn't find any easy way to exploit that, but the new behavior does not seem correct to me.
internal/fusefrontend_reverse/rfs.go
Outdated
@@ -304,10 +266,10 @@ func (rfs *ReverseFS) openDirPlaintextnames(relPath string, entries []fuse.DirEn | |||
|
|||
// OpenDir - FUSE readdir call | |||
func (rfs *ReverseFS) OpenDir(cipherPath string, context *fuse.Context) ([]fuse.DirEntry, fuse.Status) { | |||
if rfs.isExcluded(cipherPath) { | |||
excluded, relPath, err := rfs.isExcludedCipher(cipherPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To provide an example what could be going wrong here, assume cipherPath
ends with .name
. In practice this shouldn't happen because name-files are regular files and not directories, and FUSE will check the file type before calling any operations - however, for now assume that the user can somehow trick FUSE into issuing such an operation. With the old code decrypting a filename would have failed.
With the updated code, isExcludedCipher
will silently remove the .name
-extension, and behave like the user opened some different path. A FUSE request to open the translated config or a gocryptfs.diriv
file will now also behave different, and try to open the root Cipherdir
(since isExcludedCipher
will translate those paths to relPath = ""
).
internal/fusefrontend_reverse/rfs.go
Outdated
@@ -417,10 +380,14 @@ func (rfs *ReverseFS) StatFs(relPath string) *fuse.StatfsOut { | |||
|
|||
// Readlink - FUSE call | |||
func (rfs *ReverseFS) Readlink(relPath string, context *fuse.Context) (string, fuse.Status) { | |||
if rfs.isExcluded(relPath) { | |||
excluded, pPath, err := rfs.isExcludedCipher(relPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty much the same issue as above. Previously decryptPath
(in openBackingDir
) would have failed for files ending with .name
or other special files (e.g., gocryptfs.diriv
). Such files should never be symlinks.
With the new code, however, the .name
extension is silently dropped. If the user would somehow trick FUSE into issuing a Readlink
operation on a name file, it would now return the result as if the command would have been applied on the file without the .name
extension.
This was done to allow it to be replace with a mock object in tests (specifically in excluder_test.go).
Assuming a directory SOMEDIR (possibly deep in the filesystem hierarchy) is excluded, now SOMEDIR/gocryptfs.diriv is considered excluded. If it were not excluded, it might be possible to infer the existence of SOMEDIR by checking for SOMEDIR/gocryptfs.diriv. In practice I've not been able to actually exploit this, as fuse or the kernel tries to open each directory in a path before actually trying to open (or stat) a file, but it's safer if gocryptfs makes a further check.
I think adding check in But I didn't really understood @slackner's last comment about All that said, this is my first time working with Fuse, so I'm sure people with more experience can see some things I might be missing. |
@slackner, was your idea of returning the file type from |
Unfortunately, I don't think all of the issues are resolved yet. From my point of view, the main issue is still that special files (dirIV files, name files, ...) are not handled correctly in all FUSE calls (e.g.,
I suggested this for two reasons. The first one is to avoid the duplicate checks. Even more important, however, is that we explicitly think about which file types are allowed in which FUSE call. Returning the file type from |
Finally, As for returning the type in So there might be a performance gain. But perhaps this gain is negligible in comparison to the time necessary to do disk I/O to read the file or its metadata. Do any of you have any idea if avoiding the duplicate file type check would actually be helpful? One disadvantage of doing that is that |
I would expect that the performance gain is negligible, however, having checks at two different places seems a bit error-prone to me. Assume we introduce new file types at some point, then a check like I agree that we should probably try to come up with a better name when the function includes even more logic. |
Looks pretty solid to me now. As a merge preparation, I have rebased to master and squashed: https://github.com/rfjakob/gocryptfs/commits/better-exclude |
Merged as 3bc100a , thanks! |
This adds support for gitignore-like wildcards and exclude patters in reverse mode. It (somewhat) fixes #273: no regexp support, but the syntax should be powerful enough to satisfy most needs.
Also, since adding a lot of --exclude options can be tedious, it adds the --exclude-from option to read patterns from a file (or files).
Here are the steps that I've mapped:
Since there will be wildcards, it's not possible anymore to save a list of encrypted paths (calculated during initialization) and match against this list. Instead, in
isExcluded
the path is decrypted and matched against the plaintext excluded paths.decryptPath
In
Open
, for example, the path is decrypted in theisExcluded
check, and then whennewFile
is called to read the real file, the path is decrypted again. It might be possible to optimize the code to avoid decrypting it twice in this and the other Fuse callbacks. AlsonewFile
callsisExcluded
as a safeguard, generating yet another decryption.I'm thiking of using https://github.com/sabhiram/go-gitignore to do the checking.