Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix tar xf with symlinks #54

Merged
merged 1 commit into from
Jul 5, 2023

Conversation

dscho
Copy link
Member

@dscho dscho commented Jun 27, 2023

When tar xf encounters a symbolic link, it seems to call fstatat() with a file descriptor referring to said symbolic link and a pathname that is empty. Apparently the intended behavior is to then operate on the symbolic link itself (i.e. not on its target, unlike fstat() would). This behavior was broken in Cygwin v3.4.7, and here is a fix for that bug.

In 4b82229 (Cygwin: fix errno values set by readlinkat, 2023-04-18)
the code of `readlinkat()` was adjusted to align the `errno` with Linux'
behavior.

To accommodate for that, the `gen_full_path_at()` function was modified,
and the caller was adjusted to expect either `ENOENT` or `ENOTDIR` in
the case of an empty `pathname`, not just `ENOENT`.

However, `readlinkat()` is not the only caller of that helper function.

And while most other callers simply propagate the `errno` produced by
`gen_full_path_at()`, two other callers also want to special-case empty
`pathnames` much like `readlinkat()`: `fchmodat()` and `fstatat()`.

Therefore, these two callers need to be changed to expect `ENOTDIR` in
case of an empty `pathname`, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho requested a review from vdye June 27, 2023 20:37
@dscho dscho self-assigned this Jun 27, 2023
@dscho
Copy link
Member Author

dscho commented Jun 27, 2023

I also sent this patch to the Cygwin project, let's see what they say: https://inbox.sourceware.org/cygwin-patches/c985ab15b28da4fe6f28da4e20236bc0feb484bd.1687898935.git.johannes.schindelin@gmx.de/T/#u.

@dscho dscho marked this pull request as ready for review June 27, 2023 20:53
@@ -4625,7 +4625,7 @@ fstatat (int dirfd, const char *__restrict pathname, struct stat *__restrict st,
int res = gen_full_path_at (path, dirfd, pathname);
if (res)
{
if (!(errno == ENOENT && (flags & AT_EMPTY_PATH)))
if (!((errno == ENOENT || errno == ENOTDIR) && (flags & AT_EMPTY_PATH)))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per the fstatat() man page, the error returned by the function if pathname is empty should be ENOENT:

ENOENT A component of path does not name an existing file or path is an empty string.

fchownat() is similarly documented:

ENOENT A component of path does not name an existing file or path is an empty string.

So, I think it's incorrect that gen_full_path_at() is returning ENOTDIR for an empty pathname, and the fix would be to instead check for an empty pathname right at the start of gen_full_path_at(), then return ENOENT accordingly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but there's also this:

AT_EMPTY_PATH (since Linux 2.6.39):
If pathname is an empty string, operate on the file referred to by dirfd (which may have been obtained using the open(2) O_PATH flag). In this case, dirfd can refer to any type of file, not just a directory, and the behavior of fstatat() is similar to that of fstat(). If dirfd is AT_FDCWD, the call operates on the current working directory. This flag is Linux-specific; define _GNU_SOURCE to obtain its definition.

And indeed, the following program will not fail when run in a git/git checkout but print "Got mode 0120777" instead:

#define _GNU_SOURCE 1
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include <stdio.h>

int main(int argc, char **argv)
{
	int fd = open("RelNotes", O_PATH | O_NOFOLLOW);
	struct stat st;

	if (fd < 0)
		fprintf(stderr, "Could not open `RelNotes`\n");
	else if (fstatat(fd, "", &st, AT_EMPTY_PATH) < 0)
		fprintf(stderr, "fstatat() failed with %d (%s)\n", (int)errno, strerror(errno));
	else
		printf("Got mode 0%o\n", (int)st.st_mode);
	
	return 0;
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right - in that case, your fix behaves correctly. My point is that, if AT_EMPTY_PATH is not set, fstatat() is currently behaving incorrectly (since it returns ENOTDIR, rather than ENOENT, for an empty path).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if AT_EMPTY_PATH is not set, fstatat() is currently behaving incorrectly (since it returns ENOTDIR, rather than ENOENT, for an empty path

I think that is intended, as the dirfd should be used as directory unless AT_EMPTY_PATH is specified, if I read the spec correctly.

Copy link

@vdye vdye Jul 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT, this issue is that there's no commentary in the documentation indicating what happens when both "path is empty" (where you'd return ENOENT) and "dirfd doesn't point to a directory" (where you'd return ENOTDIR) are satisfied, assuming no AT_EMPTY_PATH. Either could feasibly be correct, but the upstream change you're responding to (possibly inadvertently) changed the behavior from the former to the latter.

The reason I lean towards "ENOENT when path is empty" superceding "ENOTDIR when dirfd is not a directory" is because returning ENOENT aligns with MSYS2's historical behavior before the recent upstream change and having gen_full_path_at() return ENOENT for an empty path avoids the need to change its callers as you have here.

One way to gain more confidence in one way or the other might be to reference other implementations of fstatat(). I'm not sure where to look, though; I think I found an example in the Linux kernel that checks filename before it does anything with dfd (returning ENOENT if filename is empty & AT_EMPTY_PATH is unset), but I also don't know enough about the Linux kernel to say whether that's a relevant comparison.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hear you, and agree with you that it would probably make more sense to return ENOENT in that case. I'm just out of energy to fight that particular battle on the Cygwin-patches mailing list...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dscho
Copy link
Member Author

dscho commented Jul 5, 2023

I'll go ahead and merge this so that we have a fixed MSYS2 runtime. In the unlikely event that Cygwin decides to go with a different solution, we can always adapt.

@dscho dscho merged commit ea78182 into git-for-windows:main Jul 5, 2023
@dscho dscho deleted the fix-tar-xf-with-symlinks branch July 5, 2023 11:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants