etcd: renameat syscall returns EACCES #2243

Closed
philips opened this issue Jun 22, 2017 · 12 comments

Comments

@philips

philips commented Jun 22, 2017

  • Your Windows build number: Microsoft Windows [Version 10.0.15063]

  • What you're doing and what's happening: (Copy&paste specific commands and their output, or include screen shots)

etcd is a distributed database that is written in Go and used in a large number of different systems.

wget https://github.com/coreos/etcd/releases/download/v3.2.0/etcd-v3.2.0-linux-amd64.tar.gz
tar xzvf etcd-v3.2.0-linux-amd64.tar.gz
./etcd-v3.2.0-linux-amd64/etcd
2017-06-21 23:57:07.224501 C | etcdserver: create wal error: rename default.etcd/member/wal.tmp default.etcd/member/wal: permission denied
  • What's wrong / what should be happening instead:

etcd should run.

  • Strace of the failing command, if applicable:
unlinkat(AT_FDCWD, "default.etcd/member/wal", 0) = -1 ENOENT (No such file or directory)
unlinkat(AT_FDCWD, "default.etcd/member/wal", AT_REMOVEDIR) = -1 ENOENT (No such file or directory)
lstat("default.etcd/member/wal", 0xc4202449f8) = -1 ENOENT (No such file or directory)
renameat(AT_FDCWD, "default.etcd/member/wal.tmp", AT_FDCWD, "default.etcd/member/wal") = -1 EACCES (Permission denied)


@fpqc

fpqc commented Jun 22, 2017

You're doing this as root? I mean, one way this might happen on ordinary Linux is if you don't chown the extracted contents of the tarball after extracting as root.

@philips
Author

philips commented Jun 22, 2017

@fpqc these directories are net new and owned by the right user. This binary works fine on Linux.

@fpqc

fpqc commented Jun 22, 2017

@philips Wow, that's weird. Was able to reproduce just now. Good catch!

@therealkenc
Collaborator

therealkenc commented Jun 22, 2017

Need to see more of the strace leading up to the cited unlinkat, but this is probably a manifestation of the same problem as #1529. The EACCES isn't related to permissions; root won't save you.

@therealkenc
Collaborator

therealkenc commented Jun 22, 2017

Okay, so it's probably a stretch to lump this into #1529, although pinned handles are still the root cause.

openat(AT_FDCWD, "default.etcd/member/wal.tmp/0000000000000000-0000000000000000.wal", 
    O_WRONLY|O_CREAT|O_CLOEXEC, 0600) = 7
flock(7, LOCK_EX)                       = 0
[...]
unlinkat(AT_FDCWD, "default.etcd/member/wal", 0) = -1 ENOENT (No such file or directory)
unlinkat(AT_FDCWD, "default.etcd/member/wal", AT_REMOVEDIR) = -1 
    ENOENT (No such file or directory)
lstat("default.etcd/member/wal", 0xc42021f898) = -1 ENOENT (No such file or directory)
renameat(AT_FDCWD, "default.etcd/member/wal.tmp", AT_FDCWD, "default.etcd/member/wal") = -1
    EACCES (Permission denied)

File descriptor 7 is still open, pinning default.etcd/member/wal.tmp, because NTFS is like that.

+1 for linking an strace gist (call it a rare occurrence around here).
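
The sequence in the trace above can be approximated outside etcd. The following is a minimal sketch, not etcd's code: the helper name `rename_dir_with_open_child` and the temporary base directory are assumptions made for illustration. It opens and flocks a file inside `wal.tmp`, then renames the directory while the descriptor is still open. On native Linux the rename is expected to succeed; the failure reported in this issue would show up as `EACCES`.

```python
import errno
import fcntl
import os
import tempfile


def rename_dir_with_open_child(base):
    """Open and flock a file in wal.tmp, then rename the directory.

    Returns None if the rename succeeded, else the errno name (e.g. "EACCES").
    """
    tmp_dir = os.path.join(base, "wal.tmp")
    final_dir = os.path.join(base, "wal")
    os.makedirs(tmp_dir)

    # Same flags and mode as the openat() call in the trace.
    fd = os.open(
        os.path.join(tmp_dir, "0000000000000000-0000000000000000.wal"),
        os.O_WRONLY | os.O_CREAT | os.O_CLOEXEC,
        0o600,
    )
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)
        try:
            # The descriptor is still open here, as fd 7 is in the trace.
            os.rename(tmp_dir, final_dir)
        except OSError as exc:
            return errno.errorcode.get(exc.errno, str(exc.errno))
        return None
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        result = rename_dir_with_open_child(base)
        print("rename succeeded" if result is None else f"rename failed: {result}")
```

Dropping the `flock()` call from the sketch leaves a reproduction that depends only on the open descriptor, which is a quick way to check which of the two calls matters on a given build.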

@philips
Author

philips commented Jun 22, 2017

:(

Any ETA on a fix? Any potential workarounds?

@fpqc

fpqc commented Jun 22, 2017

@philips We don't usually get ETAs, but if a dev comments on your post, he'll either say something like "we're gonna put it on the backlog" or "I'm going to do an investigation and see what's going on here". If it's the former, it could take a while. If it's the latter, and if the investigation shows that it's a simple bug, then it could potentially make it out in the next insider build. Gonna have to wait for a comment from one of the developers.

@thorsteneb

Tested on build 16257; the issue is present there.

@redbaron

redbaron commented Oct 8, 2017

Still present in Windows 16299.15 :(

@SvenGroot, @benhillis, last time you were very helpful in fixing the getdents64 problem. Is there any chance this renameat + lock problem can at least be recognized and marked as a bug?

I guess there are quite a few datastore engines which do renaming of flocked files, etcd being one of them.

@therealkenc
Collaborator

"I guess there are quite a few datastore engines which do renaming of flocked files"

Fundamentally it isn't the flock(). It's the open().

Ben by chance commented on this a few days ago in #2477 (message) (he meant #640):

"I tracked this down to a directory rename failing due to something that has a handle open to one of the directory's children. Unfortunately this is an NTFS limitation that I can't think of a workaround for without NTFS supporting posix-style rename. I know they were looking at that at some point, but I'm not sure of the status."

Which is all a variation on #1529. It is not in the same class as the getdents64() problem. I think it can theoretically be fixed without the NTFS people's help with a really big hammer. But not worth the effort. Better to get the underlying support on the NT side.
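
Since the open handle is what blocks the rename, one application-side workaround would be to close the descriptor, rename, and then reopen and re-lock the file under its new path. The sketch below is an illustration under that assumption only; it is not something proposed by the WSL developers in this thread, and the helper name `rename_dir_releasing_child` and its parameters are hypothetical.

```python
import errno
import fcntl
import os


def rename_dir_releasing_child(tmp_dir, final_dir, fd, child_name):
    """Rename tmp_dir to final_dir; on EACCES, close fd, retry, and reopen.

    Returns a descriptor for the child file, locked with flock. If the
    fallback rename also fails, the original fd has already been closed.
    """
    try:
        os.rename(tmp_dir, final_dir)
        return fd  # Rename worked with the descriptor still open.
    except OSError as exc:
        if exc.errno != errno.EACCES:
            raise

    # Fallback: drop the handle that pins the directory, then retry.
    # Closing the descriptor also releases the flock, so there is a window
    # in which another process could take the lock.
    os.close(fd)
    os.rename(tmp_dir, final_dir)
    new_fd = os.open(os.path.join(final_dir, child_name), os.O_WRONLY | os.O_CLOEXEC)
    fcntl.flock(new_fd, fcntl.LOCK_EX)
    return new_fd
```

The cost of this approach is the unlocked window between `os.close()` and the second `flock()`, so it only suits callers that can tolerate briefly giving up the lock. A genuine permissions error also raises `EACCES`; in that case the retry fails the same way and the error propagates.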

@redbaron

Which issue is this one a duplicate of?
