Merge tag 'fs.idmapped.v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull fs idmapping updates from Christian Brauner:
 "This contains the work to enable the idmapping infrastructure to
  support idmapped mounts of filesystems mounted with an idmapping.

  In addition this contains various cleanups that avoid repeated
  open-coding of the same functionality and simplify the code in quite a
  few places.

  We also finish the renaming of the mapping helpers we started a few
  kernel releases back and move them to a dedicated header to not
  continue polluting the fs header needlessly with low-level idmapping
  helpers. With this series the fs header only contains idmapping
  helpers that interact with fs objects.

  Currently we only support idmapped mounts for filesystems mounted
  without an idmapping themselves. This was a conscious decision
  mentioned in multiple places (cf. [1]).

  As explained at length in [3] it is perfectly fine to extend support
  for idmapped mounts to filesystems mounted with an idmapping should
  the need arise. The need has been there for some time now (cf. [2]).

  Before we can port any filesystem that is mountable with an idmapping
  to support idmapped mounts in the coming cycles, we need to first
  extend the mapping helpers to account for the filesystem's idmapping.
  This, again, is explained at length in our documentation at [3] and
  also in the individual commit messages, so here's an overview.

  Currently, the low-level mapping helpers implement the remapping
  algorithms described in [3] in a simplified manner as we could rely on
  the fact that all filesystems supporting idmapped mounts are mounted
  without an idmapping.

  In contrast, filesystems mounted with an idmapping are very likely to
  not use an identity mapping and will instead use a non-identity
  mapping. So the translation step from or into the filesystem's
  idmapping in the remapping algorithm cannot be skipped for such
  filesystems.

  Non-idmapped filesystems and filesystems not supporting idmapped
  mounts are unaffected by this change as the remapping algorithms can
  take the same shortcut as before. If the low-level helpers detect that
  they are dealing with an idmapped mount but the underlying filesystem
  is mounted without an idmapping we can rely on the previous shortcut
  and can continue to skip the translation step from or into the
  filesystem's idmapping. And of course, if the low-level helpers detect
  that they are not dealing with an idmapped mount they can simply
  return the relevant id unchanged; no remapping needs to be performed
  at all.

  These checks guarantee that only the minimal amount of work is
  performed. As before, if idmapped mounts aren't used the low-level
  helpers are idempotent and no work is performed at all."

Link: 2ca4dcc ("fs/mount_setattr: tighten permission checks") [1]
Link: containers/podman#10374 [2]
Link: Documentation/filesystems/idmappings.rst [3]
Link: a65e58e ("fs: document and rename fsid helpers") [4]
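
The decision logic described in the message above can be sketched in userspace C. This is only an illustrative model under stated assumptions, not the kernel's implementation: `struct idmap`, `map_up()`, `map_down()`, `not_idmapped()`, `model_kid_fs_to_mnt()` and `model_kid_user_to_fs()` are hypothetical names, ids are plain `uint32_t` values, and every idmapping is assumed to consist of a single contiguous extent. It shows the three cases the message names: no remapping on a non-idmapped mount, the shortcut when the filesystem uses the initial idmapping, and the full translation otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_ID UINT32_MAX /* plays the role of an invalid kernel id */

/* Hypothetical toy idmapping: one extent of userspace ids -> kernel ids. */
struct idmap {
	uint32_t first_uid; /* first userspace id in the extent */
	uint32_t first_kid; /* kernel id that first_uid maps to */
	uint32_t count;     /* number of ids in the extent */
	bool initial;       /* true for the initial (identity) idmapping */
};

/* Map a kernel id up into a userspace id (the direction of from_kuid()). */
uint32_t map_up(const struct idmap *m, uint32_t kid)
{
	assert(m != NULL);
	if (kid < m->first_kid || kid - m->first_kid >= m->count)
		return INVALID_ID;
	return m->first_uid + (kid - m->first_kid);
}

/* Map a userspace id down into a kernel id (the direction of make_kuid()). */
uint32_t map_down(const struct idmap *m, uint32_t uid)
{
	assert(m != NULL);
	if (uid < m->first_uid || uid - m->first_uid >= m->count)
		return INVALID_ID;
	return m->first_kid + (uid - m->first_uid);
}

/* Assumption: a mount is not idmapped if it carries the initial idmapping
 * or the same idmapping the filesystem was mounted with. */
bool not_idmapped(const struct idmap *mnt, const struct idmap *fs)
{
	return mnt->initial || mnt == fs;
}

/* Filesystem kernel id -> mount kernel id. */
uint32_t model_kid_fs_to_mnt(const struct idmap *mnt, const struct idmap *fs,
			     uint32_t kid)
{
	uint32_t uid;

	if (not_idmapped(mnt, fs))
		return kid; /* no remapping at all */
	/* Shortcut: the initial idmapping is the identity, skip the lookup. */
	uid = fs->initial ? kid : map_up(fs, kid);
	if (uid == INVALID_ID)
		return INVALID_ID;
	return map_down(mnt, uid);
}

/* Caller kernel id -> filesystem kernel id, via the mount's idmapping. */
uint32_t model_kid_user_to_fs(const struct idmap *mnt, const struct idmap *fs,
			      uint32_t kid)
{
	uint32_t uid;

	if (not_idmapped(mnt, fs))
		return kid;
	uid = map_up(mnt, kid);
	if (uid == INVALID_ID)
		return INVALID_ID;
	/* Same shortcut in the other direction. */
	return fs->initial ? uid : map_down(fs, uid);
}
```

With a filesystem extent of u0:k10000:r1000 and a mount extent of u0:k20000:r1000, the model maps filesystem kernel id 10005 to mount kernel id 20005 and back again. A kernel id outside the filesystem's extent yields the invalid id.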

* tag 'fs.idmapped.v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  fs: support mapped mounts of mapped filesystems
  fs: add i_user_ns() helper
  fs: port higher-level mapping helpers
  fs: remove unused low-level mapping helpers
  fs: use low-level mapping helpers
  docs: update mapping documentation
  fs: account for filesystem mappings
  fs: tweak fsuidgid_has_mapping()
  fs: move mapping helpers
  fs: add is_idmapped_mnt() helper
torvalds committed Jan 11, 2022
2 parents 84bfcc0 + bd30336 commit 5dfbfe7
Showing 17 changed files with 356 additions and 231 deletions.
72 changes: 0 additions & 72 deletions Documentation/filesystems/idmappings.rst
@@ -952,75 +952,3 @@ The raw userspace id that is put on disk is ``u1000`` so when the user takes
their home directory back to their home computer where they are assigned
``u1000`` using the initial idmapping and mount the filesystem with the initial
idmapping they will see all those files owned by ``u1000``.

- Shortcircuting
- --------------
-
- Currently, the implementation of idmapped mounts enforces that the filesystem
- is mounted with the initial idmapping. The reason is simply that none of the
- filesystems that we targeted were mountable with a non-initial idmapping. But
- that might change soon enough. As we've seen above, thanks to the properties of
- idmappings the translation works for both filesystems mounted with the initial
- idmapping and filesystem with non-initial idmappings.
-
- Based on this current restriction to filesystem mounted with the initial
- idmapping two noticeable shortcuts have been taken:
-
- 1. We always stash a reference to the initial user namespace in ``struct
- vfsmount``. Idmapped mounts are thus mounts that have a non-initial user
- namespace attached to them.
-
- In order to support idmapped mounts this needs to be changed. Instead of
- stashing the initial user namespace the user namespace the filesystem was
- mounted with must be stashed. An idmapped mount is then any mount that has
- a different user namespace attached then the filesystem was mounted with.
- This has no user-visible consequences.
-
- 2. The translation algorithms in ``mapped_fs*id()`` and ``i_*id_into_mnt()``
- are simplified.
-
- Let's consider ``mapped_fs*id()`` first. This function translates the
- caller's kernel id into a kernel id in the filesystem's idmapping via
- a mount's idmapping. The full algorithm is::
-
- mapped_fsuid(kid):
- /* Map the kernel id up into a userspace id in the mount's idmapping. */
- from_kuid(mount-idmapping, kid) = uid
-
- /* Map the userspace id down into a kernel id in the filesystem's idmapping. */
- make_kuid(filesystem-idmapping, uid) = kuid
-
- We know that the filesystem is always mounted with the initial idmapping as
- we enforce this in ``mount_setattr()``. So this can be shortened to::
-
- mapped_fsuid(kid):
- /* Map the kernel id up into a userspace id in the mount's idmapping. */
- from_kuid(mount-idmapping, kid) = uid
-
- /* Map the userspace id down into a kernel id in the filesystem's idmapping. */
- KUIDT_INIT(uid) = kuid
-
- Similarly, for ``i_*id_into_mnt()`` which translated the filesystem's kernel
- id into a mount's kernel id::
-
- i_uid_into_mnt(kid):
- /* Map the kernel id up into a userspace id in the filesystem's idmapping. */
- from_kuid(filesystem-idmapping, kid) = uid
-
- /* Map the userspace id down into a kernel id in the mounts's idmapping. */
- make_kuid(mount-idmapping, uid) = kuid
-
- Again, we know that the filesystem is always mounted with the initial
- idmapping as we enforce this in ``mount_setattr()``. So this can be
- shortened to::
-
- i_uid_into_mnt(kid):
- /* Map the kernel id up into a userspace id in the filesystem's idmapping. */
- __kuid_val(kid) = uid
-
- /* Map the userspace id down into a kernel id in the mounts's idmapping. */
- make_kuid(mount-idmapping, uid) = kuid
-
- Handling filesystems mounted with non-initial idmappings requires that the
- translation functions be converted to their full form. They can still be
- shortcircuited on non-idmapped mounts. This has no user-visible consequences.
2 changes: 1 addition & 1 deletion fs/cachefiles/bind.c
@@ -117,7 +117,7 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
root = path.dentry;

ret = -EINVAL;
- if (mnt_user_ns(path.mnt) != &init_user_ns) {
+ if (is_idmapped_mnt(path.mnt)) {
pr_warn("File cache on idmapped mounts not supported");
goto error_unsupported;
}
2 changes: 1 addition & 1 deletion fs/ecryptfs/main.c
@@ -537,7 +537,7 @@ static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags
goto out_free;
}

- if (mnt_user_ns(path.mnt) != &init_user_ns) {
+ if (is_idmapped_mnt(path.mnt)) {
rc = -EINVAL;
printk(KERN_ERR "Mounting on idmapped mounts currently disallowed\n");
goto out_free;
19 changes: 3 additions & 16 deletions fs/ksmbd/smbacl.c
@@ -9,6 +9,7 @@
#include <linux/fs.h>
#include <linux/slab.h>
#include <linux/string.h>
+ #include <linux/mnt_idmapping.h>

#include "smbacl.h"
#include "smb_common.h"
@@ -274,14 +275,7 @@ static int sid_to_id(struct user_namespace *user_ns,
uid_t id;

id = le32_to_cpu(psid->sub_auth[psid->num_subauth - 1]);
- /*
- * Translate raw sid into kuid in the server's user
- * namespace.
- */
- uid = make_kuid(&init_user_ns, id);
-
- /* If this is an idmapped mount, apply the idmapping. */
- uid = kuid_from_mnt(user_ns, uid);
+ uid = mapped_kuid_user(user_ns, &init_user_ns, KUIDT_INIT(id));
if (uid_valid(uid)) {
fattr->cf_uid = uid;
rc = 0;
@@ -291,14 +285,7 @@ static int sid_to_id(struct user_namespace *user_ns,
gid_t id;

id = le32_to_cpu(psid->sub_auth[psid->num_subauth - 1]);
- /*
- * Translate raw sid into kgid in the server's user
- * namespace.
- */
- gid = make_kgid(&init_user_ns, id);
-
- /* If this is an idmapped mount, apply the idmapping. */
- gid = kgid_from_mnt(user_ns, gid);
+ gid = mapped_kgid_user(user_ns, &init_user_ns, KGIDT_INIT(id));
if (gid_valid(gid)) {
fattr->cf_gid = gid;
rc = 0;
5 changes: 3 additions & 2 deletions fs/ksmbd/smbacl.h
@@ -11,6 +11,7 @@
#include <linux/fs.h>
#include <linux/namei.h>
#include <linux/posix_acl.h>
+ #include <linux/mnt_idmapping.h>

#include "mgmt/tree_connect.h"

@@ -216,7 +217,7 @@ static inline uid_t posix_acl_uid_translate(struct user_namespace *mnt_userns,
kuid_t kuid;

/* If this is an idmapped mount, apply the idmapping. */
- kuid = kuid_into_mnt(mnt_userns, pace->e_uid);
+ kuid = mapped_kuid_fs(mnt_userns, &init_user_ns, pace->e_uid);

/* Translate the kuid into a userspace id ksmbd would see. */
return from_kuid(&init_user_ns, kuid);
@@ -228,7 +229,7 @@ static inline gid_t posix_acl_gid_translate(struct user_namespace *mnt_userns,
kgid_t kgid;

/* If this is an idmapped mount, apply the idmapping. */
- kgid = kgid_into_mnt(mnt_userns, pace->e_gid);
+ kgid = mapped_kgid_fs(mnt_userns, &init_user_ns, pace->e_gid);

/* Translate the kgid into a userspace id ksmbd would see. */
return from_kgid(&init_user_ns, kgid);
53 changes: 39 additions & 14 deletions fs/namespace.c
@@ -31,6 +31,7 @@
#include <uapi/linux/mount.h>
#include <linux/fs_context.h>
#include <linux/shmem_fs.h>
+ #include <linux/mnt_idmapping.h>

#include "pnode.h"
#include "internal.h"
@@ -561,7 +562,7 @@ static void free_vfsmnt(struct mount *mnt)
struct user_namespace *mnt_userns;

mnt_userns = mnt_user_ns(&mnt->mnt);
- if (mnt_userns != &init_user_ns)
+ if (!initial_idmapping(mnt_userns))
put_user_ns(mnt_userns);
kfree_const(mnt->mnt_devname);
#ifdef CONFIG_SMP
@@ -965,6 +966,7 @@ static struct mount *skip_mnt_tree(struct mount *p)
struct vfsmount *vfs_create_mount(struct fs_context *fc)
{
struct mount *mnt;
+ struct user_namespace *fs_userns;

if (!fc->root)
return ERR_PTR(-EINVAL);
@@ -982,6 +984,10 @@ struct vfsmount *vfs_create_mount(struct fs_context *fc)
mnt->mnt_mountpoint = mnt->mnt.mnt_root;
mnt->mnt_parent = mnt;

+ fs_userns = mnt->mnt.mnt_sb->s_user_ns;
+ if (!initial_idmapping(fs_userns))
+ mnt->mnt.mnt_userns = get_user_ns(fs_userns);
+
lock_mount_hash();
list_add_tail(&mnt->mnt_instance, &mnt->mnt.mnt_sb->s_mounts);
unlock_mount_hash();
@@ -1072,7 +1078,7 @@ static struct mount *clone_mnt(struct mount *old, struct dentry *root,

atomic_inc(&sb->s_active);
mnt->mnt.mnt_userns = mnt_user_ns(&old->mnt);
- if (mnt->mnt.mnt_userns != &init_user_ns)
+ if (!initial_idmapping(mnt->mnt.mnt_userns))
mnt->mnt.mnt_userns = get_user_ns(mnt->mnt.mnt_userns);
mnt->mnt.mnt_sb = sb;
mnt->mnt.mnt_root = dget(root);
@@ -3927,28 +3933,32 @@ static unsigned int recalc_flags(struct mount_kattr *kattr, struct mount *mnt)
static int can_idmap_mount(const struct mount_kattr *kattr, struct mount *mnt)
{
struct vfsmount *m = &mnt->mnt;
+ struct user_namespace *fs_userns = m->mnt_sb->s_user_ns;

if (!kattr->mnt_userns)
return 0;

+ /*
+ * Creating an idmapped mount with the filesystem wide idmapping
+ * doesn't make sense so block that. We don't allow mushy semantics.
+ */
+ if (kattr->mnt_userns == fs_userns)
+ return -EINVAL;
+
/*
* Once a mount has been idmapped we don't allow it to change its
* mapping. It makes things simpler and callers can just create
* another bind-mount they can idmap if they want to.
*/
- if (mnt_user_ns(m) != &init_user_ns)
+ if (is_idmapped_mnt(m))
return -EPERM;

/* The underlying filesystem doesn't support idmapped mounts yet. */
if (!(m->mnt_sb->s_type->fs_flags & FS_ALLOW_IDMAP))
return -EINVAL;

- /* Don't yet support filesystem mountable in user namespaces. */
- if (m->mnt_sb->s_user_ns != &init_user_ns)
- return -EINVAL;
-
/* We're not controlling the superblock. */
- if (!capable(CAP_SYS_ADMIN))
+ if (!ns_capable(fs_userns, CAP_SYS_ADMIN))
return -EPERM;

/* Mount has already been visible in the filesystem hierarchy. */
@@ -4002,14 +4012,27 @@ static struct mount *mount_setattr_prepare(struct mount_kattr *kattr,

static void do_idmap_mount(const struct mount_kattr *kattr, struct mount *mnt)
{
- struct user_namespace *mnt_userns;
+ struct user_namespace *mnt_userns, *old_mnt_userns;

if (!kattr->mnt_userns)
return;

+ /*
+ * We're the only ones able to change the mount's idmapping. So
+ * mnt->mnt.mnt_userns is stable and we can retrieve it directly.
+ */
+ old_mnt_userns = mnt->mnt.mnt_userns;
+
mnt_userns = get_user_ns(kattr->mnt_userns);
/* Pairs with smp_load_acquire() in mnt_user_ns(). */
smp_store_release(&mnt->mnt.mnt_userns, mnt_userns);

+ /*
+ * If this is an idmapped filesystem drop the reference we've taken
+ * in vfs_create_mount() before.
+ */
+ if (!initial_idmapping(old_mnt_userns))
+ put_user_ns(old_mnt_userns);
}

static void mount_setattr_commit(struct mount_kattr *kattr,
@@ -4133,13 +4156,15 @@ static int build_mount_idmapped(const struct mount_attr *attr, size_t usize,
}

/*
- * The init_user_ns is used to indicate that a vfsmount is not idmapped.
- * This is simpler than just having to treat NULL as unmapped. Users
- * wanting to idmap a mount to init_user_ns can just use a namespace
- * with an identity mapping.
+ * The initial idmapping cannot be used to create an idmapped
+ * mount. We use the initial idmapping as an indicator of a mount
+ * that is not idmapped. It can simply be passed into helpers that
+ * are aware of idmapped mounts as a convenient shortcut. A user
+ * can just create a dedicated identity mapping to achieve the same
+ * result.
*/
mnt_userns = container_of(ns, struct user_namespace, ns);
- if (mnt_userns == &init_user_ns) {
+ if (initial_idmapping(mnt_userns)) {
err = -EPERM;
goto out_fput;
}
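
The fs/namespace.c changes above can be followed more easily with a small userspace model of the mount lifecycle. This is a sketch under stated assumptions, not kernel code: all `model_*` names and the simplified structs are hypothetical, reference counting is a plain integer, and the capability and visibility checks of `can_idmap_mount()` are left out. The body of `is_idmapped_mnt()` is not shown in this part of the diff, so the comparison used here is inferred from the removed documentation, which says an idmapped mount is any mount whose user namespace differs from the one the filesystem was mounted with.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct model_userns { int refcount; };
struct model_sb { struct model_userns *s_user_ns; bool allow_idmap; };
struct model_mount { struct model_sb *sb; struct model_userns *mnt_userns; };

struct model_userns model_init_userns = { 1 };

bool model_initial_idmapping(const struct model_userns *ns)
{
	return ns == &model_init_userns;
}

/* New mounts start out with the filesystem's idmapping; a reference is
 * only taken when that idmapping is not the initial one. */
void model_create_mount(struct model_mount *mnt, struct model_sb *sb)
{
	assert(mnt != NULL && sb != NULL);
	mnt->sb = sb;
	mnt->mnt_userns = &model_init_userns;
	if (!model_initial_idmapping(sb->s_user_ns)) {
		sb->s_user_ns->refcount++;
		mnt->mnt_userns = sb->s_user_ns;
	}
}

/* Assumption: idmapped means "differs from the filesystem's idmapping". */
bool model_is_idmapped_mnt(const struct model_mount *mnt)
{
	return mnt->mnt_userns != mnt->sb->s_user_ns;
}

/* Subset of the checks in can_idmap_mount(), in the same order. */
int model_can_idmap_mount(const struct model_mount *mnt,
			  const struct model_userns *new_userns)
{
	if (!new_userns)
		return 0;
	if (new_userns == mnt->sb->s_user_ns)
		return -EINVAL; /* same as the filesystem wide idmapping */
	if (model_is_idmapped_mnt(mnt))
		return -EPERM;  /* an idmapped mount cannot be changed */
	if (!mnt->sb->allow_idmap)
		return -EINVAL; /* filesystem lacks idmapped mount support */
	return 0;
}

/* Attach the new idmapping and drop the reference taken at creation. */
void model_do_idmap_mount(struct model_mount *mnt,
			  struct model_userns *new_userns)
{
	struct model_userns *old = mnt->mnt_userns;

	new_userns->refcount++;
	mnt->mnt_userns = new_userns;
	if (!model_initial_idmapping(old))
		old->refcount--;
}
```

In this model a mount of a filesystem with a non-initial idmapping is not considered idmapped until a different user namespace is attached, which matches the removed documentation's description of the intended behaviour.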
2 changes: 1 addition & 1 deletion fs/nfsd/export.c
@@ -427,7 +427,7 @@ static int check_export(struct path *path, int *flags, unsigned char *uuid)
return -EINVAL;
}

- if (mnt_user_ns(path->mnt) != &init_user_ns) {
+ if (is_idmapped_mnt(path->mnt)) {
dprintk("exp_export: export of idmapped mounts not yet supported.\n");
return -EINVAL;
}
8 changes: 5 additions & 3 deletions fs/open.c
@@ -32,6 +32,7 @@
#include <linux/ima.h>
#include <linux/dnotify.h>
#include <linux/compat.h>
+ #include <linux/mnt_idmapping.h>

#include "internal.h"

@@ -640,7 +641,7 @@ SYSCALL_DEFINE2(chmod, const char __user *, filename, umode_t, mode)

int chown_common(const struct path *path, uid_t user, gid_t group)
{
- struct user_namespace *mnt_userns;
+ struct user_namespace *mnt_userns, *fs_userns;
struct inode *inode = path->dentry->d_inode;
struct inode *delegated_inode = NULL;
int error;
@@ -652,8 +653,9 @@ int chown_common(const struct path *path, uid_t user, gid_t group)
gid = make_kgid(current_user_ns(), group);

mnt_userns = mnt_user_ns(path->mnt);
- uid = kuid_from_mnt(mnt_userns, uid);
- gid = kgid_from_mnt(mnt_userns, gid);
+ fs_userns = i_user_ns(inode);
+ uid = mapped_kuid_user(mnt_userns, fs_userns, uid);
+ gid = mapped_kgid_user(mnt_userns, fs_userns, gid);

retry_deleg:
newattrs.ia_valid = ATTR_CTIME;
2 changes: 1 addition & 1 deletion fs/overlayfs/super.c
@@ -873,7 +873,7 @@ static int ovl_mount_dir_noesc(const char *name, struct path *path)
pr_err("filesystem on '%s' not supported\n", name);
goto out_put;
}
- if (mnt_user_ns(path->mnt) != &init_user_ns) {
+ if (is_idmapped_mnt(path->mnt)) {
pr_err("idmapped layers are currently not supported\n");
goto out_put;
}
17 changes: 11 additions & 6 deletions fs/posix_acl.c
@@ -23,6 +23,7 @@
#include <linux/export.h>
#include <linux/user_namespace.h>
#include <linux/namei.h>
+ #include <linux/mnt_idmapping.h>

static struct posix_acl **acl_by_type(struct inode *inode, int type)
{
@@ -374,7 +375,9 @@ posix_acl_permission(struct user_namespace *mnt_userns, struct inode *inode,
goto check_perm;
break;
case ACL_USER:
- uid = kuid_into_mnt(mnt_userns, pa->e_uid);
+ uid = mapped_kuid_fs(mnt_userns,
+ i_user_ns(inode),
+ pa->e_uid);
if (uid_eq(uid, current_fsuid()))
goto mask;
break;
@@ -387,7 +390,9 @@ posix_acl_permission(struct user_namespace *mnt_userns, struct inode *inode,
}
break;
case ACL_GROUP:
- gid = kgid_into_mnt(mnt_userns, pa->e_gid);
+ gid = mapped_kgid_fs(mnt_userns,
+ i_user_ns(inode),
+ pa->e_gid);
if (in_group_p(gid)) {
found = 1;
if ((pa->e_perm & want) == want)
@@ -734,17 +739,17 @@ static void posix_acl_fix_xattr_userns(
case ACL_USER:
uid = make_kuid(from, le32_to_cpu(entry->e_id));
if (from_user)
- uid = kuid_from_mnt(mnt_userns, uid);
+ uid = mapped_kuid_user(mnt_userns, &init_user_ns, uid);
else
- uid = kuid_into_mnt(mnt_userns, uid);
+ uid = mapped_kuid_fs(mnt_userns, &init_user_ns, uid);
entry->e_id = cpu_to_le32(from_kuid(to, uid));
break;
case ACL_GROUP:
gid = make_kgid(from, le32_to_cpu(entry->e_id));
if (from_user)
- gid = kgid_from_mnt(mnt_userns, gid);
+ gid = mapped_kgid_user(mnt_userns, &init_user_ns, gid);
else
- gid = kgid_into_mnt(mnt_userns, gid);
+ gid = mapped_kgid_fs(mnt_userns, &init_user_ns, gid);
entry->e_id = cpu_to_le32(from_kgid(to, gid));
break;
default: