Btrfs: add support for inode properties
This change adds infrastructure to allow for generic properties for
inodes. Properties are name/value pairs that can be associated with
inodes for different purposes. They are stored as xattrs with the
prefix "btrfs."

Properties can be inherited - this means when a directory inode has
inheritable properties set, these are added to new inodes created
under that directory. Further, subvolumes can also have properties
associated with them, and they can be inherited from their parent
subvolume. Naturally, directory properties have priority over subvolume
properties (in practice a subvolume property is just a regular
property associated with the root inode, objectid 256, of the
subvolume's fs tree).
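
One way to read the precedence rule above is sketched below: look the property up on the parent directory first and fall back to the subvolume root only when the directory does not carry it. The names `struct prop`, `struct prop_holder`, `find_prop()` and `effective_prop()` are hypothetical and exist only for this illustration. The change itself copies inheritable properties to new inodes at creation time rather than resolving them lazily like this.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory model of one name/value property. */
struct prop {
	const char *name;
	const char *value;
};

/* Hypothetical holder of properties: a directory or a subvolume root. */
struct prop_holder {
	const struct prop *props;
	size_t nr_props;
};

/* Linear search for a property by name; NULL if absent. */
static const char *find_prop(const struct prop_holder *h, const char *name)
{
	if (!h)
		return NULL;
	for (size_t i = 0; i < h->nr_props; i++)
		if (strcmp(h->props[i].name, name) == 0)
			return h->props[i].value;
	return NULL;
}

/* The directory value wins; the subvolume root is only a fallback. */
const char *effective_prop(const struct prop_holder *dir,
			   const struct prop_holder *subvol_root,
			   const char *name)
{
	const char *v = find_prop(dir, name);

	return v ? v : find_prop(subvol_root, name);
}
```

Calling `effective_prop(&dir, &subvol_root, "btrfs.compression")` returns the directory's value when both holders define the property.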

This change also adds one specific property implementation, named
"compression". It accepts the values "lzo" or "zlib" and is an
inheritable property.
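
As a minimal user-space sketch of what validating such a value could look like, assuming the value arrives as a byte buffer with an explicit length (as xattr values do). The function name `compression_value_validate()` and the macro are illustrative only and are not taken from the patch.

```c
#include <errno.h>
#include <string.h>

/* Full xattr name: the "btrfs." prefix plus the property name. */
#define PROP_COMPRESSION_NAME "btrfs.compression"

/*
 * Returns 0 if the len-byte value (not necessarily NUL-terminated)
 * is one of the two accepted strings, -EINVAL otherwise.
 */
int compression_value_validate(const char *value, size_t len)
{
	if (len == 3 && memcmp(value, "lzo", 3) == 0)
		return 0;
	if (len == 4 && memcmp(value, "zlib", 4) == 0)
		return 0;
	return -EINVAL;
}
```

Comparing the length first avoids accepting prefixes such as "lz" or longer strings such as "zlibX".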

The corresponding changes to btrfs-progs were also implemented.
A patch with xfstests for this feature will follow once there's
agreement on this change/feature.

Further, the script at the bottom of this commit message was used to
do some benchmarks to measure any performance penalties of this feature.

Basically the tests correspond to:

Test 1 - create a filesystem and mount it with compress-force=lzo,
then sequentially create N files of 64 KiB each, measure how long it took
to create the files, unmount the filesystem, mount the filesystem and
perform an 'ls -lha' against the test directory holding the N files, and
report the time the command took.

Test 2 - create a filesystem and don't use any compression option when
mounting it - instead set the compression property of the subvolume's
root to 'lzo'. Then create N files of 64 KiB, and report the time it took.
Then unmount the filesystem, mount it again and perform an 'ls -lha' like
in the former test. This means every single file ends up with a property
(xattr) associated with it.

Test 3 - same as test 2, but uses 4 properties - 3 are duplicates of the
compression property that have no real effect other than adding more work
when inheriting properties and taking more btree leaf space.

Test 4 - same as test 3 but with 10 properties per file.

Results in seconds (averages of 5 runs each) for different values of N
follow.

* Without properties (test 1)

                    file creation time        ls -lha time
10 000 files              3.49                   0.76
100 000 files            47.19                   8.37
1 000 000 files         518.51                 107.06

* With 1 property (compression property set to lzo - test 2)

                    file creation time        ls -lha time
10 000 files              3.63                    0.93
100 000 files            48.56                    9.74
1 000 000 files         537.72                  125.11

* With 4 properties (test 3)

                    file creation time        ls -lha time
10 000 files              3.94                    1.20
100 000 files            52.14                   11.48
1 000 000 files         572.70                  142.13

* With 10 properties (test 4)

                    file creation time        ls -lha time
10 000 files              4.61                    1.35
100 000 files            58.86                   13.83
1 000 000 files         656.01                  177.61
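
To put the tables above in relative terms, the following sketch computes the slowdown of each property count against the no-properties baseline (test 1), using the 1 000 000 files rows. The helper names `overhead_pct()` and `print_overheads()` are made up for this example.

```c
#include <stdio.h>

/* Relative slowdown of `with` versus `base`, in percent. */
double overhead_pct(double base, double with)
{
	return (with - base) / base * 100.0;
}

/* Prints the 1 000 000-file overheads relative to test 1. */
void print_overheads(void)
{
	/* Index 0 is the baseline; values copied from the tables above. */
	static const double create[] = { 518.51, 537.72, 572.70, 656.01 };
	static const double ls[]     = { 107.06, 125.11, 142.13, 177.61 };
	static const int props[]     = { 0, 1, 4, 10 };

	for (int i = 1; i < 4; i++)
		printf("%2d props: create +%.1f%%, ls +%.1f%%\n", props[i],
		       overhead_pct(create[0], create[i]),
		       overhead_pct(ls[0], ls[i]));
}
```

For these rows the file creation overhead comes out at roughly 4%, 10% and 27% for 1, 4 and 10 properties, while 'ls -lha' is affected more strongly (roughly 17%, 33% and 66%).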

The increased latencies with properties are essentially because of:

*) When creating an inode, we now synchronously write 1 more item
   (an xattr item) for each property inherited from the parent dir
   (or subvolume). This could be done asynchronously, as we already
   do for dir index items (delayed-inode.c), which could help
   reduce the file creation latency;

*) With properties, we now have larger fs trees. For this particular
   test each xattr item uses 75 bytes of leaf space in the fs tree.
   This could be reduced by using a new item type for xattr items,
   instead of the current btrfs_dir_item, since we could cut the
   'location' and 'type' fields (saving 18 bytes) and maybe 'transid'
   too (saving a total of 26 bytes per xattr item).
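
The 75, 18 and 26 byte figures can be reproduced with the sketch below. It assumes packed layouts that mirror btrfs_disk_key, btrfs_item and btrfs_dir_item as defined in ctree.h at the time; that is an assumption, since the patch does not show those structures. The struct and function names are local stand-ins, and `__attribute__((packed))` is a GCC/Clang extension.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for btrfs_disk_key: 8 + 1 + 8 = 17 bytes. */
struct disk_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
} __attribute__((packed));

/* Stand-in for btrfs_item (the per-item leaf header): 17 + 4 + 4 = 25 bytes. */
struct item_header {
	struct disk_key key;
	uint32_t offset;
	uint32_t size;
} __attribute__((packed));

/* Stand-in for btrfs_dir_item: 17 + 8 + 2 + 2 + 1 = 30 bytes. */
struct dir_item {
	struct disk_key location;
	uint64_t transid;
	uint16_t data_len;
	uint16_t name_len;
	uint8_t type;
} __attribute__((packed));

/* Leaf space used by one xattr item with the given name and value sizes. */
size_t xattr_leaf_bytes(size_t name_len, size_t value_len)
{
	return sizeof(struct item_header) + sizeof(struct dir_item) +
	       name_len + value_len;
}
```

With the 17-byte name "btrfs.compression" and the 3-byte value "lzo", `xattr_leaf_bytes(17, 3)` gives 25 + 30 + 17 + 3 = 75. Dropping 'location' (17) and 'type' (1) saves 18 bytes, and dropping 'transid' (8) as well saves 26.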

Also tried batching the xattr insertions (ignoring proper hash
collision handling, since it didn't exist) when creating files that
inherit properties from their parent inode/subvolume, but the end
results were (surprisingly) essentially the same.

Test script:

$ cat test.pl
  #!/usr/bin/perl -w

  use strict;
  use Time::HiRes qw(time);
  use IO::Handle;  # makes $f->autoflush() available on older perls too
  use constant NUM_FILES => 10_000;
  use constant FILE_SIZES => (64 * 1024);
  use constant DEV => '/dev/sdb4';
  use constant MNT_POINT => '/home/fdmanana/btrfs-tests/dev';
  use constant TEST_DIR => (MNT_POINT . '/testdir');

  system("mkfs.btrfs", "-l", "16384", "-f", DEV) == 0 or die "mkfs.btrfs failed!";

  # following line for testing without properties
  #system("mount", "-o", "compress-force=lzo", DEV, MNT_POINT) == 0 or die "mount failed!";

  # following 2 lines for testing with properties
  system("mount", DEV, MNT_POINT) == 0 or die "mount failed!";
  system("btrfs", "prop", "set", MNT_POINT, "compression", "lzo") == 0 or die "set prop failed!";

  system("mkdir", TEST_DIR) == 0 or die "mkdir failed!";
  my ($t1, $t2);

  $t1 = time();
  for (my $i = 1; $i <= NUM_FILES; $i++) {
      my $p = TEST_DIR . '/file_' . $i;
      open(my $f, '>', $p) or die "Error opening file!";
      $f->autoflush(1);
      for (my $j = 0; $j < FILE_SIZES; $j += 4096) {
          print $f ('A' x 4096) or die "Error writing to file!";
      }
      close($f);
  }
  $t2 = time();
  print "Time to create " . NUM_FILES . ": " . ($t2 - $t1) . " seconds.\n";
  system("umount", DEV) == 0 or die "umount failed!";
  system("mount", DEV, MNT_POINT) == 0 or die "mount failed!";

  $t1 = time();
  system("bash -c 'ls -lha " . TEST_DIR . " > /dev/null'") == 0 or die "ls failed!";
  $t2 = time();
  print "Time to ls -lha all files: " . ($t2 - $t1) . " seconds.\n";
  system("umount", DEV) == 0 or die "umount failed!";

Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
fdmanana authored and masoncl committed Jan 28, 2014
1 parent 1acae57 commit 6354192
Showing 10 changed files with 545 additions and 10 deletions.
2 changes: 1 addition & 1 deletion fs/btrfs/Makefile
@@ -9,7 +9,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
 	   export.o tree-log.o free-space-cache.o zlib.o lzo.o \
 	   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
 	   reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
-	   uuid-tree.o
+	   uuid-tree.o props.o

 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
1 change: 1 addition & 0 deletions fs/btrfs/btrfs_inode.h
@@ -43,6 +43,7 @@
 #define BTRFS_INODE_COPY_EVERYTHING 8
 #define BTRFS_INODE_IN_DELALLOC_LIST 9
 #define BTRFS_INODE_READDIO_NEED_LOCK 10
+#define BTRFS_INODE_HAS_PROPS 11

 /* in memory btrfs inode */
 struct btrfs_inode {
4 changes: 3 additions & 1 deletion fs/btrfs/ctree.h
@@ -3703,7 +3703,9 @@ int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, int delay_iput);
 int btrfs_set_extent_delalloc(struct inode *inode, u64 start, u64 end,
 			      struct extent_state **cached_state);
 int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
-			     struct btrfs_root *new_root, u64 new_dirid);
+			     struct btrfs_root *new_root,
+			     struct btrfs_root *parent_root,
+			     u64 new_dirid);
 int btrfs_merge_bio_hook(int rw, struct page *page, unsigned long offset,
 			 size_t size, struct bio *bio,
 			 unsigned long bio_flags);
42 changes: 36 additions & 6 deletions fs/btrfs/inode.c
@@ -58,6 +58,7 @@
 #include "inode-map.h"
 #include "backref.h"
 #include "hash.h"
+#include "props.h"

 struct btrfs_iget_args {
 	u64 ino;
@@ -3265,7 +3266,8 @@ int btrfs_orphan_cleanup(struct btrfs_root *root)
  * slot is the slot the inode is in, objectid is the objectid of the inode
  */
 static noinline int acls_after_inode_item(struct extent_buffer *leaf,
-					  int slot, u64 objectid)
+					  int slot, u64 objectid,
+					  int *first_xattr_slot)
 {
 	u32 nritems = btrfs_header_nritems(leaf);
 	struct btrfs_key found_key;
@@ -3281,6 +3283,7 @@ static noinline int acls_after_inode_item(struct extent_buffer *leaf,
 	}

 	slot++;
+	*first_xattr_slot = -1;
 	while (slot < nritems) {
 		btrfs_item_key_to_cpu(leaf, &found_key, slot);

@@ -3290,6 +3293,8 @@ static noinline int acls_after_inode_item(struct extent_buffer *leaf,

 		/* we found an xattr, assume we've got an acl */
 		if (found_key.type == BTRFS_XATTR_ITEM_KEY) {
+			if (*first_xattr_slot == -1)
+				*first_xattr_slot = slot;
 			if (found_key.offset == xattr_access ||
 			    found_key.offset == xattr_default)
 				return 1;
@@ -3318,6 +3323,8 @@ static noinline int acls_after_inode_item(struct extent_buffer *leaf,
 	 * something larger than an xattr. We have to assume the inode
 	 * has acls
 	 */
+	if (*first_xattr_slot == -1)
+		*first_xattr_slot = slot;
 	return 1;
 }

@@ -3337,6 +3344,7 @@ static void btrfs_read_locked_inode(struct inode *inode)
 	u32 rdev;
 	int ret;
 	bool filled = false;
+	int first_xattr_slot;

 	ret = btrfs_fill_inode(inode, &rdev);
 	if (!ret)
@@ -3346,7 +3354,6 @@ static void btrfs_read_locked_inode(struct inode *inode)
 	if (!path)
 		goto make_bad;

-	path->leave_spinning = 1;
 	memcpy(&location, &BTRFS_I(inode)->location, sizeof(location));

 	ret = btrfs_lookup_inode(NULL, root, path, &location, 0);
@@ -3429,12 +3436,21 @@ static void btrfs_read_locked_inode(struct inode *inode)
 	 * any xattrs or acls
 	 */
 	maybe_acls = acls_after_inode_item(leaf, path->slots[0],
-					   btrfs_ino(inode));
+					   btrfs_ino(inode), &first_xattr_slot);
+	if (first_xattr_slot != -1) {
+		path->slots[0] = first_xattr_slot;
+		ret = btrfs_load_inode_props(inode, path);
+		if (ret)
+			btrfs_err(root->fs_info,
+				  "error loading props for ino %llu (root %llu): %d\n",
+				  btrfs_ino(inode),
+				  root->root_key.objectid, ret);
+	}
+	btrfs_free_path(path);
+
 	if (!maybe_acls)
 		cache_no_acl(inode);

-	btrfs_free_path(path);
-
 	switch (inode->i_mode & S_IFMT) {
 	case S_IFREG:
 		inode->i_mapping->a_ops = &btrfs_aops;
@@ -5607,6 +5623,12 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,

 	btrfs_update_root_times(trans, root);

+	ret = btrfs_inode_inherit_props(trans, inode, dir);
+	if (ret)
+		btrfs_err(root->fs_info,
+			  "error inheriting props for ino %llu (root %llu): %d",
+			  btrfs_ino(inode), root->root_key.objectid, ret);
+
 	return inode;
 fail:
 	if (dir)
@@ -7889,7 +7911,9 @@ static int btrfs_truncate(struct inode *inode)
  * create a new subvolume directory/inode (helper for the ioctl).
  */
 int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
-			     struct btrfs_root *new_root, u64 new_dirid)
+			     struct btrfs_root *new_root,
+			     struct btrfs_root *parent_root,
+			     u64 new_dirid)
 {
 	struct inode *inode;
 	int err;
@@ -7907,6 +7931,12 @@ int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
 	set_nlink(inode, 1);
 	btrfs_i_size_write(inode, 0);

+	err = btrfs_subvol_inherit_props(trans, new_root, parent_root);
+	if (err)
+		btrfs_err(new_root->fs_info,
+			  "error inheriting subvolume %llu properties: %d\n",
+			  new_root->root_key.objectid, err);
+
 	err = btrfs_update_inode(trans, new_root, inode);

 	iput(inode);
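
The inode.c hunks above make acls_after_inode_item() report the slot of the first xattr item that follows the inode item, so that property loading can start there. A simplified stand-alone sketch of that scan follows. It models a leaf as a sorted array of keys, uses local stand-in names (`struct key`, `XATTR_ITEM_KEY`, `first_xattr_slot_after()`), and deliberately leaves out the ACL detection and the end-of-leaf case that the kernel function also handles.

```c
#include <stdint.h>

/* Stand-in for BTRFS_XATTR_ITEM_KEY; the value is assumed, not shown in the patch. */
#define XATTR_ITEM_KEY 24

/* Simplified leaf key: items in a leaf are sorted by (objectid, type, offset). */
struct key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/*
 * Returns the slot of the first xattr item belonging to objectid that
 * follows inode_slot, or -1 if the scan leaves the inode's items first.
 */
int first_xattr_slot_after(const struct key *items, int nritems,
			   int inode_slot, uint64_t objectid)
{
	for (int slot = inode_slot + 1; slot < nritems; slot++) {
		/* Reached another inode's items: no xattrs here. */
		if (items[slot].objectid != objectid)
			break;
		if (items[slot].type == XATTR_ITEM_KEY)
			return slot;
		/* Keys are sorted, so anything larger means we are past xattrs. */
		if (items[slot].type > XATTR_ITEM_KEY)
			break;
	}
	return -1;
}
```

A caller would then position its path at the returned slot, much as the patch does with path->slots[0] = first_xattr_slot before calling btrfs_load_inode_props().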
19 changes: 18 additions & 1 deletion fs/btrfs/ioctl.c
@@ -56,6 +56,7 @@
 #include "rcu-string.h"
 #include "send.h"
 #include "dev-replace.h"
+#include "props.h"
 #include "sysfs.h"

 static int btrfs_clone(struct inode *src, struct inode *inode,
@@ -281,9 +282,25 @@ static int btrfs_ioctl_setflags(struct file *file, void __user *arg)
 	if (flags & FS_NOCOMP_FL) {
 		ip->flags &= ~BTRFS_INODE_COMPRESS;
 		ip->flags |= BTRFS_INODE_NOCOMPRESS;
+
+		ret = btrfs_set_prop(inode, "btrfs.compression", NULL, 0, 0);
+		if (ret && ret != -ENODATA)
+			goto out_drop;
 	} else if (flags & FS_COMPR_FL) {
+		const char *comp;
+
 		ip->flags |= BTRFS_INODE_COMPRESS;
 		ip->flags &= ~BTRFS_INODE_NOCOMPRESS;
+
+		if (root->fs_info->compress_type == BTRFS_COMPRESS_LZO)
+			comp = "lzo";
+		else
+			comp = "zlib";
+		ret = btrfs_set_prop(inode, "btrfs.compression",
+				     comp, strlen(comp), 0);
+		if (ret)
+			goto out_drop;
+
 	} else {
 		ip->flags &= ~(BTRFS_INODE_COMPRESS | BTRFS_INODE_NOCOMPRESS);
 	}
@@ -502,7 +519,7 @@ static noinline int create_subvol(struct inode *dir,

 	btrfs_record_root_in_trans(trans, new_root);

-	ret = btrfs_create_subvol_root(trans, new_root, new_dirid);
+	ret = btrfs_create_subvol_root(trans, new_root, root, new_dirid);
 	if (ret) {
 		/* We potentially lose an unused inode item here */
 		btrfs_abort_transaction(trans, root, ret);
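
The btrfs_ioctl_setflags() hunk above ties the FS_NOCOMP_FL and FS_COMPR_FL inode flags to the "btrfs.compression" property. The decision it makes can be summarized by the sketch below. All identifiers here (`DEMO_*`, `enum prop_action`, `compression_prop_for_flags()`) are hypothetical stand-ins so the example compiles outside the kernel; only the branch order and the "lzo"/"zlib" choice are taken from the hunk.

```c
#include <stddef.h>

/* Local stand-ins for FS_COMPR_FL and FS_NOCOMP_FL. */
#define DEMO_COMPR_FL  0x004u
#define DEMO_NOCOMP_FL 0x400u

/* Stand-in for the filesystem-wide default compression type. */
enum demo_compress { DEMO_ZLIB, DEMO_LZO };

/* What should happen to the "btrfs.compression" property. */
enum prop_action { PROP_KEEP, PROP_DELETE, PROP_SET };

/*
 * Mirrors the branch order of the hunk: "no compression" is checked
 * first and deletes the property, "compress" sets it to the default
 * algorithm, and anything else leaves the property untouched.
 * *value is set only for PROP_SET.
 */
enum prop_action compression_prop_for_flags(unsigned int flags,
					    enum demo_compress fs_default,
					    const char **value)
{
	*value = NULL;
	if (flags & DEMO_NOCOMP_FL)
		return PROP_DELETE;
	if (flags & DEMO_COMPR_FL) {
		*value = (fs_default == DEMO_LZO) ? "lzo" : "zlib";
		return PROP_SET;
	}
	return PROP_KEEP;
}
```

In the patch, deletion is expressed by passing a NULL value to btrfs_set_prop(), and an -ENODATA result (no such property) is not treated as an error.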