-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
xattr=sa causes symlinks with large enough xattr data (like glusterfs uses) will corrupt symlink #1648
Comments
@tjstansell Does this only occur for symlinks? |
That's all we've found so far, yes. |
We initially noticed it while trying to rsync brick including the xattr data (rsync -aX) and noticed that subsequent rsyncs kept wanting to fix the same files. The only things that seemed to have issues were symlinks and this particular glusterfs filesystem has expanded OS isos on them so they include many types of objects. |
@tjstansell I wasn't able to reproduce this with the latest source from github. The updated code does contain a few xattr fixes but I wouldn't expect any of them to address this issue. Regardless, can you try two things.
|
I haven't had a chance to get the latest code from github and build the newer zdb, but I'll show you what I see so far. I tried to simply create a symlink and add xattr data to it and i couldn't reproduce it either (new host, new filesystem, etc). However, I rsyncd (rsync -aX) a directory of symlinks from my glusterfs brick and it showed the problem. I'm not sure what's unique about it. Here's zdb output for the broken symlink: /pool0/test:$ zdb pool0/test 65 -vvvv Dataset pool0/test [ZPL], ID 278387, cr_txg 1965141, 1.54M, 72 objects, rootbp DVA[0]=<0:15e9cb200:200> DVA[1]=<0:562c122000:200> [L0 DMU objset] fletcher4 lzjb LE contiguous unique double size=800L/200P birth=1965224L/1965224P fill=72 cksum=157b03915c:79471d28721:168475fe2242f:2ead74c94418c2 Object lvl iblk dblk dsize lsize %full type 65 1 16K 512 1K 512 0.00 ZFS plain file 192 bonus System attributes dnode flags: USED_BYTES USERUSED_ACCOUNTED SPILL_BLKPTR dnode maxblkid: 0 path ??? uid 1376412984 gid 0 atime Tue Aug 13 16:56:24 2013 mtime Fri Feb 1 19:21:59 2013 ctime Tue Aug 13 16:56:24 2013 crtime Tue Feb 20 20:07:28 267650424 gen 4432406249732 mode 0 size 9 parent 1359746519 links 5067302273810432 pflags 40800000104 Indirect blocks: /pool0/test:$ getfattr -d -m "" -e hex -h bin/yes # file: bin/yes trusted.afr.gvdata-client-0=0x000000000000000000000000 trusted.afr.gvdata-client-1=0x000000000000000000000000 trusted.afr.gvdata-client-2=0x000000000000000000000000 trusted.afr.gvdata-client-3=0x000000000000000000000000 trusted.gfid=0x5747a12ff21d4e8d92f4b3766424a869 and here is the zdb dump of the source symlink (on a zfs filesystem with xattr=on) and xattr inodes: /pool0/test:$ zdb pool0/brick-data 27137 -vvvv Dataset pool0/brick-data [ZPL], ID 347, cr_txg 146253, 31.4G, 209196 objects, rootbp DVA[0]=<0:15fc74c00:200> DVA[1]=<0:562cb5e800:200> [L0 DMU objset] fletcher4 lzjb LE contiguous unique double size=800L/200P birth=1965290L/1965290P fill=209196 cksum=1b07278e39:89fd5fd5844:18182f0855439:303e75f18e7c97 Object lvl iblk dblk dsize lsize %full type 27137 1 16K 512 0 512 0.00 ZFS plain file 208 bonus System attributes dnode flags: USERUSED_ACCOUNTED dnode maxblkid: 0 path /.glusterfs/57/47/5747a12f-f21d-4e8d-92f4-b3766424a869 uid 0 gid 0 atime Tue Apr 23 15:16:21 2013 mtime Fri Feb 1 19:21:59 2013 ctime Fri May 24 20:03:29 2013 crtime Fri May 24 20:03:29 2013 gen 521564 mode 120777 size 17 parent 27140 links 2 pflags 40800000104 xattr 27138 Indirect blocks: /pool0/test:$ zdb pool0/brick-data 27138 -vvvv Dataset pool0/brick-data [ZPL], ID 347, cr_txg 146253, 31.4G, 209236 objects, rootbp DVA[0]=<0:1616b7800:200> DVA[1]=<0:562d23e000:200> [L0 DMU objset] fletcher4 lzjb LE contiguous unique double size=800L/200P birth=1965345L/1965345P fill=209236 cksum=211c32c215:a3f7bbefdcf:1c2a5c463f2cd:380a4a57b74ead Object lvl iblk dblk dsize lsize %full type 27138 1 16K 512 1K 512 100.00 ZFS directory 168 bonus System attributes dnode flags: USED_BYTES USERUSED_ACCOUNTED dnode maxblkid: 0 path /.glusterfs/57/47/5747a12f-f21d-4e8d-92f4-b3766424a869/ uid 0 gid 0 atime Fri May 24 20:03:29 2013 mtime Fri May 24 20:03:29 2013 ctime Fri May 24 20:03:29 2013 crtime Fri May 24 20:03:29 2013 gen 521564 mode 41777 size 7 parent 27137 links 2 pflags 40800000145 microzap: 512 bytes, 5 entries trusted.afr.gvdata-client-0 = 27361 (type: Regular File) trusted.gfid = 27139 (type: Regular File) trusted.afr.gvdata-client-3 = 27360 (type: Regular File) trusted.afr.gvdata-client-1 = 27362 (type: Regular File) trusted.afr.gvdata-client-2 = 27359 (type: Regular File) Indirect blocks: 0 L0 DVA[0]=<0:8345eb800:200> DVA[1]=<0:5403f09400:200> [L0 ZFS directory] fletcher4 uncompressed LE contiguous unique double size=200L/200P birth=521564L/521564P fill=1 cksum=f10b28360:45fdc3392c3:b593fe6a5b0d:157db02a1a1769 segment [0000000000000000, 0000000000000200) size 512 Interestingly, after cd'ing into the directory that had the broken symlinks from rsync, i created a test symlink and added xattr data to it and it corrupted things: /pool0/test/bin:$ ls -la test* lrwxrwxrwx 1 root root 17 Feb 1 2013 test -> x/busybox/busybox lrwxrwxrwx 1 root root 17 Aug 13 17:08 test1 -> ../../bin/busybox /pool0/test/bin:$ setfattr -n trusted.afr.gvdata-client-4 -v 0x000000000000000000000000 -h test1 /pool0/test/bin:$ ls -la test* lrwxrwxrwx 1 root root 17 Feb 1 2013 test -> x/busybox/busybox lrwxrwxrwx 1 root root 17 Aug 13 17:08 test1 -> n/busybox/busybox (reverse-i-search)`': ^C /pool0/test/bin:$ ls -li test1 74 lrwxrwxrwx 1 root root 17 Mar 11 1975 test1 -> n/busybox/busybox /pool0/test/bin:$ zdb pool0/test 74 -vvvv Dataset pool0/test [ZPL], ID 278387, cr_txg 1965141, 1.54M, 73 objects, rootbp DVA[0]=<0:16227a400:200> DVA[1]=<0:562d6dc600:200> [L0 DMU objset] fletcher4 lzjb LE contiguous unique double size=800L/200P birth=1965385L/1965385P fill=73 cksum=165c975557:7c547cce2b7:16cd2b6fefa66:2ec468faf0b5b6 Object lvl iblk dblk dsize lsize %full type 74 1 16K 512 1K 512 0.00 ZFS plain file 192 bonus System attributes dnode flags: USED_BYTES USERUSED_ACCOUNTED SPILL_BLKPTR dnode maxblkid: 0 path ??? uid 0 gid 9 atime Tue Mar 11 17:46:40 1975 mtime Tue Mar 11 17:46:40 1975 ctime Tue Mar 11 17:46:40 1975 crtime Tue Mar 11 17:46:40 1975 gen 0 mode 21 size 1965378 parent 4432406249732 links 3 pflags 520a67f6 Indirect blocks: Perhaps it only shows up if the symlink is inside a directory with xattr data? /pool0/test/bin:$ getfattr -d -m "" -e hex -h . # file: . trusted.afr.gvdata-client-0=0x000000000000000000000000 trusted.afr.gvdata-client-1=0x000000000000000000000000 trusted.afr.gvdata-client-2=0x000000000000000000000000 trusted.afr.gvdata-client-3=0x000000000000000000000000 trusted.gfid=0xf28174ae21424e4aad581a5eec2b598e trusted.glusterfs.dht=0x000000010000000000000000ffffffff |
Oh, and here's a zdb dump of the directory in question: /pool0/test/bin:$ zdb pool0/test 9 -vvvv Dataset pool0/test [ZPL], ID 278387, cr_txg 1965141, 1.54M, 73 objects, rootbp DVA[0]=<0:16227a400:200> DVA[1]=<0:562d6dc600:200> [L0 DMU objset] fletcher4 lzjb LE contiguous unique double size=800L/200P birth=1965385L/1965385P fill=73 cksum=165c975557:7c547cce2b7:16cd2b6fefa66:2ec468faf0b5b6 Object lvl iblk dblk dsize lsize %full type 9 1 16K 4.50K 3.00K 4.50K 100.00 ZFS directory 192 bonus System attributes dnode flags: USED_BYTES USERUSED_ACCOUNTED SPILL_BLKPTR dnode maxblkid: 0 path /bin uid 0 gid 0 atime Tue Aug 13 16:56:24 2013 mtime Tue Aug 13 17:08:06 2013 ctime Tue Aug 13 17:08:06 2013 crtime Tue Aug 13 16:56:24 2013 gen 1965224 mode 40755 size 67 parent 4 links 2 pflags 40800000144 microzap: 4608 bytes, 65 entries free = 27 (type: Symbolic Link) sg_persist = 71 (type: Regular File) dropbearconvert = 68 (type: Regular File) openvt = 36 (type: Symbolic Link) ftpget = 28 (type: Symbolic Link) dirname = 21 (type: Symbolic Link) yes = 65 (type: Symbolic Link) test1 = 74 (type: Symbolic Link) dropbearkey = 69 (type: Regular File) sum = 50 (type: Symbolic Link) strace = 72 (type: Regular File) rpm2cpio = 44 (type: Symbolic Link) tee = 52 (type: Symbolic Link) uudecode = 59 (type: Symbolic Link) wc = 61 (type: Symbolic Link) dbclient = 67 (type: Regular File) od = 35 (type: Symbolic Link) top = 54 (type: Symbolic Link) [ = 10 (type: Symbolic Link) traceroute = 56 (type: Symbolic Link) ftpput = 29 (type: Symbolic Link) bzip2 = 16 (type: Symbolic Link) zile = 73 (type: Regular File) du = 22 (type: Symbolic Link) bunzip2 = 14 (type: Symbolic Link) diff = 20 (type: Symbolic Link) tr = 55 (type: Symbolic Link) patch = 37 (type: Symbolic Link) tail = 51 (type: Symbolic Link) sha1sum = 46 (type: Symbolic Link) env = 24 (type: Symbolic Link) basename = 13 (type: Symbolic Link) sort = 48 (type: Symbolic Link) clear = 18 (type: Symbolic Link) wget = 62 (type: Symbolic Link) pkill = 38 (type: Symbolic Link) realpath = 41 (type: Symbolic Link) hexdump = 31 (type: Symbolic Link) sha256sum = 47 (type: Symbolic Link) uuencode = 60 (type: Symbolic Link) test = 53 (type: Symbolic Link) reset = 42 (type: Symbolic Link) awk = 66 (type: Regular File) [[ = 11 (type: Symbolic Link) unzip = 58 (type: Symbolic Link) find = 26 (type: Symbolic Link) strings = 49 (type: Symbolic Link) resize = 43 (type: Symbolic Link) xargs = 64 (type: Symbolic Link) setkeycodes = 45 (type: Symbolic Link) chvt = 17 (type: Symbolic Link) head = 30 (type: Symbolic Link) eject = 23 (type: Symbolic Link) mkfifo = 34 (type: Symbolic Link) readlink = 40 (type: Symbolic Link) arping = 12 (type: Symbolic Link) cut = 19 (type: Symbolic Link) expr = 25 (type: Symbolic Link) md5sum = 33 (type: Symbolic Link) bzcat = 15 (type: Symbolic Link) printf = 39 (type: Symbolic Link) uniq = 57 (type: Symbolic Link) killall = 32 (type: Symbolic Link) which = 63 (type: Symbolic Link) script = 70 (type: Regular File) Indirect blocks: 0 L0 DVA[0]=<0:16206ec00:400> DVA[1]=<0:5519155000:400> [L0 ZFS directory] fletcher4 lzjb LE contiguous unique double size=1200L/400P birth=196537 8L/1965378P fill=1 cksum=3ccdb75055:231051fb6b57:bee576a8e9b16:30093f738a7b50b segment [0000000000000000, 0000000000001200) size 4.50K |
Interestingly, I tried to narrow things down and i created two symlinks, one pointing to ../../some/dir and another pointing to ../../bin/busybox (like the other symlinks in this directory). After adding two xattrs for the one pointing to busybox, it gets corrupted. I had added 8 entries of the same trusted.afr.gvdata-client-N format to the other one and it never got corrupted. |
It appears to depend on the length of the path of the target. Over 16 char seems to trigger things: /pool0/test:$ ln -fs 1234567890123456 mylink3 /pool0/test:$ i=1; while true; do echo == $i ==; setfattr -n trusted.afr.gvdata-client-$i -v 0x000000000000000000000000 -h mylink3; ls -li mylink3; i=`expr $i + 1`; sleep 1; done == 1 == 13 lrwxrwxrwx 1 root root 16 Aug 13 18:06 mylink3 -> 1234567890123456 == 2 == 13 lrwxrwxrwx 1 root root 16 Aug 13 18:06 mylink3 -> 1234567890123456 == 3 == 13 lrwxrwxrwx 1 root root 16 Aug 13 18:06 mylink3 -> 1234567890123456 == 4 == 13 lrwxrwxrwx 1 root root 16 Aug 13 18:06 mylink3 -> 1234567890123456 == 5 == 13 lrwxrwxrwx 1 root root 16 Aug 13 18:06 mylink3 -> 1234567890123456 == 6 == 13 lrwxrwxrwx 1 root root 16 Aug 13 18:06 mylink3 -> 1234567890123456 ^C /pool0/test:$ ln -fs 12345678901234567 mylink3 /pool0/test:$ i=1; while true; do echo == $i ==; setfattr -n trusted.afr.gvdata-client-$i -v 0x000000000000000000000000 -h mylink3; ls -li mylink3; i=`expr $i + 1`; sleep 1; done == 1 == 14 lrwxrwxrwx 1 root root 17 Aug 13 18:06 mylink3 -> 12345678901234567 == 2 == 14 lrwxrwxrwx 1 root root 17 Aug 13 18:06 mylink3 -> 90123456701234567 == 3 == 14 lrwxrwxrwx 1 root root 17 Nov 16 1994 mylink3 -> 70123456701234567 |
Once the link target is over 24 characters, the corruption behavior is different. It takes much more xattr data to see a change in the link: /pool0/test:$ ln -fs 1234567890123456789012345 mylink3 /pool0/test:$ i=1; while true; do echo == $i ==; setfattr -n trusted.afr.gvdata-client-$i -v 0x000000000000000000000000 -h mylink3; ls -li mylink3; i=`expr $i + 1`; sleep 1; done == 1 == 22 lrwxrwxrwx 1 root root 25 Aug 13 19:28 mylink3 -> 1234567890123456789012345 == 2 == 22 lrwxrwxrwx 1 root root 25 Aug 13 19:28 mylink3 -> 1234567890123456789012345 == 3 == 22 lrwxrwxrwx 1 root root 25 Aug 13 19:28 mylink3 -> 1234567890123456789012345 == 4 == 22 lrwxrwxrwx 1 root root 25 Aug 13 19:28 mylink3 -> 1234567890123456789012345 == 5 == 22 lrwxrwxrwx 1 root root 25 Aug 13 19:28 mylink3 -> 1234567890123456789012345 == 6 == 22 lrwxrwxrwx 1 root root 25 Aug 13 19:28 mylink3 -> 1234567890123456789012345 == 7 == 22 lrwxrwxrwx 1 root root 25 Aug 13 19:28 mylink3 -> 1234567890123456789012345 == 8 == 22 lrwxrwxrwx 1 root root 25 Aug 13 19:28 mylink3 -> 901234567890123458901234? == 9 == 22 lrwxrwxrwx 1 root root 25 Aug 13 19:28 mylink3 -> 901234567890123458901234? ^C |
Any thoughts on this? |
@tjstansell Actually yes. The suspicion is that the problem is caused by storing multiple variable length SAs. In this case it's the symlink path (1) and the xattr nvlist (2). @nedbass has seen something like this before but at the time we didn't have a way to reproduce it. It seems as if the SA code is just incorrectly calculating the alignment. This is believed to be a long standing issue with the core SA code. However, it's only visible on Linux because this is the only use case where multiple variable length SAs are stored. This is also why the issue doesn't occur with normal files which lack the variable length symlink path. Can you verify the following patch works as a short term fix until we address the core issue in the SA code. It will prevent xattrs from being stored as SAs on symlinks and force them to be stored in the legacy directory format. This is known safe and since symlinks should be fairly rare the performance impact is expected to me minimal. diff --git a/module/zfs/zpl_xattr.c b/module/zfs/zpl_xattr.c
index dca1ad6..d79d35b 100644
--- a/module/zfs/zpl_xattr.c
+++ b/module/zfs/zpl_xattr.c
@@ -438,6 +438,10 @@ zpl_xattr_set_sa(struct inode *ip, const char *name, const v
if (error == -ENOENT)
error = zpl_xattr_set_dir(ip, name, NULL, 0, flags, cr);
} else {
+ /* Do not allow SA xattrs in symlinks (issue #1648) */
+ if (S_ISLNK(ip->i_mode))
+ return (-EMLINK);
+
/* Limited to 32k to keep nvpair memory allocations small */
if (size > DXATTR_MAX_ENTRY_SIZE)
return (-EFBIG); |
@behlendorf That does seem to work around the problem, thanks. Symlinks appear to be stable again. |
Great. As expected I wasn't able to reproduce the issue with the workaround either. It was merged as part of the recent 0.6.2 tag. We still of course want to fix this right but that's going to require more investigation. |
@behlendorf I'm not sure how easy this can be fixed, but I noticed your comment for your commit referenced issue 1468 instead of issue 1648, so this bug isn't tracked as being part of 0.6.2. |
Darn typos. Thanks for noticing that. I'm not sure how hard it will be to fix the root cause here either but we'll leave this open until we do. In the meanwhile, we'll leave the workaround in place so nobody else stumbles in to this. |
Under the right conditions sa_find_sizes() will compute an incorrect size of the system attribute (SA) header. This causes a failed assertion when the SA_HDR_SIZE_MATCH_LAYOUT() test returns false, and may lead to corruption of SA data. The bug presents itself when there are more than two variable-length SAs of just the right size to fit in the bonus buffer of a dnode. The existing logic fails to account for the SA header space needed to store the sizes of all the variable-length SAs. A reproducer was possible on Linux by setting the xattr=sa dataset property and storing xattrs on symbolic links (Issue openzfs#1648). Note the corrupt link target name: $ zfs set xattr=sa tank/fish $ cd /tank/fish $ ln -fs 12345678901234567 link $ setfattr -n trusted.0000000000000000000 -v 0x000000000000000000000000 -h link $ setfattr -n trusted.1111111111111111111 -v 0x000000000000000000000000 -h link $ ls -l link lrwxrwxrwx 1 root root 17 Dec 6 15:40 link -> 90123456701234567 Commit 6a7c0cc worked around this bug by forcing xattr's on symlinks to be stored in directory format. This change implements a proper fix, so the workaround can now be reverted. The reference link below contains a reproducer for FreeBSD. References: http://lists.open-zfs.org/pipermail/developer/2013-November/000306.html Ported-by: Ned Bass <bass6@llnl.gov> Closes openzfs#1890
This reverts commit 6a7c0cc. A proper fix for Issue openzfs#1648 was landed under Issue openzfs#1890, so this is no longer needed. Signed-off-by: Ned Bass <bass6@llnl.gov>
Under the right conditions sa_find_sizes() will compute an incorrect size of the system attribute (SA) header. This causes a failed assertion when the SA_HDR_SIZE_MATCH_LAYOUT() test returns false, and may lead to corruption of SA data. The bug presents itself when there are more than two variable-length SAs of just the right size to fit in the bonus buffer of a dnode. The existing logic fails to account for the SA header space needed to store the sizes of all the variable-length SAs. A reproducer was possible on Linux by setting the xattr=sa dataset property and storing xattrs on symbolic links (Issue openzfs#1648). Note the corrupt link target name: $ zfs set xattr=sa tank/fish $ cd /tank/fish $ ln -fs 12345678901234567 link $ setfattr -n trusted.0000000000000000000 -v 0x000000000000000000000000 -h link $ setfattr -n trusted.1111111111111111111 -v 0x000000000000000000000000 -h link $ ls -l link lrwxrwxrwx 1 root root 17 Dec 6 15:40 link -> 90123456701234567 Commit 6a7c0cc worked around this bug by forcing xattr's on symlinks to be stored in directory format. This change implements a proper fix, so the workaround can now be reverted. The reference link below contains a reproducer for FreeBSD. References: http://lists.open-zfs.org/pipermail/developer/2013-November/000306.html Ported-by: Ned Bass <bass6@llnl.gov> Closes openzfs#1890
This reverts commit 6a7c0cc. A proper fix for Issue openzfs#1648 was landed under Issue openzfs#1890, so this is no longer needed. Signed-off-by: Ned Bass <bass6@llnl.gov>
Under the right conditions sa_find_sizes() will compute an incorrect size of the system attribute (SA) header. This causes a failed assertion when the SA_HDR_SIZE_MATCH_LAYOUT() test returns false, and may lead to corruption of SA data. The bug presents itself when there are more than two variable-length SAs of just the right size to fit in the bonus buffer of a dnode. The existing logic fails to account for the SA header space needed to store the sizes of all the variable-length SAs. A reproducer was possible on Linux by setting the xattr=sa dataset property and storing xattrs on symbolic links (Issue #1648). Note the corrupt link target name: $ zfs set xattr=sa tank/fish $ cd /tank/fish $ ln -fs 12345678901234567 link $ setfattr -n trusted.0000000000000000000 -v 0x000000000000000000000000 -h link $ setfattr -n trusted.1111111111111111111 -v 0x000000000000000000000000 -h link $ ls -l link lrwxrwxrwx 1 root root 17 Dec 6 15:40 link -> 90123456701234567 Commit 6a7c0cc worked around this bug by forcing xattr's on symlinks to be stored in directory format. This change implements a proper fix, so the workaround can now be reverted. The reference link below contains a reproducer for FreeBSD. References: http://lists.open-zfs.org/pipermail/developer/2013-November/000306.html Ported-by: Ned Bass <bass6@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1890
Under the right conditions sa_find_sizes() will compute an incorrect size of the system attribute (SA) header. This causes a failed assertion when the SA_HDR_SIZE_MATCH_LAYOUT() test returns false, and may lead to corruption of SA data. The bug presents itself when there are more than two variable-length SAs of just the right size to fit in the bonus buffer of a dnode. The existing logic fails to account for the SA header space needed to store the sizes of all the variable-length SAs. A reproducer was possible on Linux by setting the xattr=sa dataset property and storing xattrs on symbolic links (Issue openzfs#1648). Note the corrupt link target name: $ zfs set xattr=sa tank/fish $ cd /tank/fish $ ln -fs 12345678901234567 link $ setfattr -n trusted.0000000000000000000 -v 0x000000000000000000000000 -h link $ setfattr -n trusted.1111111111111111111 -v 0x000000000000000000000000 -h link $ ls -l link lrwxrwxrwx 1 root root 17 Dec 6 15:40 link -> 90123456701234567 Commit 6a7c0cc worked around this bug by forcing xattr's on symlinks to be stored in directory format. This change implements a proper fix, so the workaround can now be reverted. The reference link below contains a reproducer for FreeBSD. References: http://lists.open-zfs.org/pipermail/developer/2013-November/000306.html Ported-by: Ned Bass <bass6@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#1890
This reverts commit 6a7c0cc. A proper fix for Issue openzfs#1648 was landed under Issue openzfs#1890, so this is no longer needed. Signed-off-by: Ned Bass <bass6@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#1648
Under the right conditions sa_find_sizes() will compute an incorrect size of the system attribute (SA) header. This causes a failed assertion when the SA_HDR_SIZE_MATCH_LAYOUT() test returns false, and may lead to corruption of SA data. The bug presents itself when there are more than two variable-length SAs of just the right size to fit in the bonus buffer of a dnode. The existing logic fails to account for the SA header space needed to store the sizes of all the variable-length SAs. A reproducer was possible on Linux by setting the xattr=sa dataset property and storing xattrs on symbolic links (Issue openzfs#1648). Note the corrupt link target name: $ zfs set xattr=sa tank/fish $ cd /tank/fish $ ln -fs 12345678901234567 link $ setfattr -n trusted.0000000000000000000 -v 0x000000000000000000000000 -h link $ setfattr -n trusted.1111111111111111111 -v 0x000000000000000000000000 -h link $ ls -l link lrwxrwxrwx 1 root root 17 Dec 6 15:40 link -> 90123456701234567 Commit 6a7c0cc worked around this bug by forcing xattr's on symlinks to be stored in directory format. This change implements a proper fix, so the workaround can now be reverted. The reference link below contains a reproducer for FreeBSD. References: http://lists.open-zfs.org/pipermail/developer/2013-November/000306.html Ported-by: Ned Bass <bass6@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#1890
If you have xattr=sa set and create a symlink, then start adding xattr data to the symlink, it will eventually corrupt the timestamp and the link destination of the symlink. My assumption is that this happens when there is no longer enough space to hold the xattr info in the inode and it moves it to a 'dir' style. This happens to us on CentOS 6.4 with zfs 0.6.1. It's so bad that you can't unmount the zfs filesystem and the umount process ends up hitting 100% cpu. It was pretty easy to replicate, just keep adding more xattr name/value pairs.
This essentially prevents us from using xattr=sa on a glusterfs brick with any symlinks in the filesystem. Here was a test of simulating what glusterfs adds:
Notice how the symlink becomes corrupted and then the timestamp was also corrupted without doing anything else to it.
Todd
The text was updated successfully, but these errors were encountered: