-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Longname: files/directories name upto 1023 bytes #15921
Conversation
88197c7
to
ba7a10c
Compare
@@ -367,9 +367,20 @@ typedef struct { | |||
boolean_t za_normalization_conflict; | |||
uint64_t za_num_integers; | |||
uint64_t za_first_integer; /* no sign extension for <8byte ints */ | |||
char za_name[ZAP_MAXNAMELEN]; | |||
uint32_t za_name_len; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand, this represents total length of the field. Just couple days ago looking recently on ZAP code I've found that its iterator code can not properly report binary names, since this structure includes neither length nor number of integers for the name, unlike value. I am thinking if it would be good to fix while you are here and changing the API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amotin Right, this represent the buffer len for za_name.
Regarding your issue, can you elaborate a bit. Is there any command or api affect by this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tuxoko There may be other examples, but I particularly hit that dump_zap() in zdb is unable to handle DDT and BRT ZAPs.
@tuxoko sorry this PR has sat on the sidelines for so long. Can you rebase on master and I can take a look? |
@tonyhutter rebased |
@tuxoko Can I may ask for another rebase and @tonyhutter for review? Get this done would be great! In any case much thanks. |
That seems to be the case. I'm unable to write a >256B filename on Fedora 40 using this PR :
So what would be the use case for 1024B filenames if it's not supported by Linux? Does it work in FreeBSD? |
I might be misunderstanding your question, but Windows supports >255B filenames, which I assume represents a large portion of clients connected to TrueNas systems. Lots of discussion on this in #13043 |
IIRC the limit might be 255 characters, including multi-byte characters, but the current/historic ZFS limit is 255 bytes, so multibyte characters will eat into that. For example if you're using two-byte characters your limit is only 127 characters (254 bytes). This is also the case on macOS, which is what caused me to notice it as I was handling a lot of files with long, multibyte names at the time that worked fine on HFS+/APFS but not on ZFS. There are a bunch of filesystems now that support either 255 characters, or have even less strict limits (so they'll support whatever the OS' limit(s) are). It's also one of those classic chicken and egg problems – if ZFS keeps a limitation because it matched one on Linux, then Linux is less likely to increase that limit because a major filesystem doesn't support it etc. IMO it's better for a filesystem to support any size that it doesn't have a good technical reason not to (due to overhead or whatever), that way it's not the part that's holding something back. Though it makes sense to have it tunable for those that need to guarantee portability between more and less limited OSes. |
@tonyhutter The kernel VFS layer itself doesn't impose a 255 limit on filename as long as the "path" length fits in 4k page.
|
@tuxoko thanks, you were right, I wasn't enabling the |
I'm pretty happy with this - if you want to rebase it I can approve it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tuxoko sorry about the slow review on this. Just a couple of minor nits then this looks good to me. If you can get this rebased I can pull it in.
This patch is preparatory work for long name feature. It changes all users of zap_attribute_t to allocate it from kmem instead of stack. It also make zap_attribute_t and zap_name_t structure variable length. Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
I had a few of drive-by questions/thoughts. Note I don't consider these blockers; just thinking about the problem space. Is it worth storing the max filename length somewhere on the dir ZAP (extended attribute? somewhere on the bonus?) so that we can increase it further in the future without needing another feature flag etc? I assume we're happy leaving leaving interpretation of the bytes in the filename as out of scope (ie, normal Is there any value in a kind of "compatibility" option, so that older implementations can still use these dirs? I personally think not; it would be a lot of complexity for perhaps very little advantage. And its tricky to do well; at least, I remember FAT32 long filenames being surprisingly difficult to get right :) |
@tuxoko the updated ABI files look like they contain a bunch on unrelated changes (and they they pass). If you remove the ABI file changes, then force update the PR you should be able to download new ABI files from the "checkstyle" builder summary section with the minimal changes required. e.g. generated with the exact same version of libabigail. |
@@ -81,6 +81,7 @@ typedef enum dmu_objset_type { | |||
* All of these include the terminating NUL byte. | |||
*/ | |||
#define ZAP_MAXNAMELEN 256 | |||
#define ZAP_MAXNAMELEN_NEW 1024 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be since we still keep the old and to match the feature name better would be to call it ZAP_MAXNAMELEN_LONG? Though it echoes the question below how much do we expect somebody to disable long names, or it is a temporary migration instrument?
*/ | ||
if (!dsl_dataset_feature_is_active(ds, SPA_FEATURE_LONGNAME) || | ||
(flags & (LOOKUP_CREATE | LOOKUP_RENAME_TARGET))) | ||
return (ERR_PTR(-ENAMETOOLONG)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be I don't understand something, but why do you check the dataset feature is active to fail non-create/rename operations? I suppose it is not intended to happen (other than I am nut sure feature gets enabled before TXG commit), but why do we care?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if it's not active then we can just return early.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is frequent enough (outside some special synthetic) case to optimize for it, and it just complicated the logic.
if (is_nametoolong(dentry)) { | ||
return (-ENAMETOOLONG); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cosmetics, but I prefer to avoid extra braces for a single line.
module/zcommon/zfeature_common.c
Outdated
{ | ||
static const spa_feature_t longname_deps[] = { | ||
SPA_FEATURE_EXTENSIBLE_DATASET, | ||
SPA_FEATURE_ENABLED_TXG, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really care at what TXG long names were actually enabled? What do we do with this knowledge?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if there's any reason when this was added. Maybe for audit purpose. Anyway I'll remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I am not sure what exactly does the extensible dataset, so I just hope it is there for reason and not just copy-paste.
zprop_register_index(ZFS_PROP_LONGNAME, "longname", 0, PROP_INHERIT, | ||
ZFS_TYPE_FILESYSTEM, "on | off", "LONGNAME", boolean_table, | ||
sfeatures); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am somewhat split about this property and its disabled default. I may speculate that some users may not want this functionality, while same time once it become routine I don't think I want to manually enable it on each new pool created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason it's not enabled is because it may not be suitable for all workload. For example NFS, or when you copy files between native linux fs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am somewhat split about this property and its disabled default.
I am hoping to have a couple of days after it lands to see if I can make it work on my platforms, before it is made default ...
Updated ABI files available: https://github.com/openzfs/zfs/actions/runs/11022167809?pr=15921 |
Updated with the abi files and freebsd compile fix. Hopefull we can get a freebsd run now. A couple of notable new changes, |
3eb9465
to
665afcf
Compare
Update, fixed compile error and a typo in test. |
module/zfs/zap.c
Outdated
uint64_t len = zn->zn_key_orig_numints * zn->zn_key_intlen; | ||
if (len > maxnamelen || | ||
(zn->zn_zap->zap_dnode->dn_type != DMU_OT_DIRECTORY_CONTENTS && | ||
len > ZAP_MAXNAMELEN)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be more efficient if you reorder the last two lines. You already have len
value in register, while zn->zn_zap->zap_dnode->dn_type
may require several memory accesses, while in most cases it is not needed.
Meanwhile I am not sure in general it is a business of ZAP code to care about specific dn_type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's not elegant, but it's readily available.
module/os/freebsd/zfs/zfs_dir.c
Outdated
@@ -577,6 +581,7 @@ zfs_link_create(znode_t *dzp, const char *name, znode_t *zp, dmu_tx_t *tx, | |||
{ | |||
zfsvfs_t *zfsvfs = zp->z_zfsvfs; | |||
vnode_t *vp = ZTOV(zp); | |||
dsl_dataset_t *ds = dmu_objset_ds(zfsvfs->z_os); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This ds
assignment is not used in most cases. It could be moved inside the if
below. Same for Linux.
module/os/freebsd/zfs/zfs_znode_os.c
Outdated
@@ -1814,7 +1814,7 @@ zfs_znode_parent_and_name(znode_t *zp, znode_t **dzpp, char *buf) | |||
return (SET_ERROR(EINVAL)); | |||
|
|||
err = zap_value_search(zfsvfs->z_os, parent, zp->z_id, | |||
ZFS_DIRENT_OBJ(-1ULL), buf); | |||
ZFS_DIRENT_OBJ(-1ULL), buf, MAXNAMELEN); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks dirty to hardcode MAXNAMELEN
here, while we receive buf
as an argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it was hardcoded as well in zap_value_search before. Though in this case we can make the caller to pass buf length.
module/zfs/zap_micro.c
Outdated
|
||
norm = kmem_alloc(namelen, KM_SLEEP); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems zap_match()
is called in a loop over potentially many other entries. Allocation for might be expensive. Sure using stack previously was a cheating, but this might need some more thinking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can go back to the original one.
This patch adds the ability for zfs to support file/dir name up to 1023 bytes. This number is chosen so we can support up to 255 4-byte characters. This new feature is represented by the new feature flag feature@longname. A new dataset property "longname" is also introduced to toggle longname support for each dataset individually. This property can be disabled, even if it contains longname files. In such case, new file cannot be created with longname but existing longname files can still be looked up. Note that, to my knowledge native Linux filesystems don't support name longer than 255 bytes. So there might be programs not able to work with longname. Note that NFS server may needs to use exportfs_get_name to reconnect dentries, and the buffer being passed is limit to NAME_MAX+1 (256). So NFS may not work when longname is enabled. Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Note, FreeBSD vfs layer imposes a limit of 255 name lengh, so even though we add code to support it here, it won't actually work. Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
@tuxoko looks good. The CI failures here all look to be unrelated. I can handle resolving the trivial conflict in the merge. |
This patch adds the ability for zfs to support file/dir name up to 1023 bytes. This number is chosen so we can support up to 255 4-byte characters. This new feature is represented by the new feature flag feature@longname. A new dataset property "longname" is also introduced to toggle longname support for each dataset individually. This property can be disabled, even if it contains longname files. In such case, new file cannot be created with longname but existing longname files can still be looked up. Note that, to my knowledge native Linux filesystems don't support name longer than 255 bytes. So there might be programs not able to work with longname. Note that NFS server may needs to use exportfs_get_name to reconnect dentries, and the buffer being passed is limit to NAME_MAX+1 (256). So NFS may not work when longname is enabled. Note, FreeBSD vfs layer imposes a limit of 255 name lengh, so even though we add code to support it here, it won't actually work. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Chunwei Chen <david.chen@nutanix.com> Closes #15921
Our code reading/writing there may not handle misaligned accesses there on platforms that may care about it. I don't see a point to complicate it to satisfy UBSan in CI. This alignment costs nothing. Fixes: openzfs#15921 Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc.
Our code reading/writing there may not handle misaligned accesses on a platforms that may care about it. I don't see a point to complicate it to satisfy UBSan in CI. This alignment costs nothing. Fixes: openzfs#15921 Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc.
Our code reading/writing there may not handle misaligned accesses on a platforms that may care about it. I don't see a point to complicate it to satisfy UBSan in CI. This alignment costs nothing. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15921 Closes #16606
Motivation and Context
Description
This patch adds the ability for zfs to support file/dir name up to 1023
bytes. This number is chosen so we can support up to 255 4-byte
characters. This new feature is represented by the new feature flag
feature@longname.
A new dataset property "longname" is also introduced to toggle longname
support for each dataset individually. This property can be disabled,
even if it contains longname files. In such case, new file cannot be
created with longname but existing longname files can still be looked
up.
Note that, to my knowledge native Linux filesystems don't support name
longer than 255 bytes. So there might be programs not able to work with
longname.
Note that NFS server may needs to use exportfs_get_name to reconnect
dentries, and the buffer being passed is limit to NAME_MAX+1 (256). So
NFS may not work when longname is enabled.
How Has This Been Tested?
Types of changes
Checklist:
Signed-off-by
.