-
Notifications
You must be signed in to change notification settings - Fork 240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
should protect against UTF-8 homographs #1138
Comments
Hi, Debian-devel thread starts here: https://lists.debian.org/debian-devel/2024/11/msg00250.html Debian carries a patch that relaxes checks: https://lists.debian.org/debian-devel/2024/12/msg00045.html I probably should have dropped the patch while maintaining shadow in Debian, but could not find time to follow up all the potential breakages before doing so. :-( Thanks @Zugschlus for raising the topic and working on that. I think the best way out would be dropping the carried |
Before I filed this upstream issue, I tried creating märc, Étienne and mΩrc on Centos Stream 9, Alma Linux 9 and Fedora 41 (spun up VMs on Hetzner Cloud) using their adduser variant (needed --badnames) and the creation went through like it does on Debian. Hence I filed this upstream since this is an issue that other (all?) Linux distributions suffer from. Their adduser man page symlinks to useradd, does this also apply to the binary? |
@Zugschlus Ah, OK. I think I misunderstood the the original issue. By "blindly" I thought you were referring to Debian's behavior, where you can create UTF-8 usernames without passing |
The problem that I see is that people speaking languages that need special characters or maybe even special alphabets might get used to using --badname without understanding the implication. Would you be willing at least to document that there might be issues? |
I don't believe that there are many people using special characters in their username, but a warning would be useful to help newcomers, indeed. |
How about adding a sentence there that warns about this? If it was just me, I'd just remove the --badname option. I guess it stays there for any scripts that need it. But please don't use it. Maybe we should just state this? |
https://lists.debian.org/debian-devel/2024/11/msg00271.html:
Reference please. What was the breakage mentioned there? |
Hi,
I wouldn't say blindly. As @rbalint said, only Debian does it blindly. We do require
I oppose adding more code (and thus complexity) to setuid programs. More code, more bugs.
Counter-recommendation: POSIX recommends that usernames be restricted to the portable character set for portability (and portability in setuid binaries also implies security, if I may add).
We already do that by default. But yeah, I would fully support removing the
Cheers, |
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1074306 This has led to #1103 which sadly has not received any answer until today. Since useradd is going to be even more restrictive soon, Debian's adduser will begin parsing useradd's output shortly if there is no other way to detect useradd saying "invalid user name". Upon second reading, this all sounds like I am trying to blame the failure in adduser's test suite to useradd. That was not my intention. It is, however, too bad that Linux is going to make a step backwards away from international language support. I can live with that. |
For the protocol: useradd is not setuid, at least not in Debian. |
Ahh, yeah, sorry, I didn't check. Anyway, more code, more bugs. And you need to run it as root, which makes bugs worse. And other programs which do be setuid would have to deal with those user names at some point. I vote against adding support for non-portable names. |
Ahh, it's a breakage in Debian, not upstream. Makes sense.
In my case, I haven't replied because I didn't know what to say. I've now commented there.
I think there are limits. User-facing strings being internationalized is good. |
Hi,
it looks like most Linux distributions today "blindly" allow creating user names that contain unicode code points. useradd in this case writes a correctly encoded /etc/passwd in UTF-8 encoding.
I am not sure whether this is a good thing to do before at least doing some checks regarding unicode normalization. For example, useradd will happily create two users étienne and étienne (one with the e aigu character, the other with an e aigu composed of an accent aigu and an e). In my opinion, the second account should be rejected.
Recommended reading about ways to handle this kind of stuff are the PRECIS RFCs, 8264 and 8265 (8266 is not THIS important), and the unicode TR 15 document (https://unicode.org/reports/tr15/) about normalization forms of unicode strings.
I do fully understand if you don't want to unlock this can of worms (I had the same issue with Debian's adduser), but please consider restricting the character set to disallow this otherwise, or at least writing documentation on this matter.
Greetings
Marc
The text was updated successfully, but these errors were encountered: