Valid email addresses are rejected with The email address is invalid, entirely preventing the use of gitea #19852

Open
throwaway-2e119d4c79e4b162f4 opened this issue May 31, 2022 · 24 comments

@throwaway-2e119d4c79e4b162f4

Description

Valid email addresses are rejected, which prevents creating the admin user or any other user. The same valid addresses are also rejected if an account is created with a temporary address and the user then tries to replace it with the valid one.

Example address that is not accepted, despite being perfectly valid under (among other standards):

Without quotes, local-parts may consist of any combination of
   alphabetic characters, digits, or any of the special characters

      ! # $ % & ' * + - / = ?  ^ _ ` . { | } ~

   period (".") may also appear, but may not be used to start or end the
   local part, nor may two or more consecutive periods appear.  Stated
   differently, any ASCII graphic (printing) character other than the
   at-sign ("@"), backslash, double quote, comma, or square brackets may
   appear without quoting.  If any of that list of excluded characters
   are to appear, they must be quoted.  Forms such as

      user+mailbox@example.com

      customer/department=shipping@example.com

      $A12345@example.com

      !def!xyz%abc@example.com

      _somename@example.com

   are valid and are seen fairly regularly, but any of the characters
   listed above are permitted.  In the context of local parts,
   apostrophe ("'") and acute accent ("`") are ordinary characters, not
   quoting characters.  Some of the characters listed above are used in
   conventions about routing or other types of special handling by some
   receiving hosts.  But, since there is no way to know whether the
   remote host is using those conventions or just treating these
   characters as normal text, sending programs (and programs evaluating
   address validity) must simply accept the strings and pass them on.

Example:

_@anydomain.com
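
For anyone who wants to check a local part against the rules quoted above, here is a minimal sketch in Go. It only covers the unquoted form described in the quote (allowed characters, no leading or trailing period, no consecutive periods). The function name validUnquotedLocalPart is made up for illustration, and treating an empty local part as invalid is my assumption, not something the quoted text states.

```go
package main

import (
	"fmt"
	"strings"
)

// specials lists the non-alphanumeric characters that the quoted text
// permits in an unquoted local part (including the period).
const specials = "!#$%&'*+-/=?^_`.{|}~"

// validUnquotedLocalPart is a hypothetical helper. It reports whether s
// uses only letters, digits and the listed specials, does not start or
// end with a period, and has no two consecutive periods.
// Assumption: an empty local part is treated as invalid.
func validUnquotedLocalPart(s string) bool {
	if s == "" || strings.HasPrefix(s, ".") || strings.HasSuffix(s, ".") || strings.Contains(s, "..") {
		return false
	}
	for _, r := range s {
		isAlnum := (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z') || (r >= '0' && r <= '9')
		if !isAlnum && !strings.ContainsRune(specials, r) {
			return false
		}
	}
	return true
}

func main() {
	samples := []string{
		"user+mailbox", "customer/department=shipping", "$A12345",
		"!def!xyz%abc", "_somename", "_", ".start", "two..dots", "a@b",
	}
	for _, lp := range samples {
		fmt.Printf("%-32q %v\n", lp, validUnquotedLocalPart(lp))
	}
}
```

All five local parts from the quoted examples, and the single underscore from my example address, pass this check.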

Gitea Version

v1.16.8

Can you reproduce the bug on the Gitea demo site?

Yes

Log Gist

No response

Screenshots

No response

Git Version

No response

Operating System

No response

How are you running Gitea?

not relevant

Database

MySQL

@throwaway-2e119d4c79e4b162f4
Author

This issue has been raised before, and closed as completed, but clearly, it is not

#17029 (comment)

#17029 (comment)

I don't think it makes sense for gitea to validate emails at this point. AFAIK these emails serve two purposes:

    Send email notifications
    Associate the user with the emails in git histories

https://davidcel.is/2012/09/06/stop-validating-email.html

https://medium.com/hackernoon/the-100-correct-way-to-validate-email-addresses-7c4818f24643

The 100% correct way

Send your users an activation email. (That’s a bold full-stop for effect.)

If you are not going to send a validation email, then do not try to validate the email address; don't reject valid email addresses you haven't validated, because you cannot know they are invalid if you haven't tested their validity by verifying they are valid...

@throwaway-2e119d4c79e4b162f4
Author

Just bumping this to ensure it was seen; github (the platform itself) marked the issue as invalid, hid it, and locked my account for two weeks as a result

@6543

This comment was marked as outdated.

@lunny
Member

lunny commented Jun 14, 2022

I think so, yes. Recent versions restrict email addresses more tightly than any RFC does. For your example, both user+mailbox@example.com and customer/department=shipping@example.com are valid email addresses.

@wxiaoguang
Contributor

wxiaoguang commented Jun 14, 2022

Plus since 2012 you can use international characters above U+007F, encoded as UTF-8.

https://stackoverflow.com/questions/3844431/are-email-addresses-allowed-to-contain-non-alphanumeric-characters

(just FYI, not sure whether Gitea should support it. there could be some comments or documents about supporting or not).

@6543
Member

6543 commented Jun 14, 2022

I would not allow all UTF-8, but I would allow more ASCII chars like _

@lunny
Member

lunny commented Jun 14, 2022

Currently the first character is restricted to [0-9a-zA-Z], which is why $A12345@example.com, !def!xyz%abc@example.com and _somename@example.com are not considered valid.

@42wim
Member

42wim commented Jun 18, 2022

I've made a PR here: https://gitea.com/go-chi/binding/pulls/12 which uses the stdlib net/mail (which is RFC 5322 compliant) for validation
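
A rough sketch of what validating with the stdlib looks like, for reference. This is not the code from the PR. The helper name isBareAddress is made up, and the equality check is my own addition: mail.ParseAddress also accepts forms like "Name <user@example.com>", so comparing the parsed address with the input keeps only bare addresses.

```go
package main

import (
	"fmt"
	"net/mail"
)

// isBareAddress is a hypothetical helper. It reports whether s parses
// with net/mail and is exactly the address itself, with no display name
// or angle brackets around it.
func isBareAddress(s string) bool {
	addr, err := mail.ParseAddress(s)
	return err == nil && addr.Address == s
}

func main() {
	samples := []string{
		"user+mailbox@example.com",
		"customer/department=shipping@example.com",
		"$A12345@example.com",
		"!def!xyz%abc@example.com",
		"_somename@example.com",
		"_@anydomain.com",
		"Bob <bob@example.com>", // parses, but is not a bare address
		"no-at-sign",            // does not parse
	}
	for _, s := range samples {
		fmt.Printf("%-45q %v\n", s, isBareAddress(s))
	}
}
```

All of the addresses from the issue description are accepted this way.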

@lunny
Member

lunny commented Jun 18, 2022

That function allows UTF-8 characters, which Gitea dropped support for.

@42wim
Member

42wim commented Jun 19, 2022

Ok, closed PR

@zeripath
Contributor

zeripath commented Jun 19, 2022

This was my worry when the email character restrictions were merged. There is really only one thing we should not be allowing outside of RFC5322:

  • An initial -

but even that is really not our responsibility - it's only a problem with those running sendmail commands and the way they've configured the command. It should probably be an optional setting.

I think that UTF-8 should be allowed - at least optionally - we're no longer in the 80s and whilst the majority of email addresses are still using only basic ASCII it's not really right to still be restricting in this matter. If there are issues with ambiguous characters (and there would be) we can fix the display of those (and in fact #19990 would provide the mechanism for doing this.)

@oott123

oott123 commented Aug 1, 2022

How about user@some-domain.com? Dashes in domain names are very common.

@theAkito

theAkito commented Dec 3, 2022

Many websites get this one wrong, because web developers are too lazy. _@mail.com is a perfectly valid e-mail address.

@zeripath
Contributor

zeripath commented Dec 3, 2022

This feels like something that is going to go round and round in circles.

The restriction is in:

var emailRegexp = regexp.MustCompile("^[a-zA-Z0-9.!#$%&'*+-/=?^_`{|}~]*@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$")

// ValidateEmail check if email is a allowed address
func ValidateEmail(email string) error {
	if len(email) == 0 {
		return nil
	}

	if !emailRegexp.MatchString(email) {
		return ErrEmailCharIsNotSupported{email}
	}

	if email[0] == '-' {
		return ErrEmailInvalid{email}
	}

	if _, err := mail.ParseAddress(email); err != nil {
		return ErrEmailInvalid{email}
	}

	// TODO: add an email allow/block list
	return nil
}
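
To see what the pattern above accepts on its own, before the leading - check and mail.ParseAddress run, it can be probed in isolation. This is a minimal sketch: the regexp is copied from the snippet, while matchesQuotedPattern and the sample inputs are made up for illustration. One thing worth knowing when reading the pattern: inside the character class, +-/ is a range (from + to /), so it also covers the comma, and the * lets an empty local part through this stage.

```go
package main

import (
	"fmt"
	"regexp"
)

// Pattern copied from the snippet above.
var emailRegexp = regexp.MustCompile("^[a-zA-Z0-9.!#$%&'*+-/=?^_`{|}~]*@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$")

// matchesQuotedPattern is a hypothetical helper that runs only the
// regexp stage of the validation, nothing else.
func matchesQuotedPattern(email string) bool {
	return emailRegexp.MatchString(email)
}

func main() {
	samples := []string{
		"_@anydomain.com",
		"user@some-domain.com",
		"müller@example.com", // non-ASCII local part
		"a,b@example.com",    // comma falls inside the +-/ range
		"user@-bad.example",  // domain label starting with a dash
	}
	for _, s := range samples {
		fmt.Printf("%-25q %v\n", s, matchesQuotedPattern(s))
	}
}
```

The later checks in ValidateEmail may still reject input that passes this stage, so this only shows where the character restriction itself sits.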

If you want to change/relax this you will need to make a PR that changes the code here.

You will also need to consider:

  1. Sendmail - email addresses are sent as arguments to the command line - AFAIU disallowing an initial - should be all that is needed.
  2. Any other mail system - could arbitrary unicode characters break things?
  3. Do we need to add ambiguity checking to the display of email addresses?

Code speaks louder than words. Make the PR.

@lunny
Member

lunny commented Dec 7, 2022

Just a note - gitea in 📉

https://opensource.com/business/16/6/bad-practice-foss-projects-management

  • Seeing contributors as an annoyance
  • Letting people only do the grunt work
  • Not valorizing small contributions
  • Making people who are not native English speakers feel like outsiders (or flat out discriminating and eliminating their ability to use gitea in the first place, based on heuristics such as the language they speak/write: [image])

#22041 was in adherence with https://github.com/go-gitea/gitea/blob/main/CONTRIBUTING.md#code-review

I would recommend anyone that gitea does not feel is a desirable user check out Gogs: https://gogs.io/ <- it's what remained when the unhealthy contributors packed up and took off

Hi, as a comparison: Gmail has a very strict restriction on email names (only letters and digits), and GitHub also doesn't allow unicode characters. I also found another reference to support that: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/input/email#basic_validation.
Please focus on the issue itself and don't say anything else, otherwise you will be banned.

@zeripath
Contributor

zeripath commented Dec 8, 2022

I think we should be careful here to note that git will allow these extremely weird email addresses and Gitea will just use them.

So by having this super-restrictive pattern we're not preventing weird and ambiguous email addresses from appearing in Gitea - just preventing the user from saying that one belongs to them.

Further, the potential problem of an email address being ambiguous/confusable with another user's isn't a much worse issue, as Gitea will not show the email address and will show the user that it maps to instead. Thus unicode ambiguity of email addresses should only affect the user to whom the ambiguous email address belongs.

Next we should consider if there are potential sec issues by allowing arbitrary email addresses.

  1. Sendmail - preventing an initial - seems to be all that is needed.
  2. We don't appear to display the email in any email templates - so I don't think there's a problem there.
  3. Otherwise the only places where the email is shown are the user's own settings page and the login form when they try to log in with it.

As far as I can see the only person affected by Gitea allowing users to register their own weird email address is the user itself. Thus apart from blocking the initial - I can't see a good reason for further restriction beyond RFC5322.

I might be missing something though - does anyone have any other ideas?
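
To make the proposal concrete, here is a rough sketch of "RFC 5322 plus an optional block on the initial -". It is not Gitea code: the Policy type, its AllowLeadingDash field and the error messages are all made up for illustration, and the RFC 5322 part is simply delegated to net/mail.

```go
package main

import (
	"errors"
	"fmt"
	"net/mail"
	"strings"
)

// Policy is a hypothetical settings struct; nothing with this name
// exists in Gitea.
type Policy struct {
	// AllowLeadingDash turns off the sendmail-motivated restriction.
	AllowLeadingDash bool
}

// Validate accepts any bare address that net/mail can parse, and
// rejects an initial '-' unless the policy allows it.
func (p Policy) Validate(email string) error {
	addr, err := mail.ParseAddress(email)
	if err != nil || addr.Address != email {
		return errors.New("not a bare RFC 5322 address")
	}
	if !p.AllowLeadingDash && strings.HasPrefix(email, "-") {
		return errors.New("initial '-' is not allowed by policy")
	}
	return nil
}

func main() {
	strict := Policy{}
	relaxed := Policy{AllowLeadingDash: true}
	for _, e := range []string{"_@anydomain.com", "-dash@example.com"} {
		fmt.Printf("%-22q strict=%v relaxed=%v\n", e, strict.Validate(e), relaxed.Validate(e))
	}
}
```

With a switch like this, the restriction becomes a deployment choice for admins who pass addresses to a sendmail command, and everyone else gets plain RFC 5322 behaviour.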

@mqudsi
Contributor

mqudsi commented Jan 8, 2023

The myth about the no leading - restriction is, so far as I can determine, from people that aren't aware of how to interact with unix command line utilities.

Under both Linux and BSD, sendmail, like pretty much all other CLI utilities, should be invoked with a -- argument after the flags you want sendmail executed with. All arguments supplied after -- are parsed as literal payload values and are not considered flags/switches to sendmail.

This should also be the case with all the other popular sendmail alternative MTAs like postfix and the rest.

Sure, it's possible that some ancient version of a popular MTA didn't support -- and this workaround was required (though basic unix utilities have had to support this from forever ago in order for you to create or delete a file name - so it's not some really advanced wizardry) but an MTA that old should probably not be connected to the internet as it surely has bigger problems than not supporting the venerable -- syntax (cough security vulnerabilities cough).
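
For illustration, this is roughly what the -- convention looks like when the command is built from Go. It is a sketch only: the binary name "sendmail" and the -i flag are assumptions picked for the example, sendmailCommand is a made-up helper, and the command is built but never run.

```go
package main

import (
	"fmt"
	"os/exec"
)

// sendmailCommand is a hypothetical helper that builds (but does not
// run) a sendmail invocation. Everything after "--" is meant to be read
// as a recipient, so an address with a leading '-' is not parsed as a
// flag. The binary name and the "-i" flag are assumptions.
func sendmailCommand(recipients ...string) *exec.Cmd {
	args := append([]string{"-i", "--"}, recipients...)
	return exec.Command("sendmail", args...)
}

func main() {
	cmd := sendmailCommand("-weird@example.com", "user@example.com")
	// Print the argument vector instead of executing anything.
	fmt.Println(cmd.Args)
}
```

Since exec.Command passes arguments directly and does not go through a shell, the only parsing that matters here is the one the MTA does on its own argument list.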

@delvh
Member

delvh commented Jan 8, 2023

from people that aren't aware of how to interact with unix command line utilities

That is exactly the problem: If we allow leading -, there will always be someone who somehow forgets to prepend -- to his sendmail command.
While Gitea should handle all its sendmail code correctly, that is not necessarily the case for all subsequent tools we have no influence over that for example query the email from the API.


I don't even know why you would want an email address that starts with -.
It might be allowed but the only benefits it grants are "trolling" insecure applications and annoying senders who want to send you an email.

@theAkito

theAkito commented Jan 8, 2023

That is exactly the problem: If we allow leading -, there will always be someone who somehow forgets to prepend -- to his sendmail command.
While Gitea should handle all its sendmail code correctly, that is not necessarily the case for all subsequent tools we have no influence over that for example query the email from the API.

Non-argument. You can say that about any option & any command. If you don't know what you are doing, you can do everything the wrong way.

Exhibit A

The find command.

If you don't know, that this antiquated tool from pre-historic stone ages is using single dashes for long options, you are just as lost. Yet, this tool is (sadly) used all over the world & people deal with it.
So, if they can deal with this pre-historic piece made by a caveman, they can deal with a simple inbetween double dash.


I don't even know why you would want an email address that starts with -.
It might be allowed but the only benefits it grants are "trolling" insecure applications and annoying senders who want to send you an email.

Non-argument. Even more so, than the last one.

Exhibit A

"I don't even know why you would want a nickname like 'delvh', like why? Why not use your real name or a normal English word?" - someone could ask.

If something is not forbidden, why make it "undesired" by convention? Such actions are usually much more confusing than an unusual case within the frame of what is allowed, because then you have millions of conventions, which are basically unwritten "rules" one "must" (not really) follow.

Therefore, if it is allowed & according to the spec, it should work. Simple as that.

If someone does not use a double dash or quotes or whatever options there are & then blames the software design for it, that is pretty much like saying "well, someone could type ls --l instead of ls -l, so we must make it work with a double dash, as well" or something along those lines.

@mqudsi
Contributor

mqudsi commented Jan 9, 2023

That is exactly the problem: If we allow leading -, there will always be someone who somehow forgets to prepend -- to his sendmail command.
While Gitea should handle all its sendmail code correctly, that is not necessarily the case for all subsequent tools we have no influence over that for example query the email from the API.

Just a reminder that this email filter doesn't really have anything to do with weird emails making their way into the database. Email addresses still get pulled in from git commits and stored in the db as-is, without going through this filter.

Regardless, "someone might not know how to handle this" isn't really Gitea's problem.

@gilbertoca

gilbertoca commented Feb 28, 2024

@lunny @Zettat123
I'm here after jumping from 1.19.4 to 1.20.0.
I can't change anything on Edit User Account because the e-mail gilberto.bispo@ibrowser.net.br is reported as invalid.
Does it have anything to do with #27457?
Our users come from SMTP servers with different domains.

@lunny
Member

lunny commented Feb 29, 2024

@lunny @Zettat123 I'm here after jumping from 1.19.4 to 1.20.0. I can't change anything on Edit User Account because the e-mail gilberto.bispo@ibrowser.net.br is reported as invalid. Does it have anything to do with #27457? Our users come from SMTP servers with different domains.

The email you mentioned is considered valid, at least on the main branch. What's the error when you edit it?
P.S. If the email hasn't been activated, users can only log in with their username, not their email address.

@gilbertoca

@lunny @Zettat123 I blurred the username field to protect the user and entered another e-mail. I've tested several e-mail IDs with the same domain and got the same error message.

Screenshot_20240229_105239

@lunny
Member

lunny commented Feb 29, 2024

OK. I think I only tested a plain user, not an SMTP user. I will test your use case.
