Convert git commit summary to valid UTF8. #28356
The summary string ends up in the database, and (at least) MySQL & PostgreSQL require valid UTF8 strings. Fixes #28178
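The idea in the description can be sketched with Go's standard library. This is a minimal illustration, not necessarily the exact change in this PR: `sanitizeSummary` is a hypothetical helper name, and the `"?"` replacement string is an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// sanitizeSummary is a hypothetical helper (not taken from the PR).
// It replaces each run of invalid UTF-8 bytes with "?" so the result
// can be stored in a database column that requires valid UTF-8.
// Strings that are already valid are returned unchanged.
func sanitizeSummary(summary string) string {
	return strings.ToValidUTF8(summary, "?")
}

func main() {
	// "\xe9" is "é" in ISO-8859-1, but on its own it is not valid UTF-8.
	raw := "caf\xe9 fix"
	clean := sanitizeSummary(raw)
	fmt.Println(utf8.ValidString(raw), utf8.ValidString(clean), clean)
	// prints: false true caf? fix
}
```

The original bytes are lost where they were invalid, which is the trade-off discussed in the comments that follow.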
Why is validateUTF8 better than trying to convert the strings into UTF-8?
Because you do not know what encoding the string was in, so how could you "convert" it?
We can detect the encoding and convert the content, as we already do for git files, provided the content is long enough.
If you mean guessing the content encoding with something like the chardet package, I think it's feasible but not necessary: the commit will ultimately be displayed by many clients, and many git clients also assume "UTF-8" (I doubt that many clients do encoding guessing). And since a commit message is usually very short, I'm not sure the guessing algorithm would be accurate enough. It's also OK to do the encoding guessing if most people prefer it; I am fine with either approach.
Some references: according to the official git documentation, using a non-UTF-8 encoding is not encouraged, and IMO the Gitea server itself doesn't (and should never) use the encoding config option. Actually, git expects the commit message encoding to be properly stored in the commit object, so that the message can still be displayed as UTF-8 on output.
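The point about the encoding being stored in the commit object can be illustrated with a rough sketch. A raw commit object consists of header lines, a blank line, and then the message, and an optional `encoding` header records a non-UTF-8 message encoding. The helper names below (`declaredEncoding`, `latin1ToUTF8`) and the sample commit are made up for illustration, and only ISO-8859-1 is handled; a real implementation would need a proper charset library.

```go
package main

import (
	"fmt"
	"strings"
)

// declaredEncoding scans the header lines of a raw commit object
// (everything before the first blank line) for the optional
// "encoding" header. It falls back to "UTF-8" when the header is absent.
func declaredEncoding(raw string) string {
	header := raw
	if i := strings.Index(raw, "\n\n"); i >= 0 {
		header = raw[:i]
	}
	for _, line := range strings.Split(header, "\n") {
		if strings.HasPrefix(line, "encoding ") {
			return strings.TrimPrefix(line, "encoding ")
		}
	}
	return "UTF-8"
}

// latin1ToUTF8 converts ISO-8859-1 bytes to UTF-8. Every ISO-8859-1
// byte value equals its Unicode code point, so no lookup table is needed.
func latin1ToUTF8(s string) string {
	var sb strings.Builder
	for i := 0; i < len(s); i++ {
		sb.WriteRune(rune(s[i]))
	}
	return sb.String()
}

func main() {
	// A made-up raw commit object; the hash and identities are placeholders.
	raw := "tree 0123456789abcdef0123456789abcdef01234567\n" +
		"author A U Thor <author@example.com> 1700000000 +0000\n" +
		"committer A U Thor <author@example.com> 1700000000 +0000\n" +
		"encoding ISO-8859-1\n" +
		"\n" +
		"caf\xe9"

	msg := raw[strings.Index(raw, "\n\n")+2:]
	enc := declaredEncoding(raw)
	if enc == "ISO-8859-1" {
		msg = latin1ToUTF8(msg)
	}
	fmt.Println(enc, msg)
	// prints: ISO-8859-1 café
}
```

When the header is missing or wrong, there is nothing reliable to convert from, which is where replacing invalid bytes becomes the fallback.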
* giteaofficial/main:
  Convert git commit summary to valid UTF8. (go-gitea#28356)
  Fix RPM/Debian signature key creation (go-gitea#28352)
  Refactor template empty checks (go-gitea#28351)
The summary string ends up in the database, and (at least) MySQL & PostgreSQL require valid UTF8 strings. Fixes go-gitea#28178 Co-authored-by: Darrin Smart <darrin@filmlight.ltd.uk>
Backport #28356 by @darrinsmart The summary string ends up in the database, and (at least) MySQL & PostgreSQL require valid UTF8 strings. Fixes #28178 Co-authored-by: darrinsmart <darrin@djs.to> Co-authored-by: Darrin Smart <darrin@filmlight.ltd.uk>