Editor removes BOM #28743

Closed
dsteinbe11 opened this issue Jan 9, 2024 · 3 comments · Fixed by #28935

@dsteinbe11

Description

If a file is edited with the built-in editor, the BOM is removed after the commit.

Before the edit the file is UTF-8 with BOM; after the commit it's plain UTF-8 without BOM.

I have reproduced this issue with your demo site, see:
https://try.gitea.io/dsteinbe/editor-bom-issue
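To confirm the symptom independently of any editor, one can check whether a file's bytes still begin with the UTF-8 BOM (`EF BB BF`). The following is only a minimal sketch; `hasUTF8BOM` is a hypothetical helper name and not part of Gitea:

```go
package main

import (
	"bytes"
	"fmt"
)

// utf8BOM is the three-byte UTF-8 encoding of U+FEFF.
var utf8BOM = []byte{0xEF, 0xBB, 0xBF}

// hasUTF8BOM reports whether content starts with the UTF-8 BOM.
// Hypothetical helper for illustration only.
func hasUTF8BOM(content []byte) bool {
	return bytes.HasPrefix(content, utf8BOM)
}

func main() {
	// Simulated file content before and after the edit described above.
	before := append([]byte{0xEF, 0xBB, 0xBF}, []byte("hello\n")...)
	after := []byte("hello\n")

	fmt.Println(hasUTF8BOM(before), hasUTF8BOM(after)) // prints "true false"
}
```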

Gitea Version

1.21.3

Can you reproduce the bug on the Gitea demo site?

Yes

Log Gist

No response

Screenshots

No response

Git Version

2.40.0, Wire Protocol Version 2 Enabled

Operating System

Windows

How are you running Gitea?

service

Database

SQLite

@crackedmind

yeh, it's annoying

@silverwind
Member

Maybe it's because we don't set the `preserveBOM` option on Monaco's `getValue`:

https://microsoft.github.io/monaco-editor/typedoc/interfaces/editor.ITextModel.html#getValue

silverwind added a commit to silverwind/gitea that referenced this issue Jan 26, 2024
The ToUTF8* functions were stripping the BOM, while a BOM is actually
valid in UTF-8, so the stripping must be optional. This does:

- Add an options struct to all ToUTF8* functions that by default will
  strip the BOM to preserve existing behaviour
- Remove ToUTF8 function, it was dead code
- Rename ToUTF8WithErr to ToUTF8
- Preserve BOM in Monaco Editor

Fixes: go-gitea#28743
Related: go-gitea#6716
silverwind added a commit to silverwind/gitea that referenced this issue Jan 26, 2024
The ToUTF8* functions were stripping the BOM, while a BOM is actually
valid in UTF-8, so the stripping must be optional. This does:

- Add an options struct to all ToUTF8* functions that by default will
  strip the BOM to preserve existing behaviour
- Remove ToUTF8 function, it was dead code
- Rename ToUTF8WithErr to ToUTF8
- Preserve BOM in Monaco Editor
- Remove an unnecessary newline in the textarea value. Browsers seem to
  ignore it, but it's better not to rely on this behaviour.

Fixes: go-gitea#28743
Related: go-gitea#6716
silverwind added a commit that referenced this issue Jan 27, 2024
The `ToUTF8*` functions were stripping the BOM, while a BOM is actually
valid in UTF-8, so the stripping must be optional depending on use case.
This does:

- Add an options struct to all `ToUTF8*` functions that by default will
strip the BOM to preserve existing behaviour
- Remove `ToUTF8` function, it was dead code
- Rename `ToUTF8WithErr` to `ToUTF8`
- Preserve BOM in Monaco Editor
- Remove an unnecessary newline in the textarea value. Browsers seem to
ignore it, but it's better not to rely on this behaviour.

Fixes: #28743
Related: #6716 which seems to
have once introduced a mechanism that strips and re-adds the BOM, but
from what I can tell, this mechanism was removed at some point after
that PR.
GiteaBot pushed a commit to GiteaBot/gitea that referenced this issue Jan 27, 2024
silverwind added a commit that referenced this issue Jan 27, 2024
Backport #28935 by @silverwind

Co-authored-by: silverwind <me@silverwind.io>
henrygoodman pushed a commit to henrygoodman/gitea that referenced this issue Jan 31, 2024
DennisRasey pushed a commit to DennisRasey/forgejo that referenced this issue Jan 31, 2024
Backport #28935 by @silverwind

Co-authored-by: silverwind <me@silverwind.io>
(cherry picked from commit b8e6cff)
silverwind added a commit to silverwind/gitea that referenced this issue Feb 20, 2024
Automatically locked because of our CONTRIBUTING guidelines

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 29, 2024