Arabic text shows as question marks (?) #12094
Comments
This is going to be a problem with your database collation and character set.
@zeripath
OK, as a holding measure whilst we think about this:
AFAICS you can set the collation to a UTF-8 collation to make varchar and nvarchar essentially the same. My concern with migrating to NVARCHAR is that we would essentially double the size of our data, when setting the collation properly would allow better encodings to be used by the db. @lunny what are your thoughts?
In fact, I have always thought xorm should provide an option to consider
We are having the same issue with emoticons: they become ??. This problem cannot be reproduced on https://try.gitea.io/Luke2/Repo2/pulls/1. What kind of database is try.gitea.io using?
@LukeOwlclaw what database are you using? If, as I suspect, you are using MySQL, you need to change to utf8mb4 as the default charset and use
@zuhairamahdi did changing the collation to a UTF-8 enabled collation solve your problem?
I don't think it is possible to change the collation to UTF-8, as we use SQL Server 2012, not 2019.
From Googling, I think you might want to try:
@zeripath We are using SQL Server 2016, so we are facing the same problem as @zuhairamahdi.
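For anyone wondering why Arabic turns into one ? per letter while an emoji turns into ??, here is a rough Go model of the symptom. It is only an illustration, not SQL Server's actual conversion routine: it assumes the text arrives as UTF-16 and that the varchar column uses a single-byte Latin-1 style code page, and the function name `toSingleByte` is made up for this sketch.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf16"
)

// toSingleByte is a hypothetical model of a lossy conversion into a
// single-byte code page (Latin-1 is assumed here). Every UTF-16 code unit
// that the code page cannot represent is replaced with '?'.
func toSingleByte(s string) string {
	var b strings.Builder
	for _, unit := range utf16.Encode([]rune(s)) {
		if unit <= 0xFF {
			b.WriteRune(rune(unit)) // representable: kept as-is
		} else {
			b.WriteByte('?') // not representable: lossy replacement
		}
	}
	return b.String()
}

func main() {
	arabic := "\u0645\u0631\u062d\u0628\u0627" // five Arabic letters
	emoji := "\U0001F600"                      // one emoji, two UTF-16 code units
	fmt.Println(toSingleByte("plain text"))
	fmt.Println(toSingleByte(arabic))
	fmt.Println(toSingleByte(emoji))
}
```

In this model the emoji yields two question marks because it sits outside the Basic Multilingual Plane and is encoded as a surrogate pair, which matches the ?? reported above.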
I tried it on an instance in my local environment and it did not fix the issue.
OK, I think we need to add an option to https://gitea.com/xorm/xorm to allow us to default to nvarchar as required. Looking at
Of course, make a backup before you do this.
I did a quick test and changed the data type for comments with
    ALTER TABLE [comment] ALTER COLUMN [content] nvarchar(max)
Everything worked fine afterwards. I'm just not sure whether the change might interfere with a future database update.
For us this change seems sufficient. We want to keep things like issue titles, user names and release notes in plain English, so there's little reason to encourage users to include fancy characters. In other cases, such as (commit) hashes, email addresses and access tokens, using UTF-16 makes little sense.
If helpful for anyone, here's a T-SQL command for generating a script for converting all
    SELECT 'ALTER TABLE [' + TABLE_NAME + '] ALTER COLUMN [' + COLUMN_NAME + '] nvarchar('
        + IIF(CHARACTER_MAXIMUM_LENGTH=-1,'max',CAST(CHARACTER_MAXIMUM_LENGTH AS varchar)) + ')'
        + IIF(IS_NULLABLE='NO',' NOT NULL','')
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE DATA_TYPE='varchar'
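The same statement-building logic can be sketched in Go, in case someone wants to generate the script from application code instead of T-SQL. This is a minimal sketch, not anything Gitea ships: the `Column` struct and `alterToNVarchar` function are hypothetical names, the `user.name` row is an invented example, and identifiers are not escaped beyond the square brackets.

```go
package main

import (
	"fmt"
	"strconv"
)

// Column is a hypothetical holder for the INFORMATION_SCHEMA.COLUMNS
// fields that the T-SQL query above reads.
type Column struct {
	Table    string
	Name     string
	MaxLen   int  // CHARACTER_MAXIMUM_LENGTH; -1 means varchar(max)
	Nullable bool // true when IS_NULLABLE is 'YES'
}

// alterToNVarchar builds one ALTER statement, mirroring the IIF logic of
// the query: -1 maps to "max", and NOT NULL is appended when required.
func alterToNVarchar(c Column) string {
	length := "max"
	if c.MaxLen != -1 {
		length = strconv.Itoa(c.MaxLen)
	}
	stmt := fmt.Sprintf("ALTER TABLE [%s] ALTER COLUMN [%s] nvarchar(%s)", c.Table, c.Name, length)
	if !c.Nullable {
		stmt += " NOT NULL"
	}
	return stmt
}

func main() {
	cols := []Column{
		{Table: "comment", Name: "content", MaxLen: -1, Nullable: true},
		{Table: "user", Name: "name", MaxLen: 255, Nullable: false}, // invented example row
	}
	for _, c := range cols {
		fmt.Println(alterToNVarchar(c))
	}
}
```

As with the T-SQL version, review the generated statements before running them, and take a backup first.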
Thanks @PaulBol. I hope there is a way to do it automatically, rather than manually inspecting the tables for columns that need to be changed from
Note that the
I have said that. I think xorm should provide a special option for MSSQL:
    engine, err := NewEngineWithParams(driver, connStr, map[string]string{
        "DEFAULT_VARCHAR": "NVARCHAR",
    })
Then all new columns will be created with
Of course, for existing columns, you have to convert them manually like above.
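To make the proposal concrete, here is a small stand-alone sketch of how a dialect might consult such a parameter when choosing a column type. It is not xorm's implementation: `columnType` is a hypothetical helper, and the only thing taken from the comment above is the `DEFAULT_VARCHAR` key with the value `NVARCHAR`.

```go
package main

import (
	"fmt"
	"strings"
)

// columnType is a hypothetical helper, not part of xorm. It picks the SQL
// type for a string column of the given length, switching to NVARCHAR only
// when the params map asks for it via the proposed DEFAULT_VARCHAR key.
func columnType(params map[string]string, length int) string {
	base := "VARCHAR"
	if v, ok := params["DEFAULT_VARCHAR"]; ok && strings.EqualFold(v, "NVARCHAR") {
		base = "NVARCHAR"
	}
	return fmt.Sprintf("%s(%d)", base, length)
}

func main() {
	// Without the option, the existing behaviour is unchanged.
	fmt.Println(columnType(nil, 255))
	// With the option, newly created columns would use the Unicode type.
	fmt.Println(columnType(map[string]string{"DEFAULT_VARCHAR": "NVARCHAR"}, 255))
}
```

Keeping VARCHAR as the default means instances that never set the option see no change.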
So I've looked at using that SQL and putting it directly into a gitea convert command, but unfortunately it would require the constraints to be dropped and re-added over time.
I have sent a PR to xorm and will send a PR to gitea after that is merged. Please review https://gitea.com/xorm/xorm/pulls/1741
I have sent a PR #12269 to try to resolve this issue.
@zuhairamahdi It only applies to new instances, or to new columns on existing instances. You have to convert old columns manually.
Description
Hi,
we just noticed that when we create a new issue or pull request all the Arabic text transforms to question marks.
I think that we can fix this by changing the database from varchar to nvarchar in MS SQL.