Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove convertAsciiToUtf8() #272

Closed
orempel opened this issue Nov 29, 2023 · 4 comments
Closed

Remove convertAsciiToUtf8() #272

orempel opened this issue Nov 29, 2023 · 4 comments

Comments

@orempel
Copy link
Contributor

orempel commented Nov 29, 2023

Moin!

With the change to Arduino v2 "everything" is utf8 ready. Which means that the convertAsciiToUtf8() function is actually doing more harm that good: playing some files result in constant errors in the terminal and no updates in the web interface:

I [21103] Kommando: Nächster Titel
I [21116] info        : Closing audio file
I [21131] info        : buffers freed, free Heap: 87180 bytes
I [21131] info        : Reading file: "/Giraffenaffen/Giraffenaffen 1/03-Hätt’ ich Dich heut’ erwartet (feat. Roman Lob).mp3"
I [21185] info        : MP3Decoder has been initialized, free Heap: 59360 bytes , free stack 2948 DWORDs
N [21189] '/Giraffenaffen/Giraffenaffen 1/03-Hätt’ ich Dich heut’ erwartet (feat. Roman Lob).mp3' wird abgespielt (3 von 17)D [21193] ws[/ws][1] error(1002): Invalid UTF-8 in text frame":3,"numberOᆳ�xV��K�?Q
D [21210] ws[/ws][1] disconnect
...
D [27104] ws[/ws][2] connect
D [27199] ws[/ws][2] error(1002): Invalid UTF-8 in text frame"],"active":ᆳ�xV����?0
D [27200] ws[/ws][2] disconnect
D [32940] ws[/ws][3] connect
D [33031] ws[/ws][3] error(1002): Invalid UTF-8 in text frame"],"active":ᆳ�xV����?0
D [33031] ws[/ws][3] disconnect
D [38878] ws[/ws][4] connect
D [38971] ws[/ws][4] error(1002): Invalid UTF-8 in text frame"],"active":ᆳ�xV����?�
D [38972] ws[/ws][4] disconnect

I will create a PR for that, but just want to ask if it make sense to rip out convertFilenameToAscii() as well. That function was updated for the utf8 change and is now just copying c-strings around.

Thanks!

@tueddy
Copy link
Collaborator

tueddy commented Nov 30, 2023

convertAsciiToUtf8() is obsolete and can be removed. Copying arround was just for smooth porting to Arduino 2.
But i think removing does not fix error(1002): Invalid UTF-8 in text frame"],"active":ᆳ�xV����?0 ?

@orempel
Copy link
Contributor Author

orempel commented Nov 30, 2023

I've already checked that, removing convertAsciiToUtf8() fixes the problem.

And I've found out that the character ( ´ ) that looks like a single quote ( ' ) is actually U+2019. Which has/had its own code 0x92 in codepage windows-1252. (It seems the title name was created on a system using that codepage and/or someone who really likes utf8 codes).

U+2019 gets utf8 encoded as 0xe2 0x80 0x99. And the 0x99 (which is a 'Ö' in cp437) gets converted by convertAsciiToUtf8() into 0xc3 0x96. The result 0xe2 0x80 0xc3 0x96 in an invalid utf8 sequence, which crashes the webinterface. Q.E.D.

Thank you wikipedia for this (unwanted!) knowledge about codepages...

Back to the topic of fixing this:
There are three calls to convertAsciiToUtf8(), one of them is done inside Web_DumpSdToNvs().
That function handles restoring a sd-card backup back to the nvs. I'm currently not sure if that place needs backwards compability code or not.

My current version is this: orempel@f94377f

@tueddy
Copy link
Collaborator

tueddy commented Nov 30, 2023

LGTM!

For backward compatibility:
Old master used Arduino 1 FS, it should be ANSI/OEM-852. For current master we have Arduino 2 FS with utf-8.
We now write a UTF-8 BOM marker into backup file, older backup without BOM, see
https://forum.espuino.de/t/nvs-kaputt-und-probleme-mit-umlauten-bei-backup-txt/2202/13

For correct import from old version we should make a character conversion from OEM-852 to UTF-8, this might be the reason for the line in Web_DumpSdToNvs(). Maybe this version from Arduino 1 branch does the right job for import?

inline void convertAsciiToUtf8(String asciiString, char *utf8String) {

Another possibility would be to leave it as it is and the user has to save the backup once, e.g. with Notepad++ with new encoding

tueddy added a commit that referenced this issue Dec 4, 2023
@tueddy
Copy link
Collaborator

tueddy commented Dec 4, 2023

Problem was reproducable after uploading a file like "Ägypten.mp3" via WebUI.
After the upload, the file was displayed incorrectly and could not be played. Fixed in 3e11c45

Feel free to make another suggestion for the replacement of convertFilenameToAscii().
Thanks for contribution!

tueddy pushed a commit that referenced this issue Feb 8, 2024
This fixes second part of #272
@tueddy tueddy closed this as completed Feb 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants