2.10.1 can not read files with umlauts in path #265

Closed
tipuraneo opened this issue Sep 11, 2017 · 20 comments

Comments

@tipuraneo

tipuraneo commented Sep 11, 2017

After updating from 2.9 (unsure) to 2.10.1, pngquant cannot read files with umlauts in the path:

C:\Program Files\pngquant>"Drag PNG here to reduce palette automatically.bat" "C:\Users\user.DOM\Zwischen\Pläne\Image.png"
C:\Users\user.DOM\Zwischen\Pl├ñne\Image.png: error: cannot open C:\Users\user.DOM\Zwischen\Pl├ñne\Image.png for reading
There were errors quantizing 1 file out of a total of 1 file.
C:\Users\user.DOM\Zwischen\Pl├ñne\Image.png: error: cannot open C:\Users\user.DOM\Zwischen\Pl├ñne\Image.png for reading
There were errors quantizing 1 file out of a total of 1 file.

Executing

C:\Program Files\pngquant>"Drag PNG here to reduce palette automatically.bat" "C:\Users\user.DOM\Zwischen\Plaene\Image.png"

works as expected.

@kornelski
Owner

Thanks for the report. The fix is coming.

@tipuraneo
Author

When will the fix/next version be released?

@kornelski
Owner

kornelski commented Oct 19, 2017

So it turned out the problem was worse than I expected. I thought the bug was caused by a new argument parsing library, but it wasn't. The cause is a fundamental fuck-up in Microsoft's standard library.

In 2.10 I've switched from GCC/mingw compiler to Microsoft Visual Studio, naively assuming it would support Windows better than a "unix" compiler. But it does not! Microsoft's C compiler does not support Unicode paths in standard C functions. The whole path handling in Microsoft's toolchain is a terrible mess.

So Unicode paths won't work on Windows probably until I release pngquant 3.x (replacing file I/O with Rust's which uses the lovely WTF-8 encoding for Windows), which may be in a couple of months.

@jibsen
Contributor

jibsen commented Nov 3, 2017

I think this might be an issue with Rust rather than Windows. The mingw-w64 builds I use work fine with paths containing non-ascii characters (and also include the wildcard expansion from the mingw runtime which is missing in the recent Rust builds).

Edit: Actually, a quick test program seems to suggest Rust handles filenames with non-ascii characters fine.

@kornelski
Owner

kornelski commented Nov 3, 2017

AFAIK pure Rust code handles Unicode filenames fine. MinGW makes Unicode filenames work by adding a compatibility layer to C stdlib that makes char * paths UTF-8.

The problem here is that Windows/MSVC itself doesn't support UTF-8 in C stdlib. AFAIK MSVC's C stdlib functions are just an unworkable legacy codepage horror. It expects programs to use Windows-specific methods to handle non-ASCII paths, but pngquant doesn't have Windows-specific code.

I plan to avoid this mess by using 100% pure Rust for file I/O. Rust's stdlib handles filenames better than Microsoft's C stdlib.
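
As a rough illustration of the point about Rust's stdlib (a minimal sketch, not pngquant's actual code): `std::fs` takes a `Path`, so the same call is expected to work for a directory named "Pläne" without any platform-specific code. The helper names `read_image_bytes` and `make_sample` and the scratch directory name are made up for this example.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Reads a whole file. The path may contain non-ASCII characters.
fn read_image_bytes(path: &Path) -> io::Result<Vec<u8>> {
    fs::read(path)
}

/// Hypothetical helper: creates `<base>/Pläne/Image.png` holding only the
/// 8-byte PNG signature, so there is something to read back.
fn make_sample(base: &Path) -> io::Result<PathBuf> {
    let dir = base.join("Pl\u{e4}ne"); // "Pläne"
    fs::create_dir_all(&dir)?;
    let file = dir.join("Image.png");
    fs::write(&file, b"\x89PNG\r\n\x1a\n")?;
    Ok(file)
}

fn main() -> io::Result<()> {
    // Assumed scratch location; any writable directory would do.
    let base = std::env::temp_dir().join("pngquant-unicode-demo");
    let file = make_sample(&base)?;
    let bytes = read_image_bytes(&file)?;
    println!("read {} bytes from {}", bytes.len(), file.display());
    fs::remove_dir_all(&base)?;
    Ok(())
}
```

On Windows, Rust's standard library converts the path to UTF-16 and calls the wide-character Windows APIs, which is why no codepage is involved.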

@jibsen
Contributor

jibsen commented Nov 3, 2017

After some playing around with this I tend to agree, filenames with unicode symbols not present in the ascii codepage sometimes give problems when you try to pass them to a mingw-w64 compiled program from a command prompt window.

The only problem with Rust is that it does not provide wildcard expansion on Windows (as far as I know).

@kornelski
Owner

Oh, that's possible. In Unix world expansion is done by the shell, not the program. I didn't know mingw emulated that, too.

@jibsen
Contributor

jibsen commented Nov 3, 2017

Mingw has some code that performs the expansion before main is called, yes. MSVC has something similar, you just have to link with setargv.obj, which comes with the compiler.
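
For a program that gets neither of those, the expansion has to happen in the program itself. Below is a minimal sketch of what that could look like using only Rust's standard library. It is an assumption-laden illustration, not what mingw, `setargv.obj`, or pngquant actually do: `wildcard_match` and `expand_arg` are hypothetical names, only `*` and `?` in the last path component are handled, and matching is case-sensitive (Windows file names are not).

```rust
use std::fs;
use std::path::Path;

/// Matches `name` against `pattern`: `*` matches any run of characters
/// (including none), `?` matches exactly one character.
fn wildcard_match(pattern: &str, name: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let n: Vec<char> = name.chars().collect();
    let (mut pi, mut ni) = (0, 0);
    // Position to resume from after the most recent `*`: (pattern idx, name idx).
    let mut star: Option<(usize, usize)> = None;
    while ni < n.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi + 1, ni));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == n[ni]) {
            pi += 1;
            ni += 1;
        } else if let Some((sp, sn)) = star {
            // Backtrack: let the last `*` swallow one more character.
            pi = sp;
            ni = sn + 1;
            star = Some((sp, sn + 1));
        } else {
            return false;
        }
    }
    // Only trailing `*`s may remain in the pattern.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// Expands one argument. If its last component has no wildcard, or nothing
/// matches, the argument is returned unchanged (as a Unix shell would do).
fn expand_arg(arg: &str) -> Vec<String> {
    let path = Path::new(arg);
    let pattern = match path.file_name().and_then(|f| f.to_str()) {
        Some(f) if f.contains('*') || f.contains('?') => f,
        _ => return vec![arg.to_string()],
    };
    let dir = match path.parent() {
        Some(d) if !d.as_os_str().is_empty() => d,
        _ => Path::new("."),
    };
    let mut matches: Vec<String> = fs::read_dir(dir)
        .into_iter()
        .flatten()
        .filter_map(|entry| entry.ok())
        .filter(|entry| {
            entry.file_name().to_str().map_or(false, |name| wildcard_match(pattern, name))
        })
        .map(|entry| entry.path().to_string_lossy().into_owned())
        .collect();
    matches.sort();
    if matches.is_empty() { vec![arg.to_string()] } else { matches }
}

fn main() {
    // Print the expanded argument list, one file per line.
    for file in std::env::args().skip(1).flat_map(|a| expand_arg(&a)) {
        println!("{}", file);
    }
}
```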

@tipuraneo
Author

Hi guys,

I downloaded

C:\Program Files\pngquant>pngquant.exe
pngquant, 2.11.2 (November 2017), by Kornel Lesinski, Greg Roelofs.
Color profiles are supported via Little CMS. Using libpng 1.6.34.

on my Windows 8.1 x64, but the encoding problem remains.

C:\Program Files\pngquant>"Drag PNG here to reduce palette automatically.bat" "C:\Pläne\+00 EG WLAN Ist.png"
C:\Pl├ñne\+00 EG WLAN Ist.png: error: cannot open C:\Pl├ñne\+00 EG WLAN Ist.png for reading
There were errors quantizing 1 file out of a total of 1 file.

Where can I find a compiled Windows binary I can use temporarily till the bug is fixed?

@jibsen
Contributor

jibsen commented Nov 15, 2017

You could try the binaries from pngquant-winbuild, but depending on whether the character is in the codepage of the terminal, it may still not work.

@tipuraneo
Author

In my case it does work. Thx.

@Js41637

Js41637 commented Nov 17, 2017

Not sure if it's also related to this issue but doing pngquant ./images/*.png no longer works for me. It says it couldn't open the path for reading using the latest versions. I had to go back to version 2.9

@kornelski
Owner

kornelski commented Nov 17, 2017

Yup, wildcards are another thing that Microsoft's compiler doesn't automatically do. I'll patch over it in the next release: #273

Sorry everyone, I was completely surprised by how many things GCC did for me to make things "just work". I expected switching to Microsoft's own compiler to improve Windows support, but it turned out the opposite :(

@PhonyWelder

If globs on Windows cause problems, then maybe drop glob support in favor of Unicode names (Windows supports UTF-16LE, but pngquant currently sends UTF-8)?
Even DOS has a "for" command, so basic things like "*.png" will work: https://www.computerhope.com/forhlp.htm
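
To make the UTF-8 versus UTF-16 difference concrete, here is a small sketch that prints both encodings of the directory name from the original report. The function names are invented for the example. As far as I can tell, the UTF-8 bytes C3 A4 for "ä" are what an OEM codepage such as 437 or 850 displays as "├ñ", which would be consistent with the "Pl├ñne" in the error output, but treat that reading as an assumption.

```rust
/// UTF-8 bytes of a string, as a narrow `char *` API would receive them.
fn utf8_bytes(s: &str) -> Vec<u8> {
    s.as_bytes().to_vec()
}

/// UTF-16 code units of a string, as the wide-character Windows APIs expect.
fn utf16_units(s: &str) -> Vec<u16> {
    s.encode_utf16().collect()
}

fn main() {
    let name = "Pl\u{e4}ne"; // "Pläne"
    // "ä" is two bytes in UTF-8 (C3 A4) but a single unit in UTF-16 (00E4).
    println!("UTF-8 : {:02X?}", utf8_bytes(name));
    println!("UTF-16: {:04X?}", utf16_units(name));
}
```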

@kornelski
Owner

kornelski commented Apr 25, 2018

I've fixed the globs, they are no longer a problem.

The current problem is that fopen from Microsoft's C standard library can't handle non-ASCII characters.

@PhonyWelder

@kornelski
Owner

kornelski commented Apr 25, 2018

I'm very reluctant to add ugly non-portable code to work around yet another deficiency in Microsoft's C compiler.

It took Microsoft only 15 years to add some C99 support, so maybe by 2033 they will add UTF-8 support.

In the meantime my plan is to throw away all the C code, so that I don't have to ever touch MSVC again, but I'm busy with higher priority projects right now.

@sergeevabc

Dear Kornel, is there anything a mere user without dev skills could do to expedite a resolution of this issue?

@kornelski
Owner

kornelski commented May 3, 2019

No, sorry. I've been busy with lots of other things.

As a user, you can pipe to pngquant instead. This should always work:

pngquant - < "漢字.png" > "漢字-converted.png"
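
The same redirection trick can be scripted from another program, so that the wrapper opens the files and pngquant never sees the file name. A minimal sketch, assuming `pngquant` is on the PATH and that `pngquant -` reads stdin and writes stdout as in the command above; `run_with_redirects` is a hypothetical helper, not part of pngquant.

```rust
use std::fs::File;
use std::io;
use std::path::Path;
use std::process::{Command, ExitStatus, Stdio};

/// Runs `program args...` with stdin connected to `input` and stdout to
/// `output`. The child never receives either path as an argument.
fn run_with_redirects(
    program: &str,
    args: &[&str],
    input: &Path,
    output: &Path,
) -> io::Result<ExitStatus> {
    let stdin = File::open(input)?;
    let stdout = File::create(output)?;
    Command::new(program)
        .args(args)
        .stdin(Stdio::from(stdin))
        .stdout(Stdio::from(stdout))
        .status()
}

fn main() {
    let args: Vec<String> = std::env::args().collect();
    if args.len() != 3 {
        eprintln!("usage: <this program> <input.png> <output.png>");
        return;
    }
    // Equivalent of: pngquant - < input > output
    match run_with_redirects("pngquant", &["-"], Path::new(&args[1]), Path::new(&args[2])) {
        Ok(status) if status.success() => println!("wrote {}", args[2]),
        Ok(status) => eprintln!("pngquant exited with {}", status),
        Err(e) => eprintln!("could not run pngquant: {}", e),
    }
}
```

Handing the file handles straight to the child avoids copying the data through the wrapper and sidesteps pipe deadlocks. Note that the output file is created before pngquant runs, so a failed run can leave an empty file behind.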

@kornelski
Owner

This may be fixed in 2.15.0
