[BUG] Unicode characters render improperly on Windows #52
Comments
This might be a duplicate of #22
It may have the same or a similar root cause. It is likely in GraphViz or in Python, but it could be due to the file types or encoding of the input files being different. It's related, but larger. There is a character in there that git didn't print (in my original post); it should also say that
See #22 for (most likely) a specific instance of this problem. Needs help from someone using Windows.
You can also see it in some of my commits: I will do a check while working on #17 to see if the dev version of GraphViz fixes it, and if not, I may have to break out a debugger to debug the way Python accesses the disk, so I can look for the root cause.
On Windows I get this error message if I use mm2 instead of AWG gauge: Workaround: replace all
Part of the workaround in this is to force all encodings to UTF-8, and to attempt to detect the encodings of the input files.
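For illustration, one way the "attempt to detect" step could look is a simple fallback chain: try UTF-8 first (tolerating a BOM), then fall back to CP-1252. This is only a sketch under that assumption, not the code used in the project; the function name `decode_input_bytes` is hypothetical.

```python
def decode_input_bytes(raw):
    """Decode raw file bytes, preferring UTF-8 and falling back to CP-1252.

    Hypothetical helper: returns a (text, encoding_used) tuple.
    """
    # "utf-8-sig" accepts UTF-8 with or without a leading byte order mark.
    for encoding in ("utf-8-sig", "cp1252"):
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this last resort never fails.
    return raw.decode("latin-1"), "latin-1"


text, used = decode_input_bytes("0.25 mm²".encode("cp1252"))
print(used, text)
```

Trying UTF-8 first is the safer order, since almost any byte sequence decodes "successfully" as CP-1252, while text saved as CP-1252 with non-ASCII characters is usually not valid UTF-8.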
I wonder, what is needed to "force all encodings to UTF-8" and why is this needed in Windows?
@kvid Python on Windows writes the files as CP-1252, as that is the default extended encoding. Since the SVG is supposed to be UTF-8 (as that is the encoding GraphViz generates), we need to force it to actually write as UTF-8. Sometimes, since UTF-8 and CP-1252 are somewhat compatible, Python mis-identifies one as the other. By simply standardizing on UTF-8, since that's what Linux and macOS use, and thus forcing the encoding to UTF-8 (since all modern versions of Windows can easily support it), we should be able to get around the issue. A possibly more elegant solution would be to always read and write in bytes, or in base64 encoding, but that seems more trouble than it is worth. Ideally, some smart-encoding stuff should be implemented on the YAML side to try to detect which encoding it is, since sometimes Notepad on Windows will default to CP-1252 instead of UTF-8, depending on the build of Windows 10. Why does all of this happen? Because IBM/DOS, and compatibility.
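As a rough sketch of what "forcing UTF-8" means in Python: pass `encoding="utf-8"` explicitly to `open()` instead of relying on the platform default. The helper names below (`write_utf8`, `read_utf8`, `mojibake`, `roundtrip`) are made up for illustration and are not taken from the project.

```python
import os
import tempfile


def write_utf8(path, text):
    """Write text as UTF-8 regardless of the platform's default encoding."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def read_utf8(path):
    """Read text back, again stating the encoding explicitly."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


def mojibake(text):
    """Show what UTF-8 bytes look like when misread as CP-1252."""
    return text.encode("utf-8").decode("cp1252", errors="replace")


def roundtrip(text):
    """Write and re-read text through a temporary file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "label.txt")
        write_utf8(path, text)
        return read_utf8(path)


print(roundtrip("mm²"))  # mm²
print(mojibake("mm²"))   # mmÂ²
```

The second line of output is the kind of stray character that shows up when one side writes UTF-8 and the other side reads it as CP-1252.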
Has this been solved by adding
This appears to be fixed by forcing the encoding to UTF-8 and including the meta encoding tag inside the generated HTML output. Closing issue.
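A minimal sketch of that fix, assuming the HTML is assembled as a string around the generated SVG: declare the charset in the `<head>` and write the file as UTF-8 so the two agree. The function names and page layout here are hypothetical, not the project's actual output code.

```python
import html


def build_html_page(svg_markup, title):
    """Wrap SVG markup in an HTML page that declares UTF-8 (hypothetical layout)."""
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        '<meta charset="UTF-8">\n'
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n"
        "<body>\n"
        f"{svg_markup}\n"
        "</body>\n"
        "</html>\n"
    )


def save_html_page(path, svg_markup, title):
    """Write the page with the same encoding the meta tag announces."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(build_html_page(svg_markup, title))
```

The meta tag only tells the browser how to interpret the bytes; it still has to match the encoding used when the file is written, which is why both halves of the fix go together.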
Windows renders unicode characters improperly, and tends to render
improperly as well. There may be a proper solution for this, but I don't know the root cause, or how to force unicode encoding.