support for utf-8? #112
Comments
We do not ship any locales, similarly we should remove locale-gen from the images. Using utf-8 should be ok as long as your ssh terminal is fine with utf-8. Is there some place where utf-8 data is causing problems? |
The problem is filenames with UTF-8 characters (Chinese characters in this case); for now I can use a Docker container as a workaround. On CoreOS, UTF-8 characters are shown as question marks, and there is no way to cd / ls them. Thanks |
Ok, will look into it. |
Looks like it is just
So as a workaround for now you can use |
Thanks, that works. However, CJK characters, which are two Latin characters wide, may mess up the terminal sometimes (when input...) |
OK, could you give me an example of a problematic one as a test case? |
It can be reproduced with the following steps:
Expected: 测test试 Maybe you'd like to actually |
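To make the width problem concrete, here is a minimal sketch (not from the thread) that measures the test string 测test试. It spells the string out as UTF-8 bytes so the result does not depend on the locale of the shell that runs it. The string is 6 code points stored in 10 bytes, and because each CJK character is drawn two columns wide it occupies 8 terminal columns. A line editor running without a UTF-8 locale presumably counts bytes instead, which would explain the cursor drifting during input.

```shell
#!/bin/sh
# The test string written as explicit UTF-8 bytes (octal escapes),
# so the script itself stays pure ASCII.
s=$(printf '\346\265\213test\350\257\225')

# wc -c counts bytes regardless of locale.
bytes=$(printf '%s' "$s" | LC_ALL=C wc -c | tr -d ' ')

# Dropping UTF-8 continuation bytes (0x80-0xBF) leaves one byte per code point.
chars=$(printf '%s' "$s" | LC_ALL=C tr -d '\200-\277' | LC_ALL=C wc -c | tr -d ' ')

echo "bytes=$bytes chars=$chars"
```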
If you by accident press a key which is not on "US keyboards", you will get strange errors. ("I mistyped
What's especially harsh is that you cannot (obviously) generate locales or copy pre-generated locales:
# scp en_US core@1.2.3.4://tmp/
$ sudo mkdir -p /usr/share/i18n/locales
mkdir: cannot create directory '/usr/share/i18n': Read-only file system
# scp UTF-8.gz core@1.2.3.4://tmp/
$ sudo mkdir -p /usr/share/i18n/charmaps/
mkdir: cannot create directory '/usr/share/i18n': Read-only file system
Charmap UTF-8 is a must for everything used outside the "US" and should be the default. |
$ locale -m
locale: cannot read character map directory `/usr/share/i18n/charmaps': No such file or directory |
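A quick way to see whether a host offers any UTF-8 locale at all is to filter the output of `locale -a`. The sketch below is an illustration only: `has_utf8_locale` is a hypothetical helper name, and the sample input is made up rather than captured from a CoreOS machine.

```shell
#!/bin/sh
# has_utf8_locale: hypothetical helper, not part of CoreOS or glibc.
# Reads locale names on stdin (one per line, as `locale -a` prints them)
# and prints "yes" if any of them names a UTF-8 codeset, "no" otherwise.
has_utf8_locale() {
    if grep -qiE '\.utf-?8'; then
        echo yes
    else
        echo no
    fi
}

# Sample input standing in for `locale -a` output (assumed, not captured).
printf 'C\nPOSIX\n' | has_utf8_locale
printf 'C\nC.UTF-8\nen_US.utf8\n' | has_utf8_locale
```

On a real host the check would be `locale -a | has_utf8_locale`.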
Yes, from the beginning we stripped everything out to keep things small. We do need to re-add things to get utf-8 working while leaving out extras like translations. Haven't gotten around to revisiting this though. |
Sorry for letting this slide. Ideally what we want is a locale that uses the UTF-8 character map but doesn't provide a translation. On some systems this is provided as https://sourceware.org/bugzilla/show_bug.cgi?id=17318 We should look into what distros that do ship the extra |
Looks like this is the locale that Debian ships: http://anonscm.debian.org/viewvc/pkg-glibc/glibc-package/trunk/debian/patches/localedata/locale-C.diff?view=log |
As distro maintainer I would go, at least for now, with (you use glibc 2.17 from 2012‽)…:
… and these environment settings (/etc/env.d/02locale)…:
… and make the corresponding folders symlinks into /var to enable users to add new charmaps. Don't forget to set |
@marineam I've reviewed the locale "C" from Debian you've linked and found that it still contains strange formats (such as for date) from the dark ages as well as redundant sections. Therefore I've created a new locale which won't have any translation, and which actually utilizes notations according to norms known by us engineers. With the exception of legible (and valid!) date/time notation, "traditional" number format (IEC wants 1234,56 — almost all programming languages have 1234.56) and a missing telephone number format, which isn't used on console anyway. You can find locale "ISO" here: https://github.com/wmark/ossdl-overlay/blob/master/sys-libs/glibc/files/0001-locale-ISO-with-international-formats.patch
$ date
Tue 2014-09-30 20:54:39 +0200
$ date +'%c'
2014-09-30 20:57:31 +0200
$ date +'%x'
2014-09-30
$ date +'%T'
20:55:33
Delimiters work as expected. |
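Until a locale like this is installed, a script can get the same notation by passing an explicit format string, since purely numeric fields do not depend on the locale. This is a minimal sketch assuming GNU `date` (for `-d @SECONDS`); `iso_stamp` is a hypothetical helper name, not something from the patch.

```shell
#!/bin/sh
# iso_stamp: hypothetical helper. Formats a Unix timestamp (seconds since
# the epoch) in UTC using only numeric fields, so the result is the same
# in every locale. Assumes GNU date, which accepts -d @SECONDS.
iso_stamp() {
    LC_ALL=C date -u -d "@$1" +'%Y-%m-%d %H:%M:%S %z'
}

iso_stamp 0    # prints: 1970-01-01 00:00:00 +0000
```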
I've created a modified CoreOS for you with the aforementioned UTF-8 support. It's on Amazon EC2: eu-west-1: ami-0522a072 I didn't address all issues I've found with CoreOS with that AMI — yet. Feel free to ping me for updated versions. @vizv Now that support for overlay is in you could mount one over |
I just found this issue when searching for 'coreos' and 'UTF-8', so maybe someone can help me here: how can we switch locales in a default CoreOS installation (using 681.0.0 atm)? |
Is UTF-8 support currently part of any milestone? |
Guys is there any progress on this one? Default char encoding is still I'm not sure whether you realize what consequences this might have. In JVM world these days everybody relies on the fact that their apps will run on a system with UTF-8, so developers don't specify encoding explicitly... so that any JVM app running on CoreOS would be currently totally IO wise unpredictable because all resources are I know that stuff is running inside containers, but still... |
Hi! I too am suffering from this issue, whilst attempting to deploy a meteor image that uses MongoDB. It appears MongoDB is obtaining its locale settings from the host. meteor/meteor#4019 |
For the time being, I managed to work around this. |
Mmm... it seems there is some work already there. Also, I had no problem with those Chinese characters.
|
I've been trying to echo some Unicode characters with the shell lately, e.g.
So is there any workaround at the moment? I'm running CoreOS on Digital Ocean by the way. |
In case this helps anyone else: We had problems with a Java app that writes a file to a linked path of the CoreOS host. It was sufficient to add this to the FROM ubuntu Dockerfile:
(Source: http://askubuntu.com/a/601498/226557) |
@dalbani yikes. That is a different issue. Would you mind opening a new bug report? |
I recently noticed Fedora added C.UTF-8 with a pretty nice minimal patch last fall: http://pkgs.fedoraproject.org/cgit/rpms/glibc.git/commit/?h=f22&id=bfe345d460204b1c724319791a2de5be200370f0 I plan on following suit but haven't gotten to it just yet. |
|
@wmark yeah, thanks for putting that together though I'd like to follow the existing C.UTF-8 precedent. The Fedora one is pretty similar to your ISO locale with the exception that, like the built-in C locale, it uses some US specific things. |
Until marineam's patch becomes widely distributed as part of CoreOS, here's a workaround, based on japm48's work. Unlike japm48's workaround, this should be safe for a production system. Thanks, @japm48, for figuring out the locale stuff. I could not have figured that out without help.
|
Cannot set locale... (set to en_US.UTF-8).
When I try to use locale-gen, it outputs an error message:
Any idea? I'm on the stable channel.