Home
As of version 1.7.10, Git for Windows supports Unicode. Most importantly, this means that Git repositories with non-ASCII file names can now be shared seamlessly between Git for Windows and other Git flavors (i.e. Git on Linux/Mac, Cygwin-Git and JGit / EGit).
Unfortunately, it also means that users of previous Git for Windows versions need to update their Git settings, and probably need to migrate their Git repositories, too.
- MSYS programs don't fully support Unicode yet, e.g.:
  - bash doesn't let you type non-ASCII characters
  - ls converts non-ASCII characters to '?' when printing to the console (redirecting to a file or another program works, though)
- Tcl only supports the BMP (Basic Multilingual Plane, i.e. Unicode characters U+0000 - U+FFFF), so gitk and git-gui currently don't support characters outside it, e.g. CJK Extensions B - D.
- TortoiseGit supports Unicode starting with version 1.7.9 ( http://code.google.com/p/tortoisegit/downloads/ ).
- GitExtensions needs to be configured to use UTF-8 ("Settings" dialog, "Global settings" tab, "Files encoding" and "GitExtensions encoding")
If you want to use a custom text editor to enter commit messages or to edit config files (instead of the vim/gvim installed with Git for Windows), pick one that supports Unix line breaks (LF only) and can save UTF-8 without a BOM (Windows notepad.exe, for example, is a bad choice).
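To see whether an editor meets these two requirements, the saved file can be inspected from a shell. The following is only a sketch: the helper names has_bom and has_crlf are made up for this illustration, and it assumes the usual head, od, tr and grep tools are available.

```shell
# Hypothetical helpers for checking a file written by a custom editor.

has_bom() {
  # A UTF-8 BOM is the byte sequence EF BB BF at the very start of the file.
  [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

has_crlf() {
  # Succeeds if any line contains a carriage return (Windows line break).
  LC_ALL=C grep -q "$(printf '\r')" "$1"
}
```

For example, `has_bom .gitignore && echo "BOM found"` reports a file that the editor saved with a BOM.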
The default console font does not support Unicode. Change the console font to a TrueType font such as Lucida Console or Consolas. The setup program can do this automatically, but only for the installing user.
The following settings can be set per user (with the --global option) or per repository; repository settings take precedence.
By default, git will print non-ASCII file names in quoted octal notation, i.e. "\nnn\nnn...". This can be disabled with
git config [--global] core.quotepath off
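To illustrate what the quoted octal notation encodes, here is a small sketch that turns one quoted name back into raw bytes. The function name unquote_path is hypothetical, and the sketch relies on the shell's printf expanding \nnn octal escapes in its format string; it is not how Git itself decodes names.

```shell
# Hypothetical helper: decode one path printed in quoted octal notation.
unquote_path() {
  p=${1#\"}          # drop the leading double quote, if any
  p=${p%\"}          # drop the trailing double quote, if any
  # Double every % so printf treats it literally, then let printf expand \nnn.
  p=$(printf '%s' "$p" | sed 's/%/%%/g')
  printf "$p\n"
}

# \303\244 are the two UTF-8 bytes (C3 A4) of the letter a-umlaut.
unquote_path '"\303\244.txt"'
```

With core.quotepath off, Git prints such names directly, so no decoding step is needed.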
Previous Git for Windows versions required setting i18n.logoutputencoding to your Windows system's default OEM encoding for proper console output of non-ASCII commit messages. This is no longer necessary. Remove the setting or set it to 'utf-8':
git config [--global] --unset i18n.logoutputencoding
The i18n.commitencoding setting should also be removed or set to 'utf-8' to support non-ASCII commit messages on the command line (git commit -m "..." from cmd.exe; MSYS bash won't let you enter non-ASCII characters):
git config [--global] --unset i18n.commitencoding
If you're using git-svn, reencoding SVN file names is no longer necessary (SVN also stores file names in UTF-8):
git config [--global] --unset svn.pathnameencoding
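The three --unset commands above can be combined into one helper. This is a sketch, not part of Git for Windows: the function name unset_obsolete_settings is invented, and it relies on git config returning exit code 5 when the key to unset does not exist (or has several values).

```shell
# Hypothetical helper: remove the encoding settings that are no longer needed.
# Arguments are passed through to git config, e.g. --global or --local.
unset_obsolete_settings() {
  for key in i18n.logoutputencoding i18n.commitencoding svn.pathnameencoding; do
    rc=0
    git config "$@" --unset "$key" || rc=$?
    case $rc in
      0) echo "removed $key" ;;
      5) echo "$key was not set" ;;   # exit code 5: nothing to unset
      *) echo "could not unset $key (exit code $rc)" >&2; return 1 ;;
    esac
  done
}
```

Run it as `unset_obsolete_settings --global` for the per-user settings, or as `unset_obsolete_settings --local` inside a repository.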
Migrating repositories is only relevant if you used non-ASCII file names with non-Unicode Git for Windows versions.
Previous Git for Windows versions stored file names in the default encoding of the originating Windows system, making these repositories incompatible with other Windows language-versions and other Git versions (including Cygwin-Git and JGit / EGit on Windows).
The Unicode-enabled Git for Windows stores file names UTF-8 encoded.
The recodetree check command scans the entire history of a git repository and prints all non-ASCII file names. If the output is empty, no migration is necessary.
Note: the recodetree script doesn't work with quoted characters, so disable quoted file names first: git config [--global] core.quotepath off
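If the recodetree script is not at hand, a similar scan can be approximated with plain Git and grep. The sketch below is not the recodetree script and may differ from it in detail; the function name list_non_ascii_names is made up, and it assumes a grep that accepts -a.

```shell
# Hypothetical approximation of a "non-ASCII file names in history" scan.
# Run inside the repository to check.
list_non_ascii_names() {
  # Print the file names touched by every commit on every ref, unquoted,
  # then keep only lines containing a byte outside printable ASCII.
  git -c core.quotepath=off log --all --name-only --pretty=format: |
    LC_ALL=C grep -a -v '^[ -~]*$' |
    LC_ALL=C sort -u
}
```

As with recodetree check, empty output suggests that no migration is necessary.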
The simplest way to convert old repositories is by keeping an old Git for Windows version around (e.g. installed in C:\git1.7.9):
- With the old Git for Windows version: Check out a completely clean state of the working copy (so git status reports nothing, not even untracked files):
/c/git1.7.9/bin/git clean -f && /c/git1.7.9/bin/git reset --hard
- With the new Git for Windows version: git status should now report non-ASCII file names as untracked (with correct file names), and in most cases also as deleted (with mangled file names).
- Replace file names in the staging area with the current state of the working copy:
git rm -rf --cached \* && git add --all
- git status should now report all non-ASCII file names as renamed only.
- Commit the changes:
git commit -m "UTF-8 conversion"
- Repeat for every branch of interest
Without an old Git for Windows version at hand, the conversion requires renaming all non-ASCII file names manually.
- Check out a clean state of the working copy:
git clean -f && git reset --hard
- git status will report non-ASCII file names as untracked (mostly with mangled names).
- Fix the mangled file names in the working copy manually.
- Replace file names in the staging area with the current state of the working copy:
git rm -rf --cached \* && git add --all
- git status should now report all non-ASCII file names as renamed only.
- Commit the changes:
git commit -m "UTF-8 conversion"
- Repeat for every branch of interest
Alternatively, the recodetree script can transcode the file names. This requires iconv.exe on the path.
- Replace file names in the staging area with the transcoded names of the HEAD commit:
recodetree head
- Reset the working copy to the state of the staging area:
git clean -f && git checkout-index -af
- git status should now report all non-ASCII file names as renamed only.
- Commit the changes:
git commit -m "UTF-8 conversion"
- Repeat for every branch of interest
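Each procedure above expects git status to report the non-ASCII file names as renamed only before committing. When repeating the steps for many branches, that check could be scripted roughly as follows. The function name staged_only_renames is hypothetical, and the sketch assumes the git status --porcelain format, in which a staged rename line starts with R followed by two spaces.

```shell
# Hypothetical check: succeed only if there are changes and every one of
# them is a staged rename.
staged_only_renames() {
  entries=$(git status --porcelain) || return 1
  [ -n "$entries" ] || return 1
  # grep -v -q succeeds if some line is NOT a staged rename; negate that.
  ! printf '%s\n' "$entries" | grep -v -q '^R  '
}
```

A possible use is `staged_only_renames && git commit -m "UTF-8 conversion"`. Note that Git only reports a rename when the old and new files are similar enough, so other entries may call for a closer look rather than indicate a failed conversion.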
Git config files with non-ASCII content need to be converted to UTF-8, for example your name in %HOME%/.gitconfig, or non-ASCII file names in .gitattributes / .gitignore / .gitmodules files.
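One way to convert such a file is with iconv. The sketch below is an illustration only: the function name convert_to_utf8 is made up, and the source encoding must be supplied by you (for example the Windows code page the file was originally saved in), since it cannot be detected reliably.

```shell
# Hypothetical helper: convert FILE from encoding FROM to UTF-8 in place.
# Usage: convert_to_utf8 FILE FROM
convert_to_utf8() {
  file=$1
  from=$2
  tmp=$(mktemp) || return 1
  if iconv -f "$from" -t UTF-8 "$file" > "$tmp"; then
    cat "$tmp" > "$file"   # overwrite the contents, keeping the file's permissions
    rm -f "$tmp"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

For example, `convert_to_utf8 .gitignore CP1252` would convert a file saved on a Western European Windows system. Run it only once per file, because converting an already converted file garbles the non-ASCII characters.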
The recodetree history command can be used to convert the entire history of the repository (requires iconv.exe). Beware that rewriting history changes all the object hashes in the repository, which has severe implications for other users if the repository is published (see "RECOVERING FROM UPSTREAM REBASE" in git help rebase). The recodetree history script currently does not convert config files such as .gitattributes / .gitignore / .gitmodules.