Unicode Support in File Names: Windows, Mac, Emacs, Unison, Rsync, USB, Zip

By Xah Lee.

This page reports my experience with Unicode support in file names across tools on Windows and Mac. It is part of the article Mac and Windows File Conversion.

Chinese and Non-ASCII Chars in File Name

If a file name contains Chinese characters, or other Unicode characters such as curly quotes, the name may get mangled when you move the file between Mac OS X and Windows, because the application, storage medium, or file transfer protocol involved may not handle non-ASCII characters.
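Before moving files between systems, it can help to know which names are at risk. The following is a minimal Python sketch, not part of the original report, that lists names containing any non-ASCII character. The function names `has_non_ascii` and `find_non_ascii_names` are made up for illustration, and the sample names are hypothetical.

```python
import os

def has_non_ascii(name):
    """Return True if the name contains at least one non-ASCII character."""
    return not name.isascii()  # str.isascii() needs Python 3.7 or later

def find_non_ascii_names(root):
    """Return sorted paths under root whose file or directory name is not pure ASCII."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if has_non_ascii(name):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)

if __name__ == "__main__":
    # Hypothetical sample names: Chinese, curly quotes, plain ASCII.
    for sample in ["中文照片.JPG", "\u201cquoted\u201d.txt", "photo.JPG"]:
        print(repr(sample), has_non_ascii(sample))
```

Calling `find_non_ascii_names(".")` in a folder you plan to copy gives a list of names worth checking by hand after the transfer.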

For example:

Apps That Do Not Support File Names with Chinese Characters

• The built-in zip utility of Windows Vista (64-bit, SP2) does not handle Chinese. (Right-click, Send to, Compressed (zipped) Folder.) If a folder or file name contains Chinese characters, Windows complains and refuses to compress.

• The Windows Console does not support Unicode well. It prints Chinese characters as gibberish. This applies to any app that runs in the Windows Console, such as cmd.exe, PowerShell, and Cygwin bash.

• The Unison file sync tool does not handle Chinese file names (Unison version 2.27.57). [see Complexity of Software Engineering; Emacs, Unicode, Unison]

• Dired in GNU Emacs for Windows does not handle files with Chinese names. The names show up as gibberish. (GNU Emacs 23.1.1 (i386-mingw-nt6.0.6002) of 2009-07-29 on SOFT-MJASON)

• I am not sure whether rsync fully supports Chinese. I think that using rsync on OS X to copy files from Mac to Windows (rsync version 2.6.9, protocol version 29) works fine, but using rsync on Windows through Cygwin (rsync version 3.0.4, protocol version 30) to copy files from Windows to Mac has problems. Here is an example of its error message:

building file list ... file has vanished: "/cygdrive/c/Users/xah/Documents/kacma pixra/prenu/200403_tony_relative/????.JPG"
file has vanished: "/cygdrive/c/Users/xah/Documents/kacma pixra/prenu/200403_tony_relative/???.JPG"
file has vanished: "/cygdrive/c/Users/xah/Documents/kacma pixra/prenu/200403_tony_relative/????.JPG"
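The question marks in the paths above look like what happens when a name is converted to a character set that cannot represent its characters. The Python sketch below shows that effect in isolation. It is only an illustration under that assumption: it does not reproduce rsync's or Cygwin's actual code path, and the helper names and the sample file name are hypothetical.

```python
def to_legacy(name, encoding="ascii"):
    """Convert a name to a legacy encoding, replacing unencodable characters with '?'."""
    return name.encode(encoding, errors="replace").decode(encoding)

def survives(name, encoding="ascii"):
    """Return True if the name can be represented in the encoding without loss."""
    return to_legacy(name, encoding) == name

if __name__ == "__main__":
    sample = "中文照片.JPG"  # hypothetical name with four Chinese characters
    print(to_legacy(sample))           # ????.JPG
    print(survives(sample))            # False
    print(survives("photo.JPG"))       # True
```

Once a name has been reduced to "????.JPG", a tool that looks the file up by that name will not find it, which would be consistent with a "file has vanished" message.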

Here are sample file names that cause this error:

Apps That Support Chinese

OS X 10.5's Terminal app fully supports Chinese. (OS X 10.4.x does not. Detail: the Terminal app in OS X 10.4.11 can display Chinese characters encoded in UTF-8, for example when running cat text_file on a UTF-8 encoded text file. However, a file name containing Chinese does not show up correctly in the output of ls, apparently because HFS+, the OS X file system, stores file names in UTF-16. The Terminal app has an option under the menu [Terminal ▸ Window Settings… ▸ Display ▸ Character Set Encoding], but that menu does not offer UTF-16 as a choice.)
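To see why a mismatch between UTF-8 and UTF-16 would produce gibberish, compare the bytes of the same name in both encodings. This is a small Python sketch with a hypothetical two-character name and a made-up helper `encodings_of`. It only illustrates the encoding difference; it does not verify what ls on OS X 10.4 actually does internally.

```python
def encodings_of(name):
    """Return the UTF-8 and UTF-16 (big-endian, no BOM) bytes of a name as hex strings."""
    return {
        "utf-8": name.encode("utf-8").hex(" "),       # bytes.hex(sep) needs Python 3.8+
        "utf-16": name.encode("utf-16-be").hex(" "),
    }

if __name__ == "__main__":
    name = "中文"  # hypothetical file name
    print(encodings_of(name))
    # {'utf-8': 'e4 b8 ad e6 96 87', 'utf-16': '4e 2d 65 87'}

    # Reading the UTF-16 bytes as if they were UTF-8 gives unrelated characters.
    misread = name.encode("utf-16-be").decode("utf-8", errors="replace")
    print(misread)  # N-e followed by the U+FFFD replacement character
```

The two byte sequences have nothing in common, so a display layer that expects one encoding and receives the other cannot show the original characters.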

Mac OS X 10.5's zip tool supports Chinese. (untested by me)

OS X 10.4.11, when connected to a Windows share, can transfer files whose names contain Chinese characters, in both directions.

Windows Vista (SP2), when connected to a Mac share, can transfer files whose names contain Chinese characters, in both directions.