Mac & Windows File Conversion


This article discusses issues with moving files between Mac and Windows. For example, Mac files may have a resource fork, or file creator/type info. When the files are moved to Windows, this info is lost, and some files will not function without it. Also, Mac and Windows support different character sets in file names, so when you transfer files between them, the file names may get messed up.

Mac OS Resource Fork and Type Code

Resource Fork

In the 1990s, before the days of OS X, Mac OS files relied heavily on the resource fork. With OS X, it was decided in the early 2000s that the resource fork was to be deprecated.

The vast majority of Mac apps today do not create files with a resource fork. However, Mac applications (those in the /Applications/ folder) may still rely on the resource fork to function.

Warning: You cannot simply delete the resource fork of a file and expect the file to function, because for some files, such as “unflattened” QuickTime movie files, the main data is in resource fork.

For Perl scripts that tell you which files in a dir have a resource fork, and other scripts for preparing Mac files for transfer to Windows, see: Perl Scripts For Mac/Windows File Moving.
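As a rough shell alternative (a sketch, not the Perl scripts mentioned above): on Mac OS X, a file's resource fork can be read through the special path suffix /..namedfork/rsrc, so a non-empty fork shows up as a non-empty file at that path. The function name list_rsrc_files is made up for illustration. On non-Mac systems the special path does not exist, so the function just prints nothing.

```shell
# list_rsrc_files DIR
# Print each regular file under DIR whose resource fork is non-empty.
# Assumption: Mac OS X exposes the fork as FILE/..namedfork/rsrc.
list_rsrc_files() {
    find "${1:-.}" -type f -exec sh -c '
        for f in "$@"; do
            # -s is true only if the path exists and has size > 0
            if [ -s "$f/..namedfork/rsrc" ]; then
                printf "%s\n" "$f"
            fi
        done
    ' sh {} +
}
```

Run it as list_rsrc_files . from the top of the folder you plan to move, and check each listed file before deciding whether its resource fork can be dropped.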

For technical detail on the resource fork, type/creator code, and OS X command line tools for them, see: Mac OS X Resource Fork and Command Line Tips.

File Type and Creator Type

Mac files have two 4-character codes: a type code, which indicates the file type, and a creator code, which indicates the application that created the file. (Their purpose is similar to the filename extension and Internet media type (aka MIME type).) These codes are not in the resource fork. They are a feature of the Mac's file systems, HFS and HFS+.

File type/creator codes are largely deprecated in favor of file name extensions. However, it is still common to see Mac OS X apps creating them. It is pretty safe to delete the type/creator code if the file has a file name extension.

When you move your files to Windows, the type/creator code is automatically lost, because Windows file systems do not support it.

Mac “Icon^M” files

Another kind of Mac-specific file is the one named “Icon^M”, where the “^M” is the Return character (ASCII 13). These are folder icon files. I cannot find info on the web about them. I don't know if they are in the Apple Icon Image format.

You can still find these “Icon^M” file names in OS X. For example, you'll find one in 〔/Applications/Adobe Reader 8/〕, as well as in StuffIt 10, Mac Pov-Ray 3.6, Adium (v 1.3.7), and if you use Jamie Zawinski's XScreenSaver for OS X, you'll find many “Icon^M” files in your ~/Library/ dir. (I think the “Icon^M”, at least the filename itself, is deprecated.)

For a Perl script that finds or removes these files, see: Perl Scripts For Mac/Windows File Moving.
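If you just want a quick look from the shell, here is a minimal sketch that lists them. It assumes the name is exactly “Icon” followed by one carriage return, as described above. The function name find_icon_cr is only for illustration.

```shell
# find_icon_cr DIR
# Print the path of every file under DIR named "Icon" + carriage return.
find_icon_cr() {
    # printf turns \r into a real CR (ASCII 13); command substitution
    # only strips trailing newlines, so the CR survives.
    icon_name=$(printf 'Icon\r')
    # cat -v displays the CR as ^M so it does not garble the terminal.
    find "${1:-.}" -name "$icon_name" -print | cat -v
}
```

For example, find_icon_cr ~/Library prints one line per match, each ending in “Icon^M”. Review the list before removing anything.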

Not Allowed Chars in File Name

See: What Characters Are Not Allowed in File Names on Mac & Windows?

Chinese and Non-ASCII Chars in File Name

See: Unicode Support in File Names: Windows, Mac, Emacs, Unison, Rsync, USB, Zip.

File Name and File Path Length

Today, both Windows and Mac allow file names of up to 255 chars. However, I'm not sure what the max length for a dir path is. At least, I know that as of 2000, the unix GNU tar util will mess up if you have a path that's longer than about 120 chars. So, if you are using tar to archive a directory, be careful if the file paths are long. (For detail, see bottom of: Unix, RFC, Line Truncation.)
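To see whether a folder has paths near that limit before you archive it, something like the following sketch may help. The function name long_paths is hypothetical, and the default of 120 is just the rough figure mentioned above, not a documented tar limit. Paths are measured relative to the given dir. Depending on your awk and locale, length may count bytes rather than characters, and file names containing newlines are not handled.

```shell
# long_paths DIR [MAX]
# Print every path under DIR (relative to DIR) longer than MAX, default 120.
long_paths() {
    (
        cd "${1:-.}" || exit 1
        # Strip the leading "./" so it does not count toward the length.
        find . -print | sed 's|^\./||' |
            awk -v max="${2:-120}" 'length($0) > max'
    )
}
```

For example, long_paths ~/Documents 120 lists the paths to look at before running tar.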

Removing “.DS_Store” and “Thumbs.db” files

Mac creates a .DS_Store file in each folder. You'd want to remove these files if you are copying the folders to Windows. You can run the following commands in bash to list them, then remove them:

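# List the .DS_Store files first, to review what will be deleted.
find . -name ".DS_Store" -type f
# Then delete them.
find . -name ".DS_Store" -type f -exec rm {} \;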

Note: Windows creates Thumbs.db files. Windows Vista no longer produces them, except that when accessing non-Windows networked files, it does create one in the dir.
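Both kinds of file can be removed in one pass. This is a small sketch under the assumption that any regular file with one of these two exact names is safe to delete. The function name remove_os_clutter is made up here.

```shell
# remove_os_clutter DIR
# Delete .DS_Store (Mac) and Thumbs.db (Windows) files under DIR.
remove_os_clutter() {
    find "${1:-.}" -type f \( -name ".DS_Store" -o -name "Thumbs.db" \) \
        -exec rm -f {} +
}
```

Run it as remove_os_clutter . from the top of the folder. The -type f test keeps it from touching a directory that happens to have one of these names.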

For Perl scripts, see: Perl Scripts For Mac/Windows File Moving.

File Transferring Method

There are many file transfer methods and tools. You can copy files with a USB flash drive, or use the built-in file sharing in Mac OS X's Finder or Windows Explorer, or use unix tools such as rsync or unison. Also, you might compress the files using zip or tar gz before using any of the above transfer methods.

Whether file names with non-ASCII (Unicode) chars, or very long file names, are preserved depends on the transfer method and the compression tool.

Windows file sharing is done through the SMB/CIFS protocol. I'm not sure how that protocol handles file names, but I do know that as late as 2006, the open source Samba software that is part of Mac OS X for sharing Windows files will mess up Chinese chars in file names.

The Unison file syncing utility also doesn't understand Unicode as of 2009-06. See: Complexity of Software Engineering; Emacs, Unicode, Unison.

Another method is to zip the files first, then send the archive over the network. ZIP itself has problems with non-ASCII characters. (As far as I know, there are a few variants of zip, and some don't handle Unicode well. As late as OS X 10.4.x, when you have a downloaded zip file containing Chinese names and you unzip it through Finder's BOMArchiveHelper, the Chinese names become gibberish. The solution is to use The Unarchiver, a third-party app that can be used on OS X 10.5.x in place of the built-in unarchiver.)

Another way is to use tar or tar gz instead of zip. (Similarly, you can use any file compression scheme.) Tar had problems with full path length back in 2000. I'm not sure how it deals with unusual chars, Unicode chars, or long file names today.

Another common way today is to copy the files to a USB flash drive, then copy them to another machine. How well this method preserves file integrity depends on the file system used on the USB drive. Most USB drives are pre-formatted with the FAT32 file system, which is an old file system. I think in practice there are variations and parameter differences that affect file transfer with it. Also see: Long filename.

Whatever your method or tool for file transfer, if you want to preserve really long file names or paths, or Unicode chars such as Chinese chars, you should test it out first. Also, the result may differ depending on whether you are moving from Mac to Windows or from Windows to Mac.
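To pick out files worth testing, here is a sketch that lists every file whose name contains something other than printable ASCII (Chinese chars, curly quotes, the ellipsis char, and so on). It relies on find matching names byte by byte under LC_ALL=C. The function name non_ascii_names is only illustrative.

```shell
# non_ascii_names DIR
# Print paths under DIR whose file name has a byte outside printable ASCII.
non_ascii_names() {
    # LC_ALL=C makes find match byte by byte; [! -~] means "any byte not
    # in the range space..tilde", that is, not printable ASCII.
    LC_ALL=C find "${1:-.}" -name '*[! -~]*' -print
}
```

Copy a few of the listed files with your chosen method, then compare the names on the other machine before moving everything.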

Another issue is whether file date, owner/group, permissions, etc. are preserved. This is important when you are dealing with software data. This again depends on the tool. In general, my experience is that you should expect these data to be lost.
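A quick way to spot-check one of these, the modification date, is to compare a file with its copy. The sketch below covers only the modification time, not owner/group or permissions. The function name same_mtime is hypothetical, and the -nt and -ot tests are supported by bash but are not in strict POSIX.

```shell
# same_mtime FILE1 FILE2
# Succeed (exit status 0) if neither file is newer than the other.
same_mtime() {
    # -nt is "newer than", -ot is "older than" (by modification time).
    if [ "$1" -nt "$2" ] || [ "$1" -ot "$2" ]; then
        return 1
    fi
    return 0
}
```

For example, same_mtime original.txt copy.txt || echo "date changed" (both file names are placeholders) prints a note when the transfer tool did not keep the date.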


Misc notes:

Mac version of AOL Instant Messenger saves its chat log's file names like this:

xahlee’s Logs
 IM “Gesutus”-2003-12.html

node60091’s Logs
 IM “rogerhoward@mac…”-2004.02.html

Note the curly single quote, curly double quotes, and the ellipsis char.

When rsync in Cygwin on Windows Vista is used to sync files from Windows to Mac, and a file name on the Windows side contains single or double curly quotes, rsync will choke.

It'll also choke if the file name contains the ellipsis char “…”. Here's a sample error:

xah@xah-PC ~
$ Documents/scripts/sync_pc_mac.sh
Password:
building file list ... done
rsync: recv_generator: failed to stat "/Users/xah/Documents/tavla_vreji/AIM logs
 2005/node60091's Logs/IM [rogerhoward@mac.]-2004.02.html": Invalid argument (22
)
xah@xah-PC ~
$ rsync --version
rsync  version 3.0.4  protocol version 30
Copyright (C) 1996-2008 by Andrew Tridgell, Wayne Davison, and others.
Web site: http://rsync.samba.org/
Capabilities:
    64-bit files, 64-bit inums, 32-bit timestamps, 64-bit long ints,
    socketpairs, hardlinks, symlinks, no IPv6, batchfiles, inplace,
    append, ACLs, no xattrs, iconv, symtimes

rsync comes with ABSOLUTELY NO WARRANTY.  This is free software, and you
are welcome to redistribute it under certain conditions.  See the GNU
General Public Licence for details.