Converting Email Formats

By Xah Lee. Date:

I have over 15 years' worth of email in my current Apple Mail on my Mac. The earliest dates to 1996. On average, about every 2 or 3 years, i go thru an email conversion. It may be from one email application to another, or an export/import forced by an app's upgrade. In the late 1990s, it was several upgrades of Eudora and of the Mac version of Outlook Express. In the early 2000s, it was from Outlook Express to Apple Mail, followed by several version upgrades of Apple Mail that came with new versions of OS X in its fickle beginnings. In these years, i've also used quite a variety of email apps on {Mac, Windows, Unixes}, such as various ones in emacs (rmail, vm, gnus) and several unix ones (mail, pine, mutt). I have also used the email app in Netscape Communicator, and from 2002 to 2004 used Windows's Outlook Express at work. I have also for some period tried Opera Browser's mail, as well as Mozilla Thunderbird. So, all in all, i've had quite some experience with mail conversions.

Generally speaking, you always lose some data. The conversion may be from an upgrade of the same app, or between different apps, or between different OSes. If the apps in question are on a typical upgrade/conversion path, are popular competing commercial ones, or are on the same OS (for example, Eudora, Mac Outlook Express, Netscape Communicator in the late 1990s, or Apple Mail upgrades in the early 2000s), there may be a built-in menu command for importing your mail. Otherwise you are out of luck, and have to do manual tech geeking, with possibly a few days' worth of probing and text processing. After the import, some foreign characters are likely to be faaked up, some headers such as date or subject will be faaked, the read/unread status will be gone, formatting of lines can be faaked up, plain-text vs html vs rich text can be faaked, thread info if it exists will certainly be gone, rating/priority/marking or such features will be gone, and attachments can be lost.

Still, the most important part, the text itself, will largely be preserved.

There are many, many reasons for these conversion faaking problems. First of all, there is no one standardized and robust email format. There is the mbox plain text format, invented by the motherfaaking unixers. Almost no 2 email apps that use seemingly the same mbox format actually use the same format, and some, such as Mac Microsoft Outlook, use a database system instead (which is very good!). Then, there are the imprecise and badly designed RFC mail protocols invented by the motherfaaking unixers. (RFC = Really Faaking Common.) Then, there are several formats for rich text email (for example, Enriched text, HTML e-mail, MIME), none of which is anywhere near precise. (The Apple motherfaaking fanatics, in the early 2000s, tried to push Enriched text as the standard instead of HTML mail.) Then, there's the EOL char disagreement among Unix, Mac, Windows. Then, there's the transmission encoding issue, since the unixer faakfaces's mail protocol only does 7-bit ASCII. Then, there are the char set issues (for example, for different languages), especially in the 1990s or early 2000s, resulting in incorrectly encoded text or incorrectly tagged charset labels. Then, each email app may have its own features, such as various markings (for example, read/unread, replied/unreplied, mail priority), and there are features such as link to reply, threading, etc.
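To make the 7-bit and charset points concrete, here is a minimal Python sketch using only the standard library. The sample subject and body are made up for illustration; they are not taken from any real mailbox. It shows a header and a body squeezed into 7-bit ASCII, and what happens when the charset label does not match the actual bytes.

```python
import quopri
from email.header import decode_header, make_header

# A non-ASCII subject as it travels over 7-bit mail: an RFC 2047 "encoded-word".
# (Made-up sample value.)
raw_subject = "=?iso-8859-1?Q?Caf=E9_menu?="


def decode_subject(raw):
    """Decode an RFC 2047 encoded header into a normal string."""
    return str(make_header(decode_header(raw)))


def decode_body(qp_bytes, charset):
    """Undo quoted-printable transfer encoding, then apply the labeled charset."""
    return quopri.decodestring(qp_bytes).decode(charset, errors="replace")


# The body bytes are UTF-8, wrapped in quoted-printable so every byte is 7-bit ASCII.
qp_body = b"na=C3=AFve caf=C3=A9"

good = decode_body(qp_body, "utf-8")       # label matches the bytes
bad = decode_body(qp_body, "iso-8859-1")   # wrong label: garbled accents

print(decode_subject(raw_subject))
print(good)
print(bad)
```

An importer that trusts a wrong charset label, or ignores the label and guesses, produces the garbled third line. That is one likely source of the faaked up foreign characters after a conversion.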

See also: “Unix and the mbox Email Format”; “Unix, RFC, Line Truncation”

As to what i'm going to do for my Apple Mail email archive… i'm not sure yet. Apple Mail uses “.emlx”, while Windows Mail uses “.eml”; both are variants of a one-file-per-mail plain text format. I could spend a few hours to marshal my Apple Mails, put them in a dir structure, and coax Windows Mail to import or see them. It will probably take a day, and the resulting import will probably not be that great. Or, i could just leave my Apple Mail in plain text as a historical archive, to be searched by grep. We'll see.
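For the “.emlx” to “.eml” route, a rough Python sketch could look like the following. It rests on an assumption about the layout of an “.emlx” file: a first line holding the message's byte count, then the raw message, then an XML property list with Apple Mail's own metadata. The function name `emlx_to_eml` is made up here, and the sample data is synthetic, not a real Apple Mail file.

```python
def emlx_to_eml(data: bytes) -> bytes:
    """Extract the raw message from the bytes of one .emlx file.

    Assumed layout: the first line is the message length in bytes (ASCII digits),
    then the message itself, then an XML property list of Apple Mail metadata.
    """
    first_line, sep, rest = data.partition(b"\n")
    if not sep or not first_line.strip().isdigit():
        raise ValueError("no leading byte count; not the assumed .emlx layout")
    length = int(first_line.strip())
    if length > len(rest):
        raise ValueError("byte count exceeds file size")
    # Everything after the counted bytes is the metadata trailer; drop it.
    return rest[:length]


# Synthetic example built to match the assumed layout.
message = b"From: a@example.com\nSubject: hi\n\nhello\n"
plist = b'<?xml version="1.0" encoding="UTF-8"?>\n<plist version="1.0"><dict/></plist>\n'
emlx = str(len(message)).encode("ascii") + b"\n" + message + plist

eml = emlx_to_eml(emlx)
print(eml.decode("ascii"))
```

Under that assumption, app-specific markings such as read/unread would sit in the trailing property list, so a conversion like this keeps the text and headers but drops the markings, which matches the kind of loss described above.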