This page explains some technical details of how Mathematica handles Unicode.
Mathematica supports Unicode, but does not use Unicode when saving to a file. (See: UNICODE Basics: What's Character Encoding, UTF-8, and All That?)
Mathematica (Ɱ) files use 7-bit ASCII only. 〔☛ Mathematica Notebook Technology〕
How does it support Unicode if it uses only ASCII?
Ɱ has a set of special characters with the syntax \[name]. For example, when you type \[Alpha], it is displayed as “α”.
(All built-in symbols in Ɱ start with a capital letter.)
You can think of them as HTML's “named character entities”. 〔☛ Character Sets and Encoding in HTML〕 There are about 900 named chars. For the complete list, see: Listing of Named Characters.
Many of the named chars are also in Unicode, but not all. Similarly, many math symbols in Unicode are not in this list. Also, Unicode's Chinese chars, Arabic letters, etc. are not among Ɱ's named chars.
When you paste a Unicode char into Ɱ, Ɱ tries to interpret it as one of the named chars.
For example, if you paste “α” (GREEK SMALL LETTER ALPHA; U+03B1), it automatically becomes Mathematica's \[Alpha], and is displayed as “α”.
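The lookup described above can be pictured with a small Python sketch. The table here is a tiny hand-picked sample, not Ɱ's real list of about 900 named chars, and the function name `to_named_char` is made up for illustration. It only mimics the visible behavior and is not how Ɱ is actually implemented.

```python
# Hypothetical sample of the named-char table: Unicode char -> Mathematica name.
# The real list has about 900 entries; only three are included here.
NAMED_CHARS = {
    "\u03b1": "Alpha",         # GREEK SMALL LETTER ALPHA
    "\u03c0": "Pi",            # GREEK SMALL LETTER PI
    "\u2265": "GreaterEqual",  # GREATER-THAN OR EQUAL TO
}


def to_named_char(ch):
    """Return the \\[name] form for ch, or None if ch is not in the sample table."""
    name = NAMED_CHARS.get(ch)
    return None if name is None else "\\[" + name + "]"


print(to_named_char("\u03b1"))  # \[Alpha]
print(to_named_char("\u6c34"))  # None (a Chinese char, not in the table)
```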
For any Unicode char that is not one of Ɱ's named chars (such as Chinese chars), the syntax is \:nnnn, where nnnn is the char's Unicode code point written as 4 hexadecimal digits. For example, the Chinese char “水” (water), whose Unicode hex is “6c34”, in Ɱ is: \:6c34
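As a rough illustration of the \:nnnn form, here is a small Python sketch that converts a char to that notation and back. The function names are made up for this example, and it handles only code points that fit in 4 hex digits, since that is the only form described here.

```python
def to_hex_escape(ch):
    """Return the \\:nnnn form of a single char, using 4 lowercase hex digits."""
    code = ord(ch)
    if code > 0xFFFF:
        raise ValueError("code point does not fit in 4 hex digits")
    return "\\:{:04x}".format(code)


def from_hex_escape(text):
    """Convert a \\:nnnn string back to the char it stands for."""
    if len(text) != 6 or not text.startswith("\\:"):
        raise ValueError("expected the form \\:nnnn")
    return chr(int(text[2:], 16))


print(to_hex_escape("\u6c34"))                # \:6c34
print(from_hex_escape("\\:6c34") == "\u6c34")  # True
```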
The above roughly summarizes how Ɱ takes Unicode as input.
Of the named chars, many have special meaning in Ɱ. For example, \[Pi] is automatically considered identical to the built-in symbol Pi, which means the mathematical constant. (So, if you type N[\[Pi]] or N[\:03c0], either is displayed as N[π] with the meaning of N[Pi], and if you evaluate it, you get “3.14159”.) Here are some examples of named chars with special meaning.
|Glyph||Mathematica's name||Unicode name||Unicode hexadecimal||Default Interpretation|
|≥||\[GreaterEqual]||GREATER-THAN OR EQUAL TO||2265||GreaterEqual|
|π||\[Pi]||GREEK SMALL LETTER PI||03c0||Pi|
Note: it appears to be possible to override the default interpretation of a named char as a built-in symbol (function, constant), for all or some of the named chars. (I haven't investigated how yet.) See: MakeExpression.
Some of the named chars have one or more aliases for ease of input. For example, to enter α, you can type 【Esc】a【Esc】 or 【Esc】alpha【Esc】.
See: http://reference.wolfram.com/mathematica/tutorial/Introduction-ListingOfNamedCharacters.html. Quote:
- Characters that are alternatives to standard keyboard operators use these operators as their aliases.
- Most single-letter aliases stand for Greek letters.
- Capital-letter characters have aliases beginning with capital letters.
- When there is ambiguity in the assignment of aliases, a space is inserted at the beginning of the alias for the less common character (e.g. 【Esc】->【Esc】 for \[Rule] and 【Esc】 ->【Esc】 for \[RightArrow]).
- ! is inserted at the beginning of the alias for a Not character.
- TeX aliases begin with a backslash \.
- SGML aliases begin with an ampersand &.
- User-defined aliases conventionally begin with a dot or comma.
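The prefix conventions quoted above can be summarized as a small Python sketch. The function name and the returned labels are invented for this illustration. It applies only the rules in the quote, by looking at the first character of an alias. An alias such as `->` that is itself a keyboard operator is not treated specially and simply falls through to "other".

```python
def alias_kind(alias):
    """Classify an alias by its first character, per the quoted conventions."""
    if alias.startswith("\\"):
        return "TeX"
    if alias.startswith("&"):
        return "SGML"
    if alias.startswith((".", ",")):
        return "user-defined"
    if alias.startswith("!"):
        return "Not character"
    if alias.startswith(" "):
        return "less common character"
    return "other"


print(alias_kind(" ->"))  # less common character
print(alias_kind("->"))   # other
```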
See: Special Characters.
… work in progress
You can input a special character by:
… work in progress
See also: Wikipedia:LaTeX symbols