Semantics and Symbols: Examples of Unicode Symbols Usage
This page is a miscellaneous collection of short notes on the semantics and usage of Unicode symbols.
Unicode Symbol for See, See Also
Decided to adopt a new Unicode symbol usage. On my site, i have hundreds of links of the form: (See: link title.). So, an article may have lots of “(See: …)”, sometimes a few in a paragraph. It gets annoying. I decided to replace the “(See: …)” with just “(➲ …)”, and get rid of the ending period.
Note the Unicode char used:
- ➲ (U+27B2: CIRCLED HEAVY WHITE RIGHTWARDS ARROW)
It displays fine by default in all major browsers. This is an important point to consider when you want to adopt some Unicode char. 〔see Unicode: Arrows〕
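The "char (U+XXXX: NAME)" labels used throughout this page can be generated rather than typed by hand. Below is a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` is made up for illustration. Note that it only reports the character's code point and official name. It says nothing about whether a given browser or font can display the glyph, which still has to be checked by eye.

```python
import unicodedata

def describe(ch):
    """Return a label like '➲ (U+27B2: CIRCLED HEAVY WHITE RIGHTWARDS ARROW)'."""
    # unicodedata.name() raises ValueError for unnamed code points,
    # so a fallback string is passed as the second argument.
    name = unicodedata.name(ch, "<no name>")
    return f"{ch} (U+{ord(ch):04X}: {name})"

if __name__ == "__main__":
    for ch in "➲☛➤":
        print(describe(ch))
```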
Originally, i thought of using an eye icon. The ones that i found are:
- 👀 (U+1F440: EYES)
- 👓 (U+1F453: EYEGLASSES)
but these characters are new in Unicode 6, and they do not show in browsers except Firefox on my machine. Also, they don't really fit. The eyeglasses is OK, but in a normal sized font it's illegible. Then, i thought of using
- ☞ (U+261E: WHITE RIGHT POINTING INDEX)
This char shows up, but again it is not legible without a larger font. This is an important point, because it ruled out many other iconographic symbols. I don't want to use HTML markup to enlarge the font, because that adds bulk and complexity and is distracting.
Then i thought of using an Egyptian eye icon.
- 𓂀 (U+13080: EGYPTIAN HIEROGLYPH D010) 〔see Egyptian Hieroglyph 𓂀〕
It was quickly ruled out because none of them shows up.
Then i thought of just using an arrow, like this: (→ Syntax Design: Use of Unicode Matching Brackets as Specialized Delimiters), but the left paren and arrow combo made it look like an emoticon. This is another interesting discovery. It means that certain sequences of symbols may create out-of-context side effects that you want to avoid. (similar to the rise of ligatures)
So, in the end, i thought some other arrow might work. Thus i ended up with ➲. This is not the final choice; i might change it to something else in the future.
Why did i change the clearer “(See: …)” to an icon? It's not a critical change, nor a necessary one. The change does not matter much to readers. In fact, it probably reduces the meaning of the text a tiny bit, because when you use the English word “see”, the meaning is clear and the usage idiomatic, but with a symbolic icon it becomes rather cryptic. But the advantage of the change is that it makes the text more systematic, and amenable to parsing. (kinda like a micro markup)
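To illustrate the "amenable to parsing" point, here is a minimal sketch that pulls every “(➲ title)” reference out of a piece of text. The pattern and the function name `find_see_refs` are assumptions made for this example: it assumes a title never contains parentheses itself, and it is not the tooling actually used on the site.

```python
import re

# Hypothetical pattern: an opening paren, the ➲ marker, a space,
# then a title that contains no parentheses, then a closing paren.
SEE_REF = re.compile(r"\(➲ ([^()]+)\)")

def find_see_refs(text):
    """Return the titles of all '(➲ title)' references found in text."""
    return [title.strip() for title in SEE_REF.findall(text)]

if __name__ == "__main__":
    sample = "Arrows are handy (➲ Unicode: Arrows). A plain (see this) is ignored."
    print(find_see_refs(sample))
```

With the English form, a parser would have to guess whether “(See: …)” is a cross reference or ordinary prose. A dedicated marker character removes that guesswork.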
2013-08-15 addendum. I decided to use this:
- ☛ (U+261B: BLACK RIGHT POINTING INDEX)
2014-08-14 addendum. I decided to use this:
- ➤ (U+27A4: BLACK RIGHTWARDS ARROWHEAD)
Changed my site's “see also” Unicode marker from [☛ link] to [➤ link].
The impetus for the change is that, on the Motorola Xoom tablet, the pointing finger ☛ isn't showing. Btw, that is the traditional symbol for index, and it also carries the semantic of See Also, but it fell out of use as a standard printer's punctuation. (You can read about it on Wikipedia.)
Since it's not showing, i changed to the arrow ➤ instead. The advantage is that it more intuitively reads to modern readers as a “see also” mark, and it shows on all devices, as far as i know. The shape is simpler than the pointing index finger, so it shows more clearly on devices. The minor disadvantage is that it doesn't have the classic semantic of “See Also”.
Use of Unicode Subscript Digit Characters
Found a new use of Unicode subscript characters. These: {₀ ₁ ₂ ₃ ₄ ₅ ₆ ₇ ₈ ₉}. Many of my web articles are divided into several pages. For example, one is titled “Algorithmic Mathematical Art (page 1)”, and the second page would be titled “Algorithmic Mathematical Art (page 2)”, etc. I find those “(page 1)” etc too verbose and distracting. Now i just use subscripts. Like this:
- Algorithmic Mathematical Art
- Algorithmic Mathematical Art ₂
- Algorithmic Mathematical Art ₃
Am not completely satisfied. I think i should add a subscript “p” there too, like these: {ₚ₁ ₚ₂ ₚ₃}, so it makes the semantics more unique. With just subscript digits, it's still too syntactically ambiguous, because subscript digits could appear in lots of other places for different purposes. But for now, i'll let it be.
As of today, i've reverted. Using the subscript does not work on some tablets or mobile phones, is too much in-your-face, and has problems with search engines.
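For reference, the subscript titles described above could be generated mechanically. The following is only a sketch: the function name `paged_title` and the rule that page 1 gets no marker are assumptions read off the example list, and the optional `prefix` argument models the subscript “p” variant.

```python
# Translation table from ASCII digits to Unicode subscript digits.
SUBSCRIPT_DIGITS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")

def paged_title(title, page, prefix=""):
    """Append the page number as subscript digits; page 1 is left bare.

    prefix may be "ₚ" (U+209A) to get the 'ₚ₂' style instead of '₂'.
    """
    if page == 1:
        return title
    return f"{title} {prefix}{str(page).translate(SUBSCRIPT_DIGITS)}"

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(paged_title("Algorithmic Mathematical Art", n))
    print(paged_title("Algorithmic Mathematical Art", 12, prefix="ₚ"))
```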
Unicode use: The Wave Dash 〜
This is the nth episode of Xah's rectification of typographical convention!
〔see The Writing Style of Xah Lee〕
〔see The Moronicities of Typography: Hyphen, Dash, Quotation Marks, Apostrophe〕
Today, i decided to use the Unicode WAVE DASH 〜 for date range.
- 〜 (U+301C: WAVE DASH)
Traditionally, it's done with an EN DASH –. However, that has ambiguity problems with the hyphen and minus sign. This is especially important in scientific contexts. Quote from Wikipedia, Dash:
The Guide for the Use of the International System of Units (SI) recommends that when a number range might be misconstrued as subtraction, the word “to” should be used instead of an en dash. For example, “a voltage of 50 V to 100 V” is preferable to using “a voltage of 50–100 V”. It is also considered inappropriate to use the en dash in place of the words to or and in phrases that follow the forms from … to … and between … and ….[9][10]
The sources [9] [10] are:
- Judd, Karen. Copyediting: A Practical Guide. Crisp Publications. ISBN 1-56052-608-4.
- Loberger, Gordon; Welsh, Kate Shoup. Webster's New World English Grammar Handbook. New York: Hungry Minds. ISBN 0-7645-6488-9.
A couple years ago, ~2009, i tested how browsers display the wave dash. At the time, some browsers didn't show the char. But today, all major browsers do. At the time, i decided to use the FIGURE DASH ‒ for date range. Now, i have replaced all of them on my site with the wave dash. About 480 occurrences.
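A site-wide replacement like this can be scripted. The sketch below is one possible approach, not the procedure actually used: it assumes date ranges look like two 4-digit years joined by a figure dash or en dash, and the names `YEAR_RANGE` and `to_wave_dash` are made up for the example. Restricting the match to 4-digit years leaves number ranges such as “50–100 V” and hyphenated strings such as ISBNs alone.

```python
import re

FIGURE_DASH = "\u2012"  # ‒
EN_DASH = "\u2013"      # –
WAVE_DASH = "\u301c"    # 〜

# Assumption: a date range is two 4-digit years joined by a figure dash or en dash.
YEAR_RANGE = re.compile(rf"\b(\d{{4}})[{FIGURE_DASH}{EN_DASH}](\d{{4}})\b")

def to_wave_dash(text):
    """Return (new_text, count): year ranges rewritten with WAVE DASH, and how many."""
    return YEAR_RANGE.subn(rf"\g<1>{WAVE_DASH}\g<2>", text)

if __name__ == "__main__":
    new_text, count = to_wave_dash("Active 1993\u20122009, then 2010\u20132013.")
    print(new_text, count)
```

Returning the count alongside the text makes it easy to total the number of occurrences changed across many files.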
For examples where date range happens a lot, see:
Incorrect Glyph in Font for Wave Dash
Note that the wave dash 〜 should go up first then down, like a long TILDE ~. However, there was some mixup in the character's history among encodings, so that some font designs got the character inverted. For example, here's Arial Unicode MS: 〜. (You can see it only if you have the font “Arial Unicode MS” installed.)
Use of ALMOST EQUAL TO ≈
Also, i often use the TILDE ~
- ~ (U+7E: TILDE)
in front of a year or number to indicate an approximate date or quantity, for example, {“I use Dvorak layout since ~1993”, “Place a ~5 ㎝ thick book in front of the keyboard”}. I've been trying to find a proper symbol for that. The closest is the
- ≈ (U+2248: ALMOST EQUAL TO)
But am not sure about that, because in some strict sense that symbol should go between 2 quantities, as a relation. Anyhow, today i decided to adopt this symbol in front of a date/number to indicate an approximate/inexact date/quantity. Not completely satisfied, but it's still better than the promiscuous tilde. For files with several uses of ≈, see:
Unicode Approx Equal ≈ vs Tilde ~
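The tilde is "promiscuous" in the sense that it also appears in paths and URLs, so a blind search-and-replace would be risky. The sketch below shows one hedged way to convert only the "approximately" use. The rule is an assumption made for this example: a tilde counts only when it is directly followed by a digit and is not preceded by a word character, a slash, or another tilde. The name `tilde_to_almost_equal` is hypothetical.

```python
import re

# Assumption: an "approximate" tilde starts a token and sits directly before a digit.
# The lookbehind skips tildes inside words, paths, and URLs.
APPROX_TILDE = re.compile(r"(?<![\w/~])~(?=\d)")

def tilde_to_almost_equal(text):
    """Replace a leading approximation tilde with ALMOST EQUAL TO (U+2248)."""
    return APPROX_TILDE.sub("\u2248", text)

if __name__ == "__main__":
    print(tilde_to_almost_equal("I use Dvorak layout since ~1993"))
    print(tilde_to_almost_equal("cd ~/bin"))  # left unchanged
```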
Unicode Characters for Centimeter
Discovered special Unicode chars for Centimeter.
- ㎝ (U+339D: SQUARE CM)
- ㎠ (U+33A0: SQUARE CM SQUARED)
- ㎤ (U+33A4: SQUARE CM CUBED)
Note: cm is used a lot in Taiwan, but not in USA. In USA, if metric length is used at all, it's more common to see millimeter (mm).
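One thing worth knowing before adopting these: they are compatibility characters, so Unicode normalization form NFKC folds each one into plain ASCII letters and digits. Software that normalizes text this way will not see a distinct symbol. The sketch below only demonstrates that folding with Python's standard `unicodedata` module; the helper name `fold_compat` is made up for the example.

```python
import unicodedata

def fold_compat(text):
    """Apply NFKC normalization, which replaces compatibility characters."""
    return unicodedata.normalize("NFKC", text)

if __name__ == "__main__":
    for ch in "㎝㎠㎤":
        print(f"U+{ord(ch):04X} {ch} -> {fold_compat(ch)!r}")
```

Note that the squared and cubed forms lose their superscript under this folding, so the result reads as a plain digit after “cm”.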
Unicode, Encoding, Escape Sequence, Issues
- Unicode Symbol for โe.g.โ (exempli gratia)
- Semantics and Symbols: Examples of Unicode Symbols Usage
- Semantic of Symbol: Unicode Ellipsis Symbol vs Dot Dot Dot
- Problems of Symbol Congestion in Computer Languages; ASCII Jam vs Unicode
- Programing Language Design: String Syntax
- Syntax Design: Use of Unicode Matching Brackets as Specialized Delimiters
- Unicode Semantics: the ∀ in Turn A Gundam
- URL Percent Encoding and Unicode
- URL Percent Encoding and Ampersand Char
- Semantic of Symbols: HTML Entities, Ampersand, Unicode