Unicode: Byte Order (Endianness)

By Xah Lee.

Byte order (also known as endianness: big-endian vs little-endian) is the order in which the bytes of a multi-byte unit are stored in a file or sent in a binary transmission.

For example, take the character 🤡 (U+1F921 CLOWN FACE).

In UTF-16 encoding, it takes 4 bytes: D83E DD21. (Each hexadecimal digit represents 4 binary digits. So, 2 hexadecimal digits are 8 binary digits, thus 1 byte.)
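As a quick check of the byte count, here is a minimal Python sketch. It assumes Python 3 and its built-in "utf-16-be" codec, and the variable names are just for illustration.

```python
# Minimal sketch: count the UTF-16 bytes of the clown face character.
# Assumes Python 3. Variable names are illustrative, not from the article.
ch = "\U0001F921"  # CLOWN FACE

# Encode with an explicit byte order, so no byte order mark is prepended.
data = ch.encode("utf-16-be")

hex_text = data.hex().upper()  # 2 hexadecimal digits per byte

print(hex_text)           # D83EDD21
print(len(data))          # 4 (bytes)
print(len(hex_text) * 4)  # 32 (binary digits, 4 per hexadecimal digit)
```

The 8 hexadecimal digits give 32 binary digits, which is 4 bytes.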

In UTF-16, the minimum number of bytes for a character is 2. So, UTF-16 groups every 2 bytes into a single unit, called a code unit. Byte order determines which of the 2 bytes in each code unit comes first.
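To see how byte order applies to 2-byte code units, here is a small Python sketch. It assumes Python 3 with the standard "utf-16-be" and "utf-16-le" codecs and the struct module. The variable names are made up for this example.

```python
import struct

# Sketch: the same character in both UTF-16 byte orders.
# Assumes Python 3. Variable names are illustrative only.
ch = "\U0001F921"  # CLOWN FACE

be = ch.encode("utf-16-be")  # big-endian: high byte of each code unit first
le = ch.encode("utf-16-le")  # little-endian: low byte of each code unit first

print(be.hex())  # d83edd21
print(le.hex())  # 3ed821dd

# Read the bytes back as two unsigned 16-bit code units.
units_be = struct.unpack(">2H", be)  # ">" means big-endian
units_le = struct.unpack("<2H", le)  # "<" means little-endian

print([f"{u:04X}" for u in units_be])  # ['D83E', 'DD21']
print(units_be == units_le)            # True
```

Note that the bytes are swapped inside each code unit only. The order of the code units themselves (D83E, then DD21) stays the same in both byte orders.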

Origin of the jargon Big-Endian, Little-Endian

The terms big-endian and little-endian for byte order came from an article written by Danny Cohen, published in 1980.

[ON HOLY WARS AND A PLEA FOR PEACE By Danny Cohen. At Endian_war_1980_Danny_Cohen.txt ]

It alludes to Jonathan Swift's 1726 satire Gulliver's Travels, Part I, A Voyage to Lilliput, where the people of Lilliput and Blefuscu fight about which end of an egg to crack first.
