Unicode: Surrogate Pair
What is a Surrogate Pair
A Surrogate Pair is a mechanism in UTF-16 Encoding that represents a single character using two 16-bit code units.
A Surrogate Pair is used to represent characters whose code point is greater than U+FFFF (decimal 65535, that is, 2^16 - 1).
A Surrogate Pair is a sequence of two Surrogate Code Points: a High-Surrogate code point followed by a Low-Surrogate code point.
- High-Surrogate: Code points in the range U+D800 to U+DBFF (1024 of them).
- Low-Surrogate: Code points in the range U+DC00 to U+DFFF (1024 of them).
Surrogate Code Point Range
| | start | end |
| High-Surrogate | U+D800 (decimal 55296) | U+DBFF (decimal 56319) |
| Low-Surrogate | U+DC00 (decimal 56320) | U+DFFF (decimal 57343) |
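The ranges above are enough to convert between a code point and its Surrogate Pair. The following is a minimal Python sketch of the standard UTF-16 arithmetic: subtract 0x10000, then put the top 10 bits of the result into the High-Surrogate and the bottom 10 bits into the Low-Surrogate. The function names `to_surrogate_pair` and `from_surrogate_pair` are made up for this illustration and are not part of any library.

```python
from typing import Tuple

HIGH_START = 0xD800  # first High-Surrogate
LOW_START = 0xDC00   # first Low-Surrogate


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a code point above U+FFFF into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point must be in U+10000..U+10FFFF")
    offset = code_point - 0x10000           # 20-bit value
    high = HIGH_START + (offset >> 10)      # top 10 bits
    low = LOW_START + (offset & 0x3FF)      # bottom 10 bits
    return high, low


def from_surrogate_pair(high: int, low: int) -> int:
    """Combine a (high, low) surrogate pair back into one code point."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("not a High-Surrogate")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("not a Low-Surrogate")
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)


# U+1F600 (GRINNING FACE) is encoded in UTF-16 as D83D DE00.
high, low = to_surrogate_pair(0x1F600)
print(f"U+{high:04X} U+{low:04X}")
```

Since each half carries 10 bits, the 1024 High-Surrogates times the 1024 Low-Surrogates give 1,048,576 pairs, which is exactly the number of code points from U+10000 to U+10FFFF.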
Surrogate Code Points never represent any character
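A Surrogate Code Point never represents a character by itself. It is meaningful only as one half of a Surrogate Pair, and the range U+D800 to U+DFFF is reserved for that purpose.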
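One way to see this in practice is the following small Python sketch. It assumes CPython's built-in codecs and the standard `unicodedata` module. The helper name `is_encodable` is made up for this illustration. A lone surrogate has the general category `Cs` (Surrogate), and the strict UTF-8 encoder refuses to encode it.

```python
import unicodedata


def is_encodable(text: str) -> bool:
    """Return True if text can be encoded as strict UTF-8."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        # Raised for lone surrogates: they are not characters.
        return False


lone_high = "\ud800"      # a High-Surrogate with no Low-Surrogate after it
emoji = "\U0001F600"      # a real character above U+FFFF

print(unicodedata.category(lone_high))  # general category of the surrogate
print(is_encodable(lone_high))
print(is_encodable(emoji))
```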