Unicode: Codepoint
What Is a Codepoint
Each Unicode character is assigned a unique ID. This ID is a non-negative integer, starting at 0, called the character's codepoint.
💡 TIP: A better name would be simply “character ID”.
A codepoint is written in either decimal or hexadecimal.
Char | Name | Codepoint (decimal) | Codepoint (hexadecimal) | UTF-8 encoding | UTF-16 encoding (big-endian) |
---|---|---|---|---|---|
a | LATIN SMALL LETTER A | 97 | 61 | 61 | 00 61 |
α | GREEK SMALL LETTER ALPHA | 945 | 3B1 | CE B1 | 03 B1 |
😂 | FACE WITH TEARS OF JOY | 128514 | 1F602 | F0 9F 98 82 | D8 3D DE 02 |
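The values in such a table can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original page: the helper name `describe` is made up for illustration, and it assumes Python 3.8 or later (for the separator argument of `bytes.hex`). UTF-16 is shown in big-endian byte order, and since UTF-16 uses at least two bytes per character, the letter a comes out as `00 61`.

```python
def describe(ch: str) -> dict:
    """Return the codepoint and encoded bytes of a single character.

    `describe` is a hypothetical helper name used only for this example.
    """
    cp = ord(ch)  # the character's codepoint as an integer
    return {
        "decimal": cp,
        "hex": format(cp, "X"),
        # bytes.hex(" ") separates the bytes with spaces (Python 3.8+)
        "utf8": ch.encode("utf-8").hex(" ").upper(),
        # "utf-16-be" is big-endian UTF-16 without a byte order mark
        "utf16": ch.encode("utf-16-be").hex(" ").upper(),
    }


if __name__ == "__main__":
    for ch in "aα😂":
        print(ch, describe(ch))
```

The character 😂 needs four bytes in UTF-16 because its codepoint is above FFFF (hexadecimal), so it is encoded as a surrogate pair.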
Standard Notation for Codepoint
The standard notation for a codepoint is “U+” followed by the codepoint in hexadecimal, padded with leading zeros to at least four digits. For example:
U+03B1
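As a rough illustration, the notation can be generated in Python from any character. The function name `u_plus` is an assumption made for this sketch. It follows the usual convention of uppercase hexadecimal padded with leading zeros to at least four digits, so α is printed as U+03B1.

```python
def u_plus(ch: str) -> str:
    """Format a single character's codepoint in U+ notation.

    `u_plus` is a hypothetical name chosen for this example.
    """
    # 04X: uppercase hexadecimal, zero-padded to at least 4 digits
    return f"U+{ord(ch):04X}"


if __name__ == "__main__":
    for ch in "aα😂":
        print(ch, u_plus(ch))
```

Codepoints above FFFF simply use more digits, with no extra padding.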
How to Find a Character's Codepoint
- Paste the character into Unicode Search.
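If a Python interpreter is at hand, the same lookup can be sketched with the standard library. The helper name `lookup` is made up for this example, and the names returned depend on the Unicode version bundled with the Python build in use.

```python
import unicodedata


def lookup(ch: str):
    """Return (codepoint, official name) for a single character.

    `lookup` is a hypothetical helper name for this sketch.
    """
    # The second argument to unicodedata.name is a fallback for
    # characters that have no name in the bundled Unicode database.
    return ord(ch), unicodedata.name(ch, "<no name>")


if __name__ == "__main__":
    for ch in "aα😂":
        print(ch, lookup(ch))
```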
How to Find a Character, Given Its Codepoint
- Paste the character's codepoint into Unicode Search.
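The reverse direction can also be sketched in Python with the built-in `chr`. The function name `from_codepoint` is an assumption for illustration, and the sketch assumes the input is hexadecimal, with or without the “U+” prefix.

```python
def from_codepoint(text: str) -> str:
    """Return the character for a hexadecimal codepoint such as 'U+03B1'.

    `from_codepoint` is a hypothetical name used only in this example.
    The input is assumed to be hexadecimal, with an optional 'U+' prefix.
    """
    digits = text.strip()
    if digits.upper().startswith("U+"):
        digits = digits[2:]  # drop the "U+" prefix
    # int(..., 16) parses hexadecimal; chr maps the integer to a character
    return chr(int(digits, 16))


if __name__ == "__main__":
    for cp in ("U+0061", "U+03B1", "1f602"):
        print(cp, from_codepoint(cp))
```

For a decimal codepoint, `chr` can be called directly on the integer, for example `chr(945)`.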
Unicode and Encoding Explained
- Unicode: Character Set, Encoding, UTF-8, Codepoint
- Unicode: Codepoint
- Unicode: Character Name
- ASCII Characters
- Unicode: UTF-8 Encoding
- Unicode: UTF-16 Encoding
- Unicode: Surrogate Pair
- Unicode: Byte Order (Endianness)
- Unicode: BOM, Byte Order Mark
- Set Text Editor File Encoding
- Unicode Letter Character
- Unicode: Variation Selector