HTML: Character Sets and Encoding

By Xah Lee.

In HTML, you can declare the character set for the file like this:

<meta charset="utf-8" />

For HTML 4, use this:

<meta http-equiv="Content-Type" content="text/html;charset=utf-8">

Once you have declared your character set, you can use characters from that character set in your HTML file.

Unicode contains characters for nearly all the world's languages, and UTF-8 is an encoding that can represent every one of them. Here is a sample of characters from Unicode:

© é 😂
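
To make the link between Unicode characters and UTF-8 concrete, here is a small Python sketch that prints the code point and the UTF-8 bytes of each sample character above. The helper name `utf8_hex` is made up for this illustration.

```python
# The three sample characters, written as escapes so the source file's own
# encoding cannot change them: © é 😂
samples = ["\u00a9", "\u00e9", "\U0001F602"]


def utf8_hex(ch):
    """Return the UTF-8 bytes of one character as space-separated hex."""
    return ch.encode("utf-8").hex(" ")


for ch in samples:
    # Code point (the character's number in Unicode), then its UTF-8 bytes.
    print(f"{ch}  U+{ord(ch):04X}  {utf8_hex(ch)}")
```

Each character has a single code point, but UTF-8 stores it in 1 to 4 bytes depending on how large that code point is.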

[see Unicode Search 💋 ♥ 😄]

For Unicode, charset, and encoding basics, see: Unicode Basics: Character Set, Encoding, UTF-8.

Character Entity

Another way to show special characters in your file is to use a so-called “character entity”. For example, &copy; displays as ©.

[see HTML Entity List]
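
As a rough illustration of how entities map to characters, the sketch below uses `html.unescape` from Python's standard library to turn entities into characters, and builds numeric entities by hand. The helper name `to_numeric_entities` is hypothetical, and the sample strings are chosen only for this example.

```python
import html


def to_numeric_entities(text):
    """Replace every non-ASCII character with a decimal numeric entity."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


# Named and numeric entities both decode to the same kind of plain character.
decoded = html.unescape("&copy; &eacute; &#x1F602;")
print(decoded)

# Going the other way keeps the file pure ASCII, whatever its declared charset.
print(to_numeric_entities("caf\u00e9 \u00a9"))
```

Entities are handy when the file has to stay ASCII. With a UTF-8 file you can usually type the character directly.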

HTML/HTTP Charset is About Encoding, Not Character Set

HTTP's definition of charset (and the charset meta tag in HTML) is actually about character encoding, that is, how characters are turned into bytes.

Here's an excerpt:

[image: RFC 2616 on encoding vs char set. Source: RFC 2616 at http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.4 ]
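
The point that charset names an encoding can be seen in a short Python sketch. It reads the charset parameter from a Content-Type value with the standard library's `email.message.Message`, then shows that the same character becomes different bytes under different charset labels. The helper name `charset_from_content_type` is made up here, and the header value is just the one from the meta tag example.

```python
from email.message import Message


def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()


label = charset_from_content_type("text/html;charset=utf-8")
e_acute = "\u00e9"  # the character é

utf8_bytes = e_acute.encode(label)            # two bytes under utf-8
latin1_bytes = e_acute.encode("iso-8859-1")   # one byte under iso-8859-1
print(label, utf8_bytes.hex(" "), latin1_bytes.hex(" "))

# Decoding with the wrong label does not fail, it shows the wrong characters.
garbled = utf8_bytes.decode("iso-8859-1")
print(garbled)
```

The character is the same in both cases. Only the byte sequence changes, which is why a wrong charset declaration shows up as garbled text.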

What's HTML4 or HTML5's Default Encoding?

By spec, there's no default encoding.

An encoding must come from either the HTTP header or a meta tag in the HTML file. If neither is found, the browser must guess.

[image: HTML5 WHATWG spec on character encoding. Source: https://html.spec.whatwg.org/multipage/parsing.html#determining-the-character-encoding ]
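
The lookup described above can be sketched in Python. This is a simplified illustration, not the WHATWG algorithm, which has more steps (for example byte order mark detection). The function name `pick_encoding`, the regular expression, and the 1024 byte scan limit are assumptions made for this sketch.

```python
import re
from email.message import Message

# Loose pattern for both meta forms shown earlier (an assumption, not the
# spec's exact prescan rules).
META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?([A-Za-z0-9_\-]+)", re.IGNORECASE
)


def pick_encoding(http_content_type, html_bytes):
    """Return the declared encoding name, or None if the browser must guess."""
    # 1. The HTTP Content-Type header, if it carries a charset parameter.
    if http_content_type:
        msg = Message()
        msg["Content-Type"] = http_content_type
        charset = msg.get_content_charset()
        if charset:
            return charset
    # 2. A meta tag near the start of the file.
    match = META_CHARSET.search(html_bytes[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    # 3. Nothing declared.
    return None


page = b'<html><head><meta charset="utf-8" /></head></html>'
print(pick_encoding("text/html; charset=ISO-8859-1", page))
print(pick_encoding("text/html", page))
print(pick_encoding(None, b"<p>hi</p>"))
```

In this sketch the HTTP header wins over the meta tag when both are present, so the two should agree.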
