By Xah Lee.

Plot character frequency.

Not case sensitive, no space, no digits, no punctuation, no A-Z.

You can paste in any text: English, Spanish, computer language code, or math formulas (LaTeX).
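Here is a minimal sketch, in Python, of how such a character counter could work. It is not the code behind this page. The function name `char_frequency` and its keyword arguments are made up for illustration, loosely mirroring the filters named above. It assumes "punctuation" means ASCII punctuation only, so curly quotes and similar characters are still counted.

```python
from collections import Counter
import string


def char_frequency(text, ignore_case=True, skip_space=True,
                   skip_digits=True, skip_punctuation=True,
                   skip_letters=False):
    """Count characters in text, honoring the filter options (hypothetical names)."""
    if ignore_case:
        text = text.lower()
    counts = Counter()
    for ch in text:
        if skip_space and ch.isspace():
            continue
        if skip_digits and ch.isdigit():
            continue
        # Assumption: only ASCII punctuation is filtered.
        if skip_punctuation and ch in string.punctuation:
            continue
        # "no A-Z": drop ASCII letters, useful for looking at symbols only.
        if skip_letters and ch.isascii() and ch.isalpha():
            continue
        counts[ch] += 1
    return counts


sample = "The quick brown fox, 42 times!"
# Print the five most common characters with their counts.
for ch, n in char_frequency(sample).most_common(5):
    print(ch, n)
```

Turning on `skip_letters` while turning off the other filters gives a count of symbols and digits only, which is handy for source code or LaTeX.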

English Letter Frequency

Here's a sample plot.

English letter frequency, from the text of To Build a Fire by Jack London. [see Character Frequency Counter]

The first 6 letters, by frequency, are: “ETHANI”.
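A ranking string like this can be derived from any text with a few lines of Python. This is a sketch only: `top_letters` is a hypothetical helper, it looks at the letters a to z only, and it breaks ties alphabetically, which is an arbitrary choice.

```python
from collections import Counter
import string


def top_letters(text, n=6):
    """Return the n most frequent letters a-z in text, uppercase, most frequent first."""
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    # Sort by descending count; ties are broken alphabetically (arbitrary choice).
    ranked = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return "".join(ranked[:n]).upper()
```

Calling `top_letters(open("story.txt", encoding="utf-8").read())` on a plain-text copy of a story would give its top-6 string; the file name here is only a placeholder.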

Where Does ETAOIN Come From?

The popularly known “ETAOIN” is the frequency order from PRINTED BOOKS.

Letter frequency from printed books between the years 1800 and 2000. [English Letter Frequency Counts: Mayzner Revisited or ETAOIN SRHLDCU, by Peter Norvig. Accessed on 2017-10-03]

ETAOIN Does Not Represent Common Text

If you are designing a keyboard layout, then ETAOIN is not a good model to use.

ETAOIN does not contain H.

H is used in lots of common words, and these words are used frequently, for example: the, that, this, he, she, with, have.

The vast majority of keyboard users are likely to type only common chat text, not printed-book text such as novels and academic writing.

Using Dictionary Entry as Corpus

On Linux, you can dump the list of words used by aspell with:

aspell --lang=en dump master > words

The result is 119789 “words”. The plot is:

Letter frequency based on the aspell dictionary word list dump, in which each word is duplicated with its inflection variations. Generated 2017-05-18 with: aspell --lang=en dump master

You will notice lots of “s”, because 30823 words have a duplicate with a “'s” ending, e.g. chair, chair's.
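As a rough sketch, the “'s” entries can be separated from the rest of a word list like this. `split_possessives` is a hypothetical helper, and the sample list is invented for illustration. It takes any iterable of words, so an open file with one word per line works too.

```python
def split_possessives(words):
    """Split a word list into (plain, possessive) by the "'s" ending."""
    plain, possessive = [], []
    for w in words:
        w = w.strip()  # drop the trailing newline when reading from a file
        if not w:
            continue
        (possessive if w.endswith("'s") else plain).append(w)
    return plain, possessive


sample = ["chair", "chair's", "chairs", "worshiper", "worshiper's"]
plain, possessive = split_possessives(sample)
print(len(sample), "-", len(possessive), "=", len(plain))  # prints: 5 - 2 = 3
```

To try it on the dump, pass the file, e.g. `split_possessives(open("words", encoding="utf-8"))`. The encoding is an assumption, and whether the possessive count comes out at exactly 30823 depends on how the original count was made.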

Now, if you remove those words, you have 88966 words, but that still won't work, because the list contains all inflections of each word. For example:

  1. worshiper
  2. worshipper
  3. worshiping
  4. worshipping
  5. worshiper's
  6. worshipers
  7. worshipper's
  8. worshippers
  9. worshiped
  10. worshipped

See also: Computer Languages Characters Frequency

2017-10-03, thanks to Ian Doug for finding a source of ETAOIN.

