Character Frequency Counter

By Xah Lee.

Plot character frequency.

Options: not case sensitive, no space, no digits, no punctuation, no A-Z.

You can paste in any text: English, Spanish, computer language code, or math formulas (LaTeX).
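Here is a minimal sketch of the counting step in Python. It is not the code behind this page. The function name `char_frequency` and its parameters are made up for illustration, and the punctuation filter only knows ASCII punctuation (`string.punctuation`), which is an assumption that won't cover every text.

```python
from collections import Counter
import string


def char_frequency(text, ignore_case=True, skip_space=True,
                   skip_digits=True, skip_punctuation=True):
    """Return (character, count) pairs, most frequent first.

    Hypothetical helper: the option names are illustrative only.
    """
    if ignore_case:
        text = text.lower()
    counts = Counter()
    for ch in text:
        if skip_space and ch.isspace():
            continue
        if skip_digits and ch.isdigit():
            continue
        # Assumption: only ASCII punctuation is filtered here.
        if skip_punctuation and ch in string.punctuation:
            continue
        counts[ch] += 1
    return counts.most_common()
```

For example, `char_frequency("Hello, World 42!")` puts `('l', 3)` first and `('o', 2)` second. The first few entries of the result are what the plot shows as the tallest bars.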


English Letter Frequency

Here's a sample plot.

English letter frequency, from the text of “To Build a Fire” by Jack London.

The first 6 are: “E T H A N I”.

What Happened to E T A O I N?

The popularly known “E T A O I N” is not correct. It is not the order of letter frequency of average text, chat text, novel text, or journalism text.

“E T A O I N” is the arrangement of letters on type-casting machines, and may be based on an analysis of a particular list of dictionary entry words.

On Linux, you can dump the list of words used by aspell with:

aspell --lang=en dump master > words

The result is 119789 “words”. Here is the plot:

Letter frequency based on the aspell dictionary word list dump (aspell --lang=en dump master), 2017-05-18. Each word is duplicated with its inflection variations.

You will notice lots of “s”, because 30823 words have a duplicate with an “'s” ending.
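A rough sketch of how one might check this on the dumped file, in Python. The helper names `load_words`, `letter_counts`, and `possessive_entries` are hypothetical. The sketch assumes the dump was saved to a file named `words`, one entry per line, readable as UTF-8.

```python
from collections import Counter


def load_words(path="words"):
    """Read one entry per line. Assumes a UTF-8 file, such as the aspell dump."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def letter_counts(words):
    """Count the letters a-z across all entries, ignoring case and apostrophes."""
    counts = Counter()
    for word in words:
        counts.update(ch for ch in word.lower() if "a" <= ch <= "z")
    return counts


def possessive_entries(words):
    """Return the entries that end with 's."""
    return [w for w in words if w.endswith("'s")]
```

With the real list, `len(possessive_entries(load_words()))` is the number to compare against the count of “'s” entries, and `letter_counts(load_words()).most_common()` gives the order shown in the plot. Every “'s” entry adds one extra “s” on top of a second copy of all the letters in the base word.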

Now, if you remove those words, you have 88966 words, but that still won't work, because the list contains all inflections of each word. For example:

  1. worshiper
  2. worshipper
  3. worshiping
  4. worshipping
  5. worshiper's
  6. worshipers
  7. worshipper's
  8. worshippers
  9. worshiped
  10. worshipped
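To see why dropping the “'s” entries is not enough, here is a small Python sketch using the ten entries above. The function name `drop_possessives` is made up for illustration.

```python
def drop_possessives(words):
    """Keep only the entries that do not end with 's."""
    return [w for w in words if not w.endswith("'s")]


worship = [
    "worshiper", "worshipper", "worshiping", "worshipping", "worshiper's",
    "worshipers", "worshipper's", "worshippers", "worshiped", "worshipped",
]

remaining = drop_possessives(worship)

# Each remaining entry still starts with the same stem, so the letters
# of "worship" are counted once per inflection.
stem_copies = sum(1 for w in remaining if w.startswith("worship"))
```

After the filter, 8 of the 10 entries remain, and all 8 start with “worship”. So the letters of that one stem are still counted 8 times, which skews the frequency toward letters in heavily inflected words.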

See also: Computer Languages Characters Frequency