Character Frequency Counter
Plot character frequency.
You can paste in any text: English, Spanish, computer language code, or math formulas (LaTeX).
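For readers who want to reproduce the counts offline, here is a minimal sketch of a character frequency counter. It is not the code behind this page. The function name `char_frequency` and the `letters_only` option are made up for illustration, and skipping whitespace is an assumption about what is worth counting.

```python
from collections import Counter

def char_frequency(text, letters_only=False):
    """Return (character, count) pairs, most frequent first.

    Hypothetical helper for illustration, not the page's own code.
    """
    if letters_only:
        # Fold case and keep alphabetic characters only.
        chars = [c.upper() for c in text if c.isalpha()]
    else:
        # Count every character except whitespace (an assumption).
        chars = [c for c in text if not c.isspace()]
    return Counter(chars).most_common()

sample = "the quick brown fox jumps over the lazy dog"
# Print the six most frequent letters of the sample with their counts.
for ch, n in char_frequency(sample, letters_only=True)[:6]:
    print(ch, n)
```

With `letters_only=False` the same function also counts punctuation and symbols, which is what matters for source code or LaTeX input.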
English Letter Frequency
Here's a sample plot.
The six most frequent letters are: “ETHANI”.
Where Does ETAOIN Come From?
The popularly known “ETAOIN” is the frequency order from PRINTED BOOKS.
ETAOIN Does Not Represent Common Text
If you are designing a keyboard layout, then ETAOIN is not a good model to use.
ETAOIN does not contain H.
H is used in lots of common words, and these words are used frequently, for example:
The vast majority of keyboard users are likely to type only common chat text, not the text of printed books such as novels and academic writing.
Using Dictionary Entry as Corpus
On Linux, you can dump the list of words used by aspell with:
aspell --lang=en dump master > words
The result is 119789 “words”. The plot is:
You notice lots of “s”, because 30823 words have a duplicate with an “'s” ending, e.g. chair, chair's.
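A rough sketch of how one might count and drop the “'s” entries and then rank letters over the remaining words. It assumes the dump is in a file named `words` with one entry per line, as produced by the aspell command above. The function names `load_words`, `strip_possessives`, and `letter_order` are hypothetical, and the UTF-8 encoding is a guess about the dump.

```python
from collections import Counter

def load_words(path="words"):
    """Read one entry per line (assumed format of the aspell dump)."""
    # The encoding is an assumption; undecodable bytes are replaced.
    with open(path, encoding="utf-8", errors="replace") as f:
        return [line.strip() for line in f if line.strip()]

def strip_possessives(words):
    """Return (entries without a trailing 's, number of entries dropped)."""
    kept = [w for w in words if not w.endswith("'s")]
    return kept, len(words) - len(kept)

def letter_order(words):
    """Return the letters ordered from most to least frequent."""
    counts = Counter(c.upper() for w in words for c in w if c.isalpha())
    return "".join(ch for ch, _ in counts.most_common())

# Tiny in-memory example; for the real list use: words = load_words()
words = ["chair", "chair's", "table", "table's", "the"]
kept, dropped = strip_possessives(words)
print(len(kept), dropped)
print(letter_order(kept))
```

Each dictionary entry counts once here, no matter how often the word is used in real text.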
Now, if you remove those words, you have 88966 words, but that still won't work, because the list contains all inflections of each word. For example:
See also: Computer Languages Characters Frequency
2017-10-03, thanks to Ian Doug (http://iandoug.com/) for finding a source of ETAOIN.