Pinyin Letter Frequency 拼音字母頻率

By Xah Lee.
Pinyin letter frequency chart. [see Character Frequency Plot]

The text used is a Chinese translation of “The Masque of the Red Death” by Edgar Allan Poe.

[see The Masque of the Red Death]

[see 紅死病的面具 The Masque of the Red Death By Edgar Allan Poe]

Here is the first paragraph, in Chinese characters and in pinyin.

話說“紅死”在國內肆虐已久,象這般致命,這般可怕的瘟疫委實未曾有過。這病的具体表現和特征就是出血——一片殷紅,令人發指。患者初時感到劇痛,突然一陣頭昏眼花,于是全身毛孔大量出血喪命。只要患者的身上,特別是臉上一出現猩紅色斑點就是染上這瘟疫的預兆,這時諸親好友誰也不敢近身去救護他和慰問他。患者從得病到發病,一直到送命,還不消半小時工夫。

hua shuo “ hong si ” zai guo nei si nue yi jiu , xiang zhe ban zhi ming , zhe ban ke pa de wen yi wei shi wei ceng you guo 。 zhe bing de ju ti biao xian he te zheng jiu shi chu xie —— yi pian yin hong , ling ren fa zhi 。 huan zhe chu shi gan dao ju tong , tu ran yi zhen tou hun yan hua , yu shi quan shen mao kong da liang chu xie sang ming 。 zhi yao huan zhe de shen shang , te bie shi lian shang yi chu xian xing hong se ban dian jiu shi ran shang zhe wen yi de yu zhao , zhe shi zhu qin hao you shui ye bu gan jin shen qu jiu hu ta he wei wen ta 。 huan zhe cong de bing dao fa bing , yi zhi dao song ming , huan bu xiao ban xiao shi gong fu 。

Full text: masque_of_red_death_chinese_pinyin.txt

The conversion from Chinese characters to pinyin was done with xpinyin: https://github.com/lxneng/xpinyin

Pinyin and Keyboard Layout

Here we try to find out which keyboard layout is best for typing Chinese with a pinyin input method.

Pinyin heatmap on QWERTY layout

Pinyin heatmap on Dvorak layout

Pinyin heatmap on Colemak layout
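
One rough way to put a number on such a comparison is to ask what share of the typed letters falls on each layout's home row. The sketch below is only an illustration: the function name `home_row_share` and the choice of home-row share as the metric are assumptions made here, not necessarily how the heatmaps were produced. The home-row letter sets are the standard ones for each layout.

```python
from collections import Counter

# Letters on the home row of each layout (punctuation keys are left out).
HOME_ROW = {
    "QWERTY": set("asdfghjkl"),
    "Dvorak": set("aoeuidhtns"),
    "Colemak": set("arstdhneio"),
}

def home_row_share(letter_counts, layout):
    """Return the fraction of counted letters that sit on the layout's home row."""
    total = sum(letter_counts.values())
    if total == 0:
        return 0.0
    on_home = sum(n for letter, n in letter_counts.items()
                  if letter in HOME_ROW[layout])
    return on_home / total

# Tiny sample: the first four syllables of the pinyin text, spaces removed.
sample_counts = Counter("huashuohongsi")
for layout in HOME_ROW:
    print(f"{layout:8s} {home_row_share(sample_counts, layout):.2f}")
```

A sample this small says nothing about the layouts. For a meaningful comparison, feed in the letter counts of the full pinyin text. Home-row share also ignores finger travel and hand alternation, so treat it as one crude indicator among several.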

[see Dvorak Keyboard Layout]

Pinyin Letter Frequency Problem, the Removal of V

There is an interesting issue about v and ü in Chinese pinyin. Pinyin does not use the letter v, but it does have ü. Pinyin input systems, however, use the hack of typing v for ü, because ü is otherwise hard to type.

On Microsoft Windows's pinyin input, typing u also produces ü, but not on macOS.

So there is an interesting question when you compile statistics of pinyin letter frequency. Given a piece of Chinese text, you can convert it into pinyin, then compute the letter frequency. Done this way, you'll see zero use of v. However, this is not a proper statistic for the purpose of keyboard layout, because people do type v, while the statistic shows no use of the v key.
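
The counting step can be sketched in a few lines. This is a minimal illustration, not the script used for the chart on this page. The function names `letter_frequency` and `frequency_table` are made up here, and the input is assumed to be pinyin that has already been converted, without tone marks.

```python
from collections import Counter

def letter_frequency(pinyin_text):
    """Count the letters a-z, ignoring case, spaces, and punctuation."""
    return Counter(ch for ch in pinyin_text.lower() if "a" <= ch <= "z")

def frequency_table(counts):
    """Return (letter, count, percent) rows, most frequent first."""
    total = sum(counts.values())
    return [(letter, n, 100.0 * n / total)
            for letter, n in counts.most_common()]

# Start of the sample paragraph above.
sample = "hua shuo “ hong si ” zai guo nei si nue yi jiu"
counts = letter_frequency(sample)
for letter, n, pct in frequency_table(counts):
    print(f"{letter} {n:3d} {pct:5.1f}%")

print(counts["v"])  # 0: v never appears in converted pinyin
```

`Counter` returns 0 for missing keys, so the absence of v shows up directly as a zero count.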

To fix it, one needs to convert ü to v, then compute the statistics. But this may not be readily done, because the software that converts Chinese into pinyin will need to include tones in order to produce ü.
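
If the converter can be made to output ü, with or without tone marks, the mapping to typed keys is straightforward. Below is a sketch under that assumption. The function name `to_typed_keys` is hypothetical. It uses Unicode NFD decomposition, so that ü and its tone-marked forms (ǖ ǘ ǚ ǜ) all begin with u followed by the combining diaeresis U+0308.

```python
import unicodedata

def to_typed_keys(pinyin_text):
    """Map pinyin to the ASCII keys a typist presses, using v for ü."""
    decomposed = unicodedata.normalize("NFD", pinyin_text.lower())
    # After NFD, every form of ü is "u" + U+0308, optionally followed by a tone mark.
    with_v = decomposed.replace("u\u0308", "v")
    # Drop the remaining combining marks, which are the tone diacritics.
    return "".join(ch for ch in with_v if unicodedata.category(ch) != "Mn")

print(to_typed_keys("nǚ"))       # nv
print(to_typed_keys("lǜ sè"))    # lv se
print(to_typed_keys("hóng sǐ"))  # hong si
```

The result can then be fed into an ordinary letter count, which will now include the v key.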

But this “error” isn't too bad, because ü does not occur frequently in pinyin. I think it's mostly used only for the characters 女 and 綠.
