Xah Shorthand System (Abbrev Input)
Notes on creating a shorthand system for typing on a computer keyboard.
;; 2025-02-22

;; frequency within 100
("b" "but")
("d" "does")
("f" "for")
("h" "have")
("k" "know")
("l" "let")
("m" "more")
("n" "and")
("o" "of")
("r" "are")
("t" "the")
("u" "you")
("ab" "about")
("adi" "adding")
("aft" "after")
("ag" "again")
("agn" "against")
("ant" "another")
("ard" "already")
("bf" "before")
("bk" "become")
("bks" "because")
("bn" "been")
("bt" "between")
("btr" "better")
("cj" "change")
("cji" "changing")
("cn" "cannot")
("enth" "anything")
("enwhr" "anywhere")

;; HHHH------------------------------
;; frequency within 500
("gn" "gonna")
("otr" "other")

;; HHHH------------------------------
;; frequency within 1k
("enm" "anymore")
("evr" "every")
("fdi" "finding")
("fnl" "finally")
("hev" "however")
("hr" "here")
("hs" "has")
("hvi" "having")
("kd" "could")
("ktr" "control")
("ktrd" "controlled")
("lk" "like")
("lks" "likes")
("lt" "little")
("luk" "look")
("mb" "maybe")
("min" "minute")
("mk" "make")
("mtr" "matter")
("nd" "need")
("oft" "often")
("ov" "over")
("ow" "always")
("pb" "problem" xah--abhook)
("ph" "perhaps")
("pls" "please")
("pp" "people")
("pt" "point")
("rd" "read")
("rl" "really")
("rlz" "realize")
("rlzs" "realizes")
("rt" "return")
("sd" "should")
("sec" "second" xah--abhook)
("sm" "some")
("sth" "something")
("sti" "sitting")
("stm" "sometime")
("stms" "sometimes")
("td" "today")
("thi" "thing")
("thm" "them")
("ths" "these")
("thx" "thanks")
("tir" "their")
("tk" "think")
("tki" "thinking")
("tm" "time")
("tn" "then")
("tos" "those")
("tot" "thought")
("tr" "there")
("ts" "this")
("tt" "that")
("ty" "they")
("udst" "understand")
("udstd" "understanding")
("ur" "your")
("ursf" "yourself")
("usl" "usually")
("w" "with")
("wa" "what")
("wc" "which")
("wd" "would")
("whr" "where")
("wk" "work")
("wki" "working")
("wm" "woman")
("wme" "women")
("wn" "when")
("wo" "without")
("wr" "were")
("wt" "want" xah--abhook)
("yrs" "years")
;; ("smt" "smart")

;; HHHH------------------------------
;; most common phrases
("abi" "about it")
("atm" "at the moment")
("btw" "by the way")
("cnt" "can't")
("cdnt" "couldn't")
("ct" "can't")
("ddn" "did not")
("ddnt" "didn't")
("di" "does it")
("dn" "do not")
("dnt" "don't")
("dsn" "does not")
("dsnt" "doesn't")
("dunno" "don't know")
("hdu" "how do you")
("hn" "have not")
("hrr" "here are")
("hrs" "here's")
("hsn" "has not")
("hsnt" "hasn't")
("ht" "how to")
("hvnt" "haven't")
("hvt" "have to")
("hws" "how is")
("ic" "I see")
("idk" "I don't know")
("ii" "it is")
("il" "I will")
("im" "I'm")
("iow" "in other words")
("isb" "it should be")
("isnt" "isn't")
("itd" "it would")
("itl" "it will")
("itt" "is that")
("iv" "i've")
("ivt" "i have to")
("lka" "look at")
("ls" "let's")
("lss" "let's say")
("mto" "more than one")
("nw" "no way")
("oc" "of course")
("od" "one would")
("slt" "something like that")
("otoh" "on the other hand")
("pov" "point of view")
("rn" "are not")
("sdb" "should be")
("sdn" "shouldn't")
("sdnt" "shouldn't")
("sdv" "should have")
("sdvb" "should have been")
("tb" "to be")
("tis" "it is")
("tosr" "those are")
("trr" "there are")
("trs" "there is")
("tsr" "these are")
("tss" "this is")
("ttr" "that are")
("tts" "that is")
("tu" "thank you")
("twb" "that would be")
("tyr" "they are")
("uc" "you see")
("ul" "you'll")
("uv" "you've")
("wdnt" "wouldn't")
("wl" "we will")
("wnt" "won't")
("ws" "what is")
("wsnt" "wasn't")
("wtf" "what the fuck")
("wwr" "we were")
("afaik" "as far as i know")
("dfb" "difference between")
("iirc" "if i recall correctly")
("irl" "in real life")
("itfu" "in the following")
("tral" "there are a lot")
("tralo" "there are a lot of")
("tsb" "there should be")
("wrt" "with respect to")
("wtdb" "What's the difference between")
("otu" "of the")
("ot" "other than")
("imo" "in my opinion")

;; HHHH------------------------------
;; established abbrev
("1st" "first")
("2nd" "second")
("3rd" "third")
("eg" "e.g.")
("esp" "especially")
("ex" "example")
("ez" "easy")
("foto" "photo")
("ie" "i.e.")

;; HHHH------------------------------
;; top 2k to 3k
("stdi" "study")
("stdii" "studying")
("thms" "themselves")

;; HHHH------------------------------
;; english, long words
("abgs" "ambiguous")
("abmns" "abomination")
("absl" "absolute")
("absll" "absolutely")
("abst" "abstract")
("absts" "abstraction")
("abtr" "arbitrary")
("abtrl" "arbitrarily")
("abves" "abbreviation")
("adr" "address" xah--abhook)
("adv" "advice")
;; more
Xah Shorthand System, Intro
A keyboard-based shorthand system.
- Year 1837: the Pitman shorthand system, which is based on phonetics. It was designed to speed up handwriting.
- About 1910: steno, which is based on phonetics, plus chording hardware for the hands, plus some ad hoc shortcuts. 〔see Stenotype Machine〕
- Year 1960: the Shavian Alphabet 𐑕, based on the Pitman system, for fast handwriting. It's a phonetic alphabet. There are about 40 letters, and each represents a sound.
- I am slowly designing a modern shorthand system for the computer keyboard: one key press at a time, no chording.
- It is similar to word completion on smartphones, which is a statistics-based input system.
- Look at it this way: in typing studies, we have letter bigrams, meaning the most frequently used 2-letter sequences.
- We also have word bigrams, i.e. the most frequently used 2-word sequences. And in general, n-grams.
- The mobile phone input system saves typing at the level of phrases and sentences. That is an order of magnitude more than a phonetic system (such as steno).
- The first easy step, in this Xah system, is to take each word in the dictionary and shorten it by dropping vowels. Also, rectify it using phonetic spelling, e.g. for the letter c, use k or s.
- For example, “word” becomes w. “Time” and “times” both become t. “Communication” becomes kmnks.
- Now, do this for the top, say, one thousand most frequently used words.
- In this way, we will end up with many words having the same abbrev.
- Now, look at the frequency of those words, and pick the most frequently used one for that abbrev.
- This system, as it is, would already be very significant: we have one thousand abbrevs.
- After that, refine it by looking at the top 300 most used English words. For each that doesn't already have an abbrev, add one, devising some scheme so they are distinct from the phonetic abbrevs.
- Also, look at the most frequently used phrases, of 2 to 5 words. Similarly, devise a scheme to abbreviate them so they are differentiated from the phonetic abbrevs.
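The first steps above (drop vowels, respell c, then give each colliding abbrev to the most frequent word) can be sketched roughly as follows. This is a minimal Python sketch, not the actual procedure used to build the abbrev table. The function names `skeleton` and `assign_abbrevs`, the rule for choosing between k and s, and the toy corpus are all assumptions made for illustration. A purely mechanical pass gives “wrd” and “tm” for “word” and “time”, so the shorter forms in these notes still need hand-picking.

```python
from collections import Counter, defaultdict

VOWELS = set("aeiou")
# Assumption: "c" before e, i, or y is respelled "s", otherwise "k".
SOFT_C_FOLLOWERS = set("eiy")


def skeleton(word):
    """Mechanical first pass: respell c, drop vowels, collapse doubled letters."""
    word = word.lower()
    suffix = ""
    if word.endswith("tion"):
        # The notes map the suffix "tion" to "s".
        word, suffix = word[:-4], "s"
    out = []
    for i, ch in enumerate(word):
        if i > 0 and word[i - 1] == ch:
            continue  # doubled letter in the spelling, e.g. "mm"
        if ch == "c":
            nxt = word[i + 1] if i + 1 < len(word) else ""
            ch = "s" if nxt in SOFT_C_FOLLOWERS else "k"
        if ch not in VOWELS:
            out.append(ch)
    return "".join(out) + suffix


def assign_abbrevs(word_counts):
    """Give each skeleton to the most frequent word that produces it."""
    candidates = defaultdict(list)
    for word, count in word_counts.items():
        key = skeleton(word)
        if key and len(key) < len(word):  # skip empty or non-saving skeletons
            candidates[key].append((count, word))
    return {key: max(words)[1] for key, words in candidates.items()}


# Toy corpus, only to exercise the functions; real use needs a real frequency list.
corpus = "but the boat and the bat but not the boot but the boat"
table = assign_abbrevs(Counter(corpus.split()))
print(skeleton("communication"))  # kmnks
print(table["bt"])                # but
```

In the toy corpus, “bat”, “boat”, “boot”, and “but” all collapse to “bt”, and the most frequent of them wins the abbrev. Digraphs such as “ch” and “th” are not handled here.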
General Scheme on Xah Shorthand System
Some general schemes for abbrev design. This is written and edited as I go, over the years of casually working on the Xah abbrev system.
- general scheme
- use phonetic spelling, as much as possible.
- when there are several alternatives, then words ending
- ai • 𐑲 • /aɪ/ • ride
- au • 𐑬 • /aʊ/ • loud
- a • 𐑨 • /æ/ • ash
- a • 𐑩 • /ə/ • ado
- a • 𐑪 • /ɒ/ • on
- a • 𐑭 • /ɑː/ • ah
- a • 𐑳 • /ʌ/ • up
- b • 𐑚 • /b/ • bib
- c • 𐑗 • /tʃ/ • church
- d → past tense suffix. e.g. walked
- d • 𐑛 • /d/ • dead
- e • 𐑧 • /ɛ/ • etch
- e • 𐑱 • /eɪ/ • age
- f • 𐑓 • /f/ • fee
- g • 𐑜 • /ɡ/ • gag
- h? • 𐑣 • /h/ • haha
- h • 𐑙 • /ŋ/ • hung
- i → ing
- i • 𐑦 • /ɪ/ • if
- i • 𐑰 • /iː/ • eat
- j? • 𐑠 • /ʒ/ • measure
- j? • 𐑡 • /dʒ/ • judge
- k • 𐑒 • /k/ • kick
- l → ly
- l • 𐑤 • /l/ • loll
- m → ment
- m • 𐑥 • /m/ • mime
- n • 𐑯 • /n/ • nun
- o • 𐑴 • /oʊ/ • oak
- o • 𐑶 • /ɔɪ/ • oil
- o • 𐑷 • /ɔː/ • awe
- p • 𐑐 • /p/ • peep
- q → ?
- r →
- r • 𐑮 • /ɹ/ • roar
- s → suffix tion
- s? • 𐑖 • /ʃ/ • sure
- sv → suffix sive
- s • 𐑕 • /s/ • so
- th? • 𐑞 • /ð/ • they
- th • 𐑔 • /θ/ • thigh
- t • 𐑑 • /t/ • tot
- u → suffix for phrase.
- v • 𐑝 • /v/ • vow
- w • 𐑢 • /w/ • woe
- w • 𐑫 • /ʊ/ • wool
- w • 𐑵 • /uː/ • ooze
- x → ex, eks
- y • 𐑘 • /j/ • yea
- z • 𐑟 • /z/ • zoo
- ing → i
- tion → s
- sive → sv
- ment → m
- ly → l
- ble → b
- ed → d
- church → c
- judge → j
- thing → th
- unused letters a e i o u
- questionable letters x g q
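The suffix rules above (ing → i, tion → s, sive → sv, ment → m, ly → l, ble → b, ed → d) can be applied mechanically on top of an existing stem abbrev. The following is a minimal Python sketch under stated assumptions: the names `derive_abbrev` and `stem_candidates` are hypothetical, the stem recovery (try the bare stem, the stem plus a silent e, and the stem with a doubled final letter removed) is a rough guess at English spelling, and the base entries are copied from the abbrev table in these notes.

```python
# Suffix codes from the scheme in these notes, longest suffixes first.
SUFFIX_CODES = [
    ("tion", "s"),
    ("sive", "sv"),
    ("ment", "m"),
    ("ing", "i"),
    ("ble", "b"),
    ("ly", "l"),
    ("ed", "d"),
]

# Stem abbrevs copied from the abbrev table, stored as word -> abbrev.
BASE_ABBREVS = {
    "think": "tk",
    "work": "wk",
    "change": "cj",
    "control": "ktr",
    "absolute": "absl",
    "study": "stdi",
}


def stem_candidates(stem):
    """Guess spellings of the base word (an assumption, not a full stemmer)."""
    yield stem                      # thinking -> think
    yield stem + "e"                # changing -> change
    if len(stem) >= 2 and stem[-1] == stem[-2]:
        yield stem[:-1]             # controlled -> control


def derive_abbrev(word, base=BASE_ABBREVS):
    """Return stem abbrev + suffix code, or None if no rule applies."""
    for suffix, code in SUFFIX_CODES:
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            for candidate in stem_candidates(stem):
                if candidate in base:
                    return base[candidate] + code
    return None


for w in ["thinking", "changing", "controlled", "absolutely", "studying"]:
    print(w, derive_abbrev(w))
```

The derived forms agree with the corresponding table entries (tki, cji, ktrd, absll, stdii). Words whose stem is not in the base table return `None` and would need a hand-made abbrev.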
Problem with Using Phonetics
English has 44 phonemes. For example,
Consonant Phonemes:
- /p/ - as in "pat"
- /b/ - as in "bat"
- /t/ - as in "tap"
- /d/ - as in "dog"
- /k/ - as in "cat"
Vowel Phonemes:
- /i/ - as in "seat"
- /ɪ/ - as in "sit"
- /e/ - as in "set"
- /ɛ/ - as in "let"
If we use a phonetic alphabet as an abbrev system, we need 3 keys for each of words like
- bat
- boat
- boot
which means very little saving of keystrokes.
In some cases it means more keystrokes, because we need to distinguish vowels.
So, a purely phoneme-based abbrev system won't work.
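The keystroke argument can be checked with a little arithmetic. The Python sketch below assumes one key press per phoneme for the phonetic system, counts 3 phonemes each for bat, boat, and boot, and compares that against three abbrevs taken from the table in these notes. The function name `saving_ratio` is hypothetical, and the two word sets differ, so this illustrates the scale of the saving only.

```python
def saving_ratio(pairs):
    """pairs: (word, keys_typed). Returns the fraction of keystrokes saved."""
    full = sum(len(word) for word, _ in pairs)
    typed = sum(keys for _, keys in pairs)
    return 1 - typed / full


# One key per phoneme (assumption): each of these words has 3 phonemes.
phonetic = [("bat", 3), ("boat", 3), ("boot", 3)]

# Abbrevs copied from the table: bks, bt, udst.
abbrev = [("because", len("bks")), ("between", len("bt")), ("understand", len("udst"))]

print(round(saving_ratio(phonetic), 3))  # 0.182
print(round(saving_ratio(abbrev), 3))    # 0.625
```

For these short words, phoneme-per-key typing saves 2 of 11 keystrokes, while the frequency-based abbrevs save 15 of 24 on their words.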
Xah Talk Show 2022-02-02