Emacs: Convert Chinese/Japanese Punctuations Full-Width/Half-Width
This page shows commands to convert to/from Full-Width/Half-Width characters. (全角 半角 转换)
If you type Chinese or Japanese mixed with English, you'll often end up with mixed Asian and Western punctuation, which is hard to fix manually.
Examples of Asian punctuation:
- 。 U+3002: IDEOGRAPHIC FULL STOP
- , U+FF0C: FULLWIDTH COMMA
- ? U+FF1F: FULLWIDTH QUESTION MARK
- ; U+FF1B: FULLWIDTH SEMICOLON
[see Unicode Full-Width Characters]
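Full-width and half-width punctuation can look alike in some fonts. To check which character you actually have, you can ask Emacs for its code point and Unicode name. The following is a minimal sketch: the helper name `my-describe-char-briefly` is made up for this example, while `get-char-code-property` is a built-in function.

```elisp
;; Hypothetical helper for illustration. Not part of the commands on this page.
(defun my-describe-char-briefly (char)
  "Return a string like \"U+FF0C: FULLWIDTH COMMA\" for CHAR."
  (format "U+%04X: %s" char (get-char-code-property char 'name)))

;; Code points are written in hex so the example does not depend on
;; how the punctuation itself is displayed.
(mapcar #'my-describe-char-briefly '(#x3002 #xFF0C #xFF1F #xFF1B))
;; → ("U+3002: IDEOGRAPHIC FULL STOP" "U+FF0C: FULLWIDTH COMMA"
;;    "U+FF1F: FULLWIDTH QUESTION MARK" "U+FF1B: FULLWIDTH SEMICOLON")
```

Interactively, the built-in command `describe-char` gives the same information for the character under the cursor.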
Convert English Chinese Punctuation
(defun xah-convert-english-chinese-punctuation (Begin End &optional ToDirection)
  "Convert punctuation from/to English/Chinese characters.

When called interactively, do current line or selection. The conversion direction is automatically determined.

If `universal-argument' is called, ask user for change direction.

When called in lisp code, Begin End are region begin/end positions. ToDirection must be any of the following values: 「\"chinese\"」, 「\"english\"」, 「\"auto\"」.

URL `http://xahlee.info/emacs/emacs/elisp_convert_chinese_punctuation.html'
Version: 2012-12-10 2022-05-18"
  (interactive
   (let ($p1 $p2)
     (if (use-region-p)
         (setq $p1 (region-beginning) $p2 (region-end))
       (setq $p1 (line-beginning-position) $p2 (line-end-position)))
     (list
      $p1
      $p2
      (if current-prefix-arg
          (completing-read
           "Change to: "
           '("english" "chinese")
           nil
           "REQUIRE-MATCH")
        "auto"))))
  (let* ($containChinese
         ($p1 Begin)
         ($p2 End)
         ($inputStr (buffer-substring-no-properties $p1 $p2))
         ($engToChinesePairs
          [
           [". " "。"]
           [".\n" "。\n"]
           [", " ","]
           [",\n" ",\n"]
           [": " ":"]
           ["; " ";"]
           ["? " "?"] ; no space after
           ["! " "!"]
           ["& " "&"]
           [" (" "("]
           [") " ")"]
           ;; for inside HTML
           [".</" "。</"]
           ["?</" "?</"]
           [":</" ":</"]
           [" " "\u3000"] ; ideographic space (U+3000)
           ]))
    (setq $containChinese
          (seq-some
           (lambda (x) (string-match (aref x 1) $inputStr))
           $engToChinesePairs))
    (when (string-equal ToDirection "auto")
      (setq ToDirection
            (if $containChinese "english" "chinese")))
    (save-restriction
      (narrow-to-region $p1 $p2)
      (mapc
       (lambda ($x)
         (progn
           (goto-char (point-min))
           (while (search-forward (aref $x 0) nil t)
             (replace-match (aref $x 1)))))
       (cond
        ((string-equal ToDirection "chinese") $engToChinesePairs)
        ((string-equal ToDirection "english")
         (mapcar (lambda (x) (vector (elt x 1) (elt x 0))) $engToChinesePairs))
        (t (user-error "Your 3rd argument 「%s」 isn't valid" ToDirection))))
      (goto-char (point-max)))))
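The core of the command is an ordered list of literal search-and-replace pairs, applied one after another, with the pairs swapped to go in the other direction. Here is a simplified sketch of that idea working on a string instead of a buffer region. The helper name `my-replace-pairs-in-string` and the short pair list are made up for this example; they are not the command's actual table.

```elisp
;; Hypothetical helper for illustration: apply (FROM . TO) pairs in order.
(defun my-replace-pairs-in-string (str pairs)
  "Return STR with each (FROM . TO) in PAIRS replaced literally, in order."
  (with-temp-buffer
    (insert str)
    (dolist (pair pairs)
      (goto-char (point-min))
      (while (search-forward (car pair) nil t)
        ;; t t = do not adjust case, insert the replacement literally.
        (replace-match (cdr pair) t t)))
    (buffer-string)))

;; A small sample table: English punctuation plus space → full-width mark.
(defvar my-eng-to-chinese-sample
  '((". " . "。") (", " . ",") ("? " . "?") ("! " . "!")))

(my-replace-pairs-in-string "Hello, world. Is it ok? Yes! "
                            my-eng-to-chinese-sample)
;; → "Hello,world。Is it ok?Yes!"

;; Reverse direction: swap each pair.
(my-replace-pairs-in-string
 "你好,世界。"
 (mapcar (lambda (p) (cons (cdr p) (car p))) my-eng-to-chinese-sample))
;; → "你好, 世界. "
```

Order matters: each pair is applied to the whole text before the next one starts, so an earlier pair can consume text that a later pair would otherwise match.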
Remove Punctuation Trailing Redundant Spaces
Here's a helpful command to remove redundant spaces after punctuation.
- In English text, the convention is to have 1 space after punctuation (sometimes 2, after the Full Stop sign).
- In Chinese text, the convention is to have no space after punctuation.
(defun xah-remove-punctuation-trailing-redundant-space (Begin End)
  "Remove redundant whitespace after punctuation.
Works on current line or text selection.

When called in emacs lisp code, the Begin End are cursor positions for region.

See also `xah-convert-english-chinese-punctuation'.

URL `http://xahlee.info/emacs/emacs/elisp_convert_chinese_punctuation.html'
Version: 2015-08-22"
  (interactive
   (if (use-region-p)
       (list (region-beginning) (region-end))
     (list (line-beginning-position) (line-end-position))))
  (require 'xah-replace-pairs)
  (xah-replace-regexp-pairs-region
   Begin End
   [
    ;; clean up. Remove extra space.
    [" +," ","]
    [", +" ", "]
    ["? +" "? "]
    ["! +" "! "]
    ["\\. +" ". "]
    ;; fullwidth punctuations
    [", +" ","]
    ["。 +" "。"]
    [": +" ":"]
    ["? +" "?"]
    ["; +" ";"]
    ["! +" "!"]
    ["、 +" "、"]
    ]
   t t))
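The command depends on the separate `xah-replace-pairs` package. If you only want to see the two conventions in action, here is a rough sketch using only the built-in `replace-regexp-in-string`, working on a string. The function name `my-squeeze-space-after-punctuation` is made up for this example, and the character classes are a condensed approximation of the pairs in the command, not an exact copy.

```elisp
;; Hypothetical helper for illustration. Uses only built-in functions.
(defun my-squeeze-space-after-punctuation (str)
  "Return STR with redundant spaces around punctuation removed."
  (let ((result str))
    ;; Remove spaces before an English comma.
    (setq result (replace-regexp-in-string " +," "," result))
    ;; English convention: exactly one space after , ? ! .
    (setq result (replace-regexp-in-string "\\([,?!.]\\) +" "\\1 " result))
    ;; Chinese convention: no space after full-width punctuation.
    (setq result (replace-regexp-in-string "\\([,。:?;!、]\\) +" "\\1" result))
    result))

(my-squeeze-space-after-punctuation "Hi ,   there.    你好,  世界。 ok")
;; → "Hi, there. 你好,世界。ok"
```

Inside a bracket expression such as `[,?!.]`, the characters `?` and `.` are literal, so no backslash escaping is needed there.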
These commands are useful for Twitter too, for saving a few characters against Twitter's character limit. English punctuation takes 2 characters each (the punctuation mark plus the space after it), while the Chinese version needs just 1 character, because the space is included in the punctuation symbol.
Convert Half-Width Full-Width Characters
Emacs: Convert Full-Width/Half-Width Characters