Annoying Invisible ZERO WIDTH NO-BREAK SPACE Character from Google Plus, Twitter

By Xah Lee.

These days, when you copy text from Google Plus or Twitter, you'll often get an invisible ZERO WIDTH NO-BREAK SPACE (aka BYTE ORDER MARK) (Unicode #65279, U+FEFF). If you write blogs, that's really annoying. It taints your blog. Later, when you apply regex to systematically process your site, a match may silently fail because of the invisible character. Even more common is the NO-BREAK SPACE (Unicode #160, U+00A0).
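To illustrate the silent failure, here is a small Python sketch. The sample string `pasted` is made up for the demonstration: it hides a U+FEFF inside the word "regex" and uses a U+00A0 in place of the space in "no break".

```python
import re

# Hypothetical pasted text: U+FEFF sits inside "regex",
# and U+00A0 stands where the space in "no break" should be.
pasted = "reg\ufeffex and no\u00a0break"

# Patterns typed with ordinary characters find nothing, and no error is raised.
found_word = re.search(r"regex", pasted)
found_phrase = re.search(r"no break", pasted)
print(found_word)
print(found_phrase)

# Listing the code points makes the hidden characters visible.
hidden = [(i, hex(ord(c))) for i, c in enumerate(pasted)
          if ord(c) in (0xFEFF, 0x00A0)]
print(hidden)
```

Both searches return `None`, even though the text looks right on screen. The last line reports the position and code point of each hidden character.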

So, i use this emacs lisp code to solve the problem:

Emacs Lisp: Replace Invisible Unicode Chars

You can write a {Perl, Python, Ruby, Bash} script to solve the problem. See:
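As one possibility, here is a minimal Python sketch of such a script. It is not the Emacs Lisp code mentioned above. The names `clean_invisible` and `clean_file` are made up for this example, and the policy is an assumption: delete every U+FEFF and turn every U+00A0 into an ordinary space. It also assumes the files are UTF-8.

```python
import sys

# Assumed policy for this sketch; adjust to taste.
REPLACEMENTS = {
    0xFEFF: None,  # ZERO WIDTH NO-BREAK SPACE (BOM): delete
    0x00A0: " ",   # NO-BREAK SPACE: replace with an ordinary space
}


def clean_invisible(text: str) -> str:
    """Return text with the characters in REPLACEMENTS removed or replaced."""
    return text.translate(REPLACEMENTS)


def clean_file(path: str) -> bool:
    """Clean one UTF-8 file in place. Return True if the file was changed."""
    # newline="" keeps the file's existing line endings untouched.
    with open(path, encoding="utf-8", newline="") as f:
        original = f.read()
    cleaned = clean_invisible(original)
    if cleaned == original:
        return False
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(cleaned)
    return True


if __name__ == "__main__":
    # Each command-line argument is treated as a file path.
    for path in sys.argv[1:]:
        if clean_file(path):
            print("cleaned:", path)
```

Saved as, say, `clean_invisible.py`, it could be run as `python clean_invisible.py post1.html post2.html`. Note that it rewrites files in place, so keep a backup or use version control.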

See also: Unicode BOM Byte Order Mark Hack