How to convert a file encoding in Python?
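# -*- coding: utf-8 -*-
# python

path = 'infile.html'
path2 = 'outfile.html'

# read the raw bytes and decode them from the source encoding
with open(path, 'rb') as f:
    content = f.read().decode('gb18030')

# encode the text as UTF-8 and write the raw bytes
with open(path2, 'wb') as f:
    f.write(content.encode('utf-8'))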
(thanks to Andrew Clover for help.)
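The script above reads the whole file into memory. For large files, a streaming variant may be preferable. The following is a minimal Python 3 sketch, not a definitive implementation: the function name convert_encoding, its parameters, and the default chunk size are hypothetical, chosen only for illustration. It assumes the source encoding is known in advance.

```python
def convert_encoding(src_path, dst_path, from_enc, to_enc="utf-8",
                     chunk_size=64 * 1024):
    """Copy src_path to dst_path, re-encoding text from from_enc to to_enc.

    Hypothetical helper for illustration. The file is processed in chunks
    of chunk_size characters, so it is never fully held in memory.
    """
    # newline="" turns off newline translation on both sides,
    # so the original line endings (\n or \r\n) are preserved.
    with open(src_path, "r", encoding=from_enc, newline="") as src, \
         open(dst_path, "w", encoding=to_enc, newline="") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
```

A call such as convert_encoding('infile.html', 'outfile.html', 'gb18030') would then do the same job as the script above. If the input contains bytes that are invalid in the source encoding, open raises UnicodeDecodeError by default.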
How to convert a file encoding in Perl?
Use the shell utility piconv.
piconv is installed by Perl and written in Perl. You can also look at the code to see how it's done.
◆ piconv
piconv [-f from_encoding] [-t to_encoding] [-s string] [files...]
piconv -l
piconv -r encoding_alias

  -l,--list
     lists all available encodings
  -r,--resolve encoding_alias
     resolve encoding to its (Encode) canonical name
  -f,--from from_encoding
     when omitted, the current locale will be used
  -t,--to to_encoding
     when omitted, the current locale will be used
  -s,--string string
     "string" will be the input instead of STDIN or files

The following are mainly of interest to Encode hackers:
  -D,--debug          show debug information
  -C N | -c           check the validity of the input
  -S,--scheme scheme  use the scheme for conversion

Those are handy when you can only see ascii characters:
  -p,--perlqq
  --htmlcref
  --xmlcref
For converting charset encodings in Perl, you need the Encode module. It has been bundled with Perl since v5.8.
See also: Perl Unicode Tutorial 🐫
See also: 〔How can I convert an input file to UTF-8 encoding in Perl? By Brian D Foy. @ stackoverflow.com…〕
The GNU command line tool “iconv” does character encoding conversion. Example:
iconv -f utf-16 -t utf-8 file1.txt > file2.txt
Use iconv -l for a list of encodings.
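When the conversion is scripted in Python rather than run through iconv, it can help to check an encoding name before using it. The sketch below uses the standard library's codecs.lookup to resolve a name or alias to Python's canonical codec name. The helper name canonical_encoding_name is hypothetical. Note that Python's codec names are its own and are not guaranteed to match the names printed by iconv -l.

```python
import codecs

def canonical_encoding_name(name):
    """Return Python's canonical codec name for name, or None if unknown.

    Hypothetical helper for illustration.
    """
    try:
        return codecs.lookup(name).name
    except LookupError:
        # codecs.lookup raises LookupError for names Python does not know
        return None

# Print how a few candidate names resolve.
for candidate in ("utf8", "latin1", "gb18030", "no-such-encoding"):
    print(candidate, "->", canonical_encoding_name(candidate))
```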
If you use Emacs, you can open the file, then call set-buffer-file-coding-system with a value such as “utf-8” or “utf-16” (press Tab ↹ to see available choices), then save the file.
〔☛ Emacs File/Character Encoding/Decoding FAQ〕
〔☛ Emacs ＆ Unicode Tips〕