Starting with Python version 2.4, you can use Python to change a file's character encoding. Here's an example that converts a file from GB18030 to UTF-8 (it relies on the Python 2 built-in unicode):
# -*- coding: utf-8 -*-
# python

path = 'infile.html'
path2 = 'outfile.html'

f = open(path, 'rb')
content = unicode(f.read(), 'gb18030')
f.close()

f = open(path2, 'wb')
f.write(content.encode('utf-8'))
f.close()
(thanks to Andrew Clover for help.)
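The unicode built-in used above does not exist in Python 3. As a minimal sketch, assuming Python 3 and a file small enough to read into memory, the same conversion could look like the following. The function name convert_encoding and its parameter names are made up for illustration; they are not part of any library.

```python
def convert_encoding(src_path, dst_path, src_encoding, dst_encoding="utf-8"):
    """Read src_path as src_encoding and write it to dst_path as dst_encoding.

    Hypothetical helper for illustration. The whole file is read into memory.
    """
    # newline="" turns off newline translation, so the original line
    # endings (\n or \r\n) are carried over unchanged.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        content = src.read()
    with open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        dst.write(content)
```

With the file names from the example above, the call would be convert_encoding('infile.html', 'outfile.html', 'gb18030'). If the input is not valid in the source encoding, open/read raises UnicodeDecodeError rather than silently writing bad output.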
Perl installs a character encoding converter script at /usr/bin/piconv. Type piconv for help. You can also read the script's source code to see how it's done.
For converting character encodings in Perl, you need the Encode module. It has been bundled with Perl since v5.8.6 or earlier. In general, for Perl with Unicode support, see: Unicode in Perl & Python.
See: 〔How can I convert an input file to UTF-8 encoding in Perl? By Brian D Foy. @ stackoverflow.com…〕
The GNU command line tool “iconv” does character encoding conversion. Example: iconv -f utf-16 -t utf-8 file1.txt > file2.txt. Use iconv -l for a list of encodings.
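If iconv is not available, a rough Python 3 equivalent of the command above can be sketched as follows. It copies the text in chunks, so it also works on files too large to read at once. The function name convert_stream and the 64 KiB chunk size are arbitrary choices for illustration, and the encoding names are Python's codec names, which do not always match the ones printed by iconv -l.

```python
import shutil


def convert_stream(src_path, dst_path, src_encoding, dst_encoding, errors="strict"):
    """Convert a text file from src_encoding to dst_encoding in chunks.

    Hypothetical helper for illustration. errors is passed to the decoder;
    "strict" raises UnicodeDecodeError on invalid input.
    """
    # newline="" keeps the original line endings on both sides.
    with open(src_path, "r", encoding=src_encoding, errors=errors, newline="") as src, \
         open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        # copyfileobj reads and writes up to 64 * 1024 characters at a time.
        shutil.copyfileobj(src, dst, 64 * 1024)
```

The counterpart of the iconv example would be convert_stream('file1.txt', 'file2.txt', 'utf-16', 'utf-8').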
If you use Emacs, you can open the file, call set-buffer-file-coding-system with a value such as “utf-8” or “utf-16”, then save the file.
See: Emacs and Unicode Tips.