Perl: Unicode Tutorial 🐪
unicode intro
utf8 encoded perl source code, use utf8
If your Perl script is encoded in UTF-8, you need to declare it, like this:
use utf8;
To process Unicode strings, you need at least the following:
use v5.14;
use strict;
use utf8;
# processing Unicode string
my $s = 'I ★ you';
$s =~ s/★/♥/;
print "$s\n";
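To see what the declaration changes, here is a minimal sketch that counts the characters of the same literal with and without `use utf8`. It assumes the file itself is saved as UTF-8; the helper name `char_count` is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: how many characters does Perl see?
sub char_count {
    my ($s) = @_;
    return length $s;
}

my ($with, $without);
{
    use utf8;    # the literal is decoded, so ★ counts as one character
    $with = char_count('I ★ you');
}
{
    no utf8;     # the literal stays raw bytes, so ★ counts as three bytes
    $without = char_count('I ★ you');
}

print "with use utf8: $with\n";          # 7
print "without use utf8: $without\n";    # 9
```

The pragma is lexically scoped, which is why the two blocks can disagree inside one file.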
Jamie W Zawinski on how perl unicode sucks
use bytes;
# Larry can take Unicode and shove it up his ass sideways.
# Perl 5.8.0 causes us to start getting incomprehensible
# errors about UTF-8 all over the place without this.
From the source code of WebCollage (1998), by Jamie W Zawinski (born 1968).
When you run a script that processes Unicode, pass the -C option on the command line, for example: perl -C myscript.pl (where myscript.pl stands for your own script).
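A script can also check whether it was started with -C and set up the standard handles itself if not. The sketch below reads the `${^UNICODE}` variable, which holds the -C settings as a bitmask (documented in perlrun and perlvar). The helper name `std_handles_are_utf8` is made up for illustration, and the check is a simplification: it ignores the locale-dependent L flag, and the `:encoding(UTF-8)` layer it applies is stricter than the layer -C uses.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the I, O and E bits (values 1, 2, 4) are all set,
# which is what -CS, or the S in PERL_UNICODE, asks for.
sub std_handles_are_utf8 {
    my ($flags) = @_;
    return ($flags & 7) == 7;
}

unless (std_handles_are_utf8(${^UNICODE})) {
    # Started without -C (or without the S flag): set the layers by hand.
    binmode($_, ':encoding(UTF-8)') for \*STDIN, \*STDOUT, \*STDERR;
}

print "I \x{2665} you\n";
```

Either way, the final print should no longer trigger a "Wide character" warning.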
Perl Unicode tips from Tom Christiansen
Here are some Unicode tips, gathered from Tom Christiansen's answer at 〔Why does modern Perl avoid UTF-8 by default? http://stackoverflow.com/questions/6162484/why-does-modern-perl-avoid-utf-8-by-default/6163129#6163129〕.
• Set your PERL_UNICODE environment variable to AS. This makes all Perl scripts decode @ARGV as UTF‑8 strings, and sets the encoding of all three of stdin, stdout, and stderr to UTF‑8. Both these are global effects.
• Enable warnings, and promote UTF-8 warnings to fatal errors.
use warnings;
use warnings qw( FATAL utf8 );
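• Declare that anything that opens a filehandle within this lexical scope should assume the stream is UTF‑8 unless told otherwise. The :std part applies the same encoding to stdin, stdout, and stderr.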
use open qw( :encoding(UTF-8) :std );
• If you have a DATA handle, you must explicitly set its encoding. If you want this to be UTF‑8, then say: binmode(DATA, ":encoding(UTF-8)");
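As a rough illustration of the warnings and `use open` tips together, the sketch below writes a string to a temporary file and reads it back without naming an encoding on either open call. The helper name `round_trip` is made up for illustration, and File::Temp is used only to get a scratch file name.

```perl
use v5.14;                               # also turns on strict and say
use utf8;                                # this source file is UTF-8
use warnings;
use warnings qw( FATAL utf8 );           # malformed UTF-8 dies instead of warning
use open qw( :encoding(UTF-8) :std );    # default layers for open() and the std handles
use File::Temp qw( tempfile );

# Hypothetical helper: write text to a scratch file and read it back,
# relying only on the default layers set by "use open" above.
sub round_trip {
    my ($text) = @_;
    my (undef, $path) = tempfile(UNLINK => 1);

    open(my $out, '>', $path) or die "cannot write $path: $!";
    print {$out} $text;
    close($out) or die "cannot close $path: $!";

    open(my $in, '<', $path) or die "cannot read $path: $!";
    my $back = do { local $/; <$in> };   # slurp the whole file
    close($in);

    return $back;
}

my $s    = "I ♥ you";
my $back = round_trip($s);
say $back eq $s ? "round trip preserved the string" : "round trip changed the string";
```

Because the pragma is lexical, open calls made inside other modules keep whatever layers those modules chose.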