Java Tutorial: Unicode in Java

By Xah Lee.

One great thing about Java is that it is Unicode-based. That means you can use characters from writing systems other than the English alphabet (for example, Chinese characters or Greek letters such as π), not just in string data, but in class, method, and variable names too.

Here's example code that uses Unicode characters in class names and variable names.

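// a class whose name is a single Chinese character
class 方 {
    String 北 = "north";   // field named with a Chinese character
    double π = 3.14159;    // field named with a Greek letter
}

class UnicodeTest {
    public static void main(String[] arg) {
        方 x1 = new 方();
        System.out.println( x1.北 );
        System.out.println( x1.π );
    }
}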
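
Not every Unicode character is allowed in a name, though. As a rough way to check a candidate name, the sketch below uses the standard Character.isJavaIdentifierStart and Character.isJavaIdentifierPart methods. The class name IdentifierCheck and the method isIdentifier are made up for this illustration, and the check looks only at the characters, so it does not reject reserved words such as “int”. The test strings are written with hexadecimal escape codes (explained in the next part of this tutorial) only so that the file compiles whatever encoding it is saved in.

```java
class IdentifierCheck {
    // Returns true if every character of s is allowed at its position
    // in a Java identifier. Reserved words are NOT rejected here.
    static boolean isIdentifier(String s) {
        if (s.isEmpty()) return false;
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) return false;
        // Step by code point so characters outside the 16-bit range are handled too.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) return false;
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isIdentifier("\u03c0"));  // U+03C0, Greek small letter pi: true
        System.out.println(isIdentifier("\u65b9"));  // U+65B9, a Chinese character: true
        System.out.println(isIdentifier("x1"));      // true
        System.out.println(isIdentifier("1x"));      // starts with a digit: false
        System.out.println(isIdentifier("\u2211"));  // U+2211, the summation sign: false
    }
}
```

Letters from any script pass, but operator-like symbols such as the summation sign do not, so π works as a name only because it is a Greek letter.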

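Any character in source code can also be written as a Unicode escape: \u followed by the character's 4-digit hexadecimal code. The compiler translates these escapes before it does anything else with the source, so they work anywhere, including inside keywords, identifiers, and punctuation.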

class TestUniEsp \u007b
    static \u0069nt \u611b = 3;
    public static void main(String[] arg) {
        System.out.println( \u611b );
    }
}

In the example above, \u007b is the left curly brace “{”, \u0069 is the lowercase letter “i”, and \u611b is the Chinese character “愛” (meaning love).
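
Because escapes are translated before the compiler looks at anything else, an escape inside a string literal becomes one character, not six. Here is a minimal sketch to confirm that. The class name EscapeDemo and its helper methods escaped and value are hypothetical names chosen for this example.

```java
class EscapeDemo {
    // An identifier written as an escape: the field's name is the single character U+611B.
    static int \u611b = 3;

    // Returns a one-character string; the escape is replaced before the literal is read.
    static String escaped() {
        return "\u611b";
    }

    // Reads the field back, again spelling its name as an escape.
    static int value() {
        return \u611b;
    }

    public static void main(String[] args) {
        String s = escaped();
        System.out.println(s.length());                        // prints 1
        System.out.println(Integer.toHexString(s.charAt(0)));  // prints 611b
        System.out.println(value());                           // prints 3
    }
}
```

Since this file contains only ASCII characters, it compiles the same way no matter what encoding the editor or compiler assumes. That is one practical reason to use escapes.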

Reference

The Java Language Specification, Java SE 8 Edition, Chapter 3 “Lexical Structure”: http://docs.oracle.com/javase/specs/jls/se8/html/jls-3.html

For a good discussion of exactly which characters are allowed in identifiers, see “languages with full Unicode support” by David Hopwood: http://groups.google.com/group/comp.lang.perl.misc/msg/8f6aa81c992a22cc