Java: Unicode in Source Code

By Xah Lee.

File Encoding

When saving the file, make sure you save it in UTF-8 encoding. Your editor should have an option to do so. [see Set Text Editor File Encoding]

If the compiler reports encoding errors, tell it the source encoding explicitly:

javac -encoding UTF8 filename

[see Unicode Basics: Character Set, Encoding, UTF-8]
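To see why the encoding matters, here is a minimal sketch that encodes one character as UTF-8 bytes and then decodes the same bytes with a different charset. The class name EncodingMismatch and the method reinterpret are made up for this illustration. It does not reproduce what javac does internally; it only shows how the same bytes turn into different text when read with the wrong encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingMismatch {
    // Encode text as UTF-8 bytes, then decode those bytes with the given charset.
    // (Hypothetical helper, for illustration only.)
    static String reinterpret(String text, Charset charset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, charset);
    }

    public static void main(String[] args) {
        // GREEK SMALL LETTER PI, U+03C0, written as an escape so this file stays pure ASCII
        String pi = "\u03c0";

        // In UTF-8 this one character takes 2 bytes
        System.out.println(pi.getBytes(StandardCharsets.UTF_8).length); // 2

        // Decoding with the right charset gives the original text back
        System.out.println(reinterpret(pi, StandardCharsets.UTF_8).equals(pi)); // true

        // Decoding as ISO-8859-1 turns each byte into its own character
        System.out.println(reinterpret(pi, StandardCharsets.ISO_8859_1).length()); // 2
    }
}
```

If the compiler reads a UTF-8 file with the wrong encoding, non-ASCII identifiers and string literals get garbled in the same way, which is the problem the -encoding option is meant to avoid.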

Identifier Names with Unicode Characters

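Class, method, and variable names can contain non-ASCII characters, such as Greek letters or Chinese characters. Roughly, each character must be a letter, digit, currency symbol, or connecting punctuation such as underscore, and the first character must not be a digit. Most math symbols (such as ∞) and emoji are not allowed in names.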

class X {
    public static void main(String[] arg) {

        String 北 = "north😸";
        double π = 3.1;

        System.out.println( 北 );
        System.out.println( π );
    }
}
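To check whether a given name would be accepted, the standard methods Character.isJavaIdentifierStart and Character.isJavaIdentifierPart can be used. The following is a sketch only: the class name IdentifierCheck and the method isValidIdentifier are made up for this example, and it does not reject reserved words such as int. The test strings are written with \u escapes (see the section on escape syntax) so the file compiles regardless of its encoding.

```java
class IdentifierCheck {
    // Returns true if every code point in s is legal at its position in a Java identifier.
    // Assumption: reserved words (such as "int") are not checked here.
    static boolean isValidIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Step by code point, not by char, so characters above U+FFFF are handled
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidIdentifier("\u03c0"));       // Greek pi: true
        System.out.println(isValidIdentifier("\u5317"));       // Chinese "north": true
        System.out.println(isValidIdentifier("\u03b11"));      // alpha followed by 1: true
        System.out.println(isValidIdentifier("\u221e"));       // infinity sign: false
        System.out.println(isValidIdentifier("\uD83D\uDE38")); // cat emoji U+1F638: false
        System.out.println(isValidIdentifier("1x"));           // starts with a digit: false
    }
}
```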

Unicode Character Escape Syntax

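Any character in source code can also be written as its Unicode codepoint: \u followed by exactly 4 hexadecimal digits. The compiler translates these escapes before it reads anything else, so they work anywhere in the file, including inside keywords and identifiers. A character above U+FFFF does not fit in 4 digits and is written as two escapes (a surrogate pair).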

class X2 \u007b
    static \u0069nt \u03b1 = 3;
    public static void main(String[] arg) {
        System.out.println( \u03b1 );
    }
}

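In the above example, \u007b is the left curly bracket {, \u0069 is the letter i (so \u0069nt is the keyword int), and \u03b1 is the Greek letter α.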
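The same escapes work inside string literals. Here is a small sketch (the class name EscapeDemo and its constants are made up for this example) showing a single escape for a character in the 4-digit range, and a surrogate pair for a character above U+FFFF.

```java
class EscapeDemo {
    // GREEK SMALL LETTER ALPHA, U+03B1, fits in one 4-digit escape
    static final String ALPHA = "\u03b1";

    // U+1F638 is above U+FFFF, so it is written as a surrogate pair (two escapes)
    static final String CAT = "\uD83D\uDE38";

    public static void main(String[] args) {
        System.out.println(ALPHA.length());                            // 1
        System.out.println(Integer.toHexString(ALPHA.codePointAt(0))); // 3b1

        // length() counts 16-bit chars, so the surrogate pair counts as 2
        System.out.println(CAT.length());                              // 2
        // codePointCount() counts whole characters
        System.out.println(CAT.codePointCount(0, CAT.length()));       // 1
        System.out.println(Integer.toHexString(CAT.codePointAt(0)));   // 1f638
    }
}
```

Because escapes are translated so early, they are also processed inside comments, so an escape for a line break or a malformed escape in a comment can break the compile.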

To find a Unicode character and its codepoint in hexadecimal, see Unicode Search 😄