
Unicode in Ruby, Perl, Python, JavaScript, Java, Emacs Lisp, Mathematica

Xah Lee

This page shows which languages support Unicode symbols in variable names (⁖ α = 5), in function names (⁖ φ(4)), or in user-defined operators such as {8} ⊕ {5}.

Of these, only Mathematica supports user-defined operators, and the operator can be a Unicode character.

Ruby

# -*- coding: utf-8 -*-
# ruby

i♥NY = "♥"

def λ n
  n + "美"
end

p λ i♥NY                        # ⇒ "♥美"

For more detail, see: Unicode in Ruby 💎.

Ruby 1.9 supports Unicode in function and variable names, but a constant's name can't start with a non-Latin symbol (because of a normalization problem), and the file must have an encoding marker (such as coding: utf-8).

Thanks to Hleb Valoshka.

Perl

Perl, since about 2010, has good Unicode support. Unicode characters can be used in variable or function names. Example:

# -*- coding: utf-8 -*-
# perl 5.14

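use strict;
use utf8; # necessary if you want to use Unicode in function or var names
binmode(STDOUT, ":encoding(UTF-8)"); # print Unicode without "Wide character" warnings

# processing Unicode string
my $s = 'I ★ you';
$s =~ s/★/♥/; # replace the star with a heart
print "$s\n";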

# variable with Unicode char
my $愛 = 4;
print "$愛\n";

# function with Unicode char
sub f愛 { return 2;}
print f愛();

Detail at: Unicode in Perl.

Python

Python 2.x's Unicode support is not very good, but it does work for processing Unicode in strings.

If you are processing lots of files and one of them contains a bad character or doesn't use the encoding you expected, your Python script dies in the middle, and you don't even know which file or which line caused it unless your code prints file names. If you are processing a few thousand files in a directory with all its subdirectories, good luck finding out which files have already been processed.
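One way to avoid that failure mode is sketched here. It is written for Python 3 and is not from the original test scripts. It catches the decode error per file and records the file name, so a batch run can continue and report what went wrong. The function name `read_all_text` and the default of UTF-8 are illustrative assumptions.

```python
import os

def read_all_text(root_dir, encoding="utf-8"):
    """Walk root_dir and decode every file, collecting failures.

    Returns (texts, failures): texts maps path -> decoded text,
    failures is a list of (path, error message) for files that
    could not be decoded with the given encoding.
    """
    texts = {}
    failures = []
    for dir_path, _dir_names, file_names in os.walk(root_dir):
        for name in sorted(file_names):
            path = os.path.join(dir_path, name)
            try:
                with open(path, encoding=encoding) as f:
                    texts[path] = f.read()
            except UnicodeDecodeError as err:
                # Record which file failed and why, then keep going.
                failures.append((path, str(err)))
    return texts, failures
```

After a run, `failures` lists every file that could not be decoded, and the keys of `texts` show which files were processed successfully.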

Python 2.x does not support Unicode characters in variable or function names.

Python 3 fixed the Unicode problem. Python 3 supports Unicode in variable and function names. Example:

# -*- coding: utf-8 -*-  ← this line is optional
# python 3

def ƒ(n):
    return n+1

α = 4
print(ƒ(α)) # prints 5

Detail at: Unicode in Python.
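Python 3 does not accept every Unicode character in a name: letters are fine, but symbols are not. As a small sketch, the built-in string method `isidentifier()` can be used to check which of the names appearing on this page Python 3 would accept. The list `candidates` is just an illustrative sample.

```python
# Check which of the names used on this page Python 3 accepts as identifiers.
candidates = ["α", "ƒ", "愛", "λ", "f愛", "i♥NY", "♥"]

for name in candidates:
    verdict = "valid" if name.isidentifier() else "not valid"
    print(f"{name!r}: {verdict}")

# Letters such as α, ƒ, 愛 and λ pass; names containing the symbol ♥ do not.
valid_names = [name for name in candidates if name.isidentifier()]
```

So a name like i♥NY, which works in the Ruby example, is rejected by Python 3.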

JavaScript

JavaScript supports Unicode in variable names and function names. As of this writing, all the browsers tested support it: IE8, Firefox 3.6.13, Chrome 8.0.552.237, Safari 5.0.3, Opera 11.00 (tested on Windows Vista).

// -*- coding: utf-8 -*-

var 愛 = 3;

function λ(n) {return n+1;}

alert( λ(愛) );

However, it appears that some characters cannot be used. JavaScript identifiers are limited to letters and similar characters, so a variable named with a symbol such as ♥ generates a syntax error.

Here's a page where you can test it yourself: JavaScript Unicode support test.

Emacs Lisp

For text processing, the most beautiful language with respect to Unicode is Emacs Lisp. In elisp, you don't have to declare any of the Unicode or encoding stuff. You simply write code to process strings or files, without even having to know what encoding they use. Emacs the environment takes care of all that.

Emacs Lisp also supports Unicode in var/function names. For example:

(defun β ()
  "Inserts stuff"
  (interactive)
  (let ((α "♥ 愛 ☯"))
    (insert α)  
    )
  )

(To try the above in Emacs: paste it into an empty file, select it, then call eval-region to make Emacs evaluate it. Now you can press 【Alt+x】, then type β (just copy and paste), and it'll insert “♥ 愛 ☯”.) (See: Emacs and Unicode Tips and Emacs Lisp Basics.)

Java

Java supports Unicode fully, including use in var/class/method names. Example:

class 方 {
    String 北 = "north";
    double π = 3.14159;
}

class UnicodeTest {
    public static void main(String[] arg) {
        方 x1 = new 方();
        System.out.println( x1.北 );
        System.out.println( x1.π );
    }
}

Detail: Java Tutorial: Unicode in Java.

Mathematica

Mathematica (Ɱ) supports Unicode extensively: in variable names, in function names, and in operators you define yourself, where the operator is a Unicode symbol.

Technically, Ɱ source code is ASCII. Characters in Unicode or in Ɱ's own set of math symbols are represented by markup, much like HTML entities. However, the Ɱ editor (the Front End) displays them rendered, and there is an elaborate system for the user to input math symbols.
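The idea of keeping source text pure ASCII and writing other characters as markup can be illustrated with the HTML analogy mentioned above. The following is a minimal sketch in Python using only the standard library. It shows HTML numeric character references, not Mathematica's own notation, and the sample string is an assumption chosen for illustration.

```python
import html

text = "α ⊕ β"

# Encode to pure ASCII, replacing each non-ASCII character with an
# HTML numeric character reference.
ascii_markup = text.encode("ascii", "xmlcharrefreplace").decode("ascii")
print(ascii_markup)

# Decoding the references restores the original characters, which is
# roughly what a rendering editor does for display.
restored = html.unescape(ascii_markup)
print(restored == text)
```

The stored form contains only ASCII characters, yet the round trip loses nothing.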

Linden Scripting Language (LSL)

LSL also supports Unicode in function or variable names. Example:

string aασ∑♥ = "var with Unicode char in name";

string tασ∑♥() { return "function with Unicode char in name";}

default
{
    state_entry()
    {
        llSay(0, "Hello, Avatar!");
    }
    
    touch_start(integer num_detected)
    {
        llSay(0, (string) tασ∑♥() + "; " + (string) aασ∑♥);
    }
}

See: Xah's Linden Scripting Language (LSL) Tutorial and Linden Scripting Language (LSL) Unicode Support.

Why Use Unicode in Variable Names?

See: Programing Style: Variable Naming: English Words Considered Harmful.
