This page shows which languages support Unicode symbols in variable names (for example, α = 5), in function names (for example, φ(4)), or in user-defined operators such as {8} ⊕ {5}.
Of the languages covered here, only Mathematica supports user-defined operators, and the operator can be a Unicode character.
    # -*- coding: utf-8 -*-
    # ruby
    i♥NY = "♥"
    def λ n
      n + "美"
    end
    p λ i♥NY # ⇒ "♥美"
For more detail, see: Unicode in Ruby 💎.
Ruby 1.9 supports Unicode in function and variable names, but a constant's name can't start with a non-Latin symbol (because of a normalization problem), and the file must have an encoding marker (such as coding: utf-8).
Thanks to Hleb Valoshka.
Perl, since about 2010, has good Unicode support. Unicode can be in var or function names. Example:
    # -*- coding: utf-8 -*-
    # perl 5.14
    use strict;
    use utf8; # necessary if you want to use Unicode in function or var names
    binmode(STDOUT, ":encoding(UTF-8)"); # avoids the "Wide character in print" warning

    # processing Unicode string
    my $s = 'I ★ you';
    $s =~ s/★/♥/;
    print "$s\n";

    # variable with Unicode char
    my $愛 = 4;
    print "$愛\n";

    # function with Unicode char
    sub f愛 { return 2;}
    print f愛();
Detail at: Unicode in Perl.
Python 2.x's Unicode support is not very good, but it does work for processing Unicode in strings.
To use a string containing Unicode chars (such as u"α β"), you'll have to declare your source code with a header like #-*- coding: utf-8 -*-. The string itself needs the u prefix, as in u"α β". A regex needs the Unicode flag, as in re.search(r'\.html$',child,re.U). When reading a file you have to do unicode(inF.read(),'utf-8'), and when printing out Unicode you have to do outF.write(outtext.encode('utf-8')).
If you are processing lots of files, and one of the files contains a bad char or doesn't use the encoding you expected, your Python script chokes dead in the middle, and you don't even know which file or which line it was unless your code prints file names. If you are processing a few thousand files in a dir with all sub-dirs, good luck finding out which files have already been processed.
Python 2.x does not support Unicode chars in var or function names.
Python 3 fixed the Unicode problem. Python 3 supports Unicode in var names. Example:
    # -*- coding: utf-8 -*- ← this line is optional
    # python 3
    def ƒ(n):
        return n+1

    α = 4
    print(ƒ(α)) # prints 5
Detail at: Unicode in Python.
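The batch-processing failure described above (a script dying on one badly encoded file without saying which one) can be guarded against by decoding each file inside a try block and recording the failures. What follows is a minimal Python 3 sketch, not code from the original page: the helper name find_undecodable_files and the assumption that every file is supposed to be UTF-8 are hypothetical choices made for illustration.

```python
import os
import tempfile


def find_undecodable_files(root, encoding="utf-8"):
    """Walk `root` and list files whose bytes do not decode as `encoding`.

    Returns (path, byte_offset) pairs, where byte_offset is the position of
    the first undecodable byte. Hypothetical helper, for illustration only.
    """
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:  # read raw bytes, no decoding yet
                data = f.read()
            try:
                data.decode(encoding)
            except UnicodeDecodeError as err:
                # Record the failure and keep going instead of dying mid-run.
                failures.append((path, err.start))
    return failures


if __name__ == "__main__":
    # Self-contained demo: one valid UTF-8 file and one with a stray byte.
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "good.txt"), "wb") as f:
            f.write("I ♥ you".encode("utf-8"))
        with open(os.path.join(tmp, "bad.txt"), "wb") as f:
            f.write(b"abc\xff")  # 0xff is never valid in UTF-8
        for path, offset in find_undecodable_files(tmp):
            print(f"{os.path.basename(path)}: bad byte at offset {offset}")
```

Because the loop collects failures rather than raising, a run over a few thousand files finishes and leaves a list of exactly which files need attention.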
JavaScript supports Unicode in var names and function names. As of today, all browsers support it: IE8, Firefox 3.6.13, Chrome 8.0.552.237, Safari 5.0.3, Opera 11.00 (tested on Windows Vista).
    // -*- coding: utf-8 -*-
    var 愛 = 3;
    function λ(n) {return n+1;}
    alert( λ(愛) );
However, it appears that some chars cannot be used. For example, a variable with name ♥ generates errors.
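A likely explanation is that identifier rules are written in terms of Unicode character categories: letters such as 愛 and λ are allowed, while ♥ is classified as a symbol, not a letter. The sketch below uses Python 3 as a stand-in to inspect those categories. Python's identifier rules are similar to JavaScript's but not identical, so treat the True/False column as Python's verdict only; the helper name describe_identifier_candidate is hypothetical.

```python
import unicodedata


def describe_identifier_candidate(name):
    """Return (valid_as_python_identifier, [(char, general_category), ...])."""
    categories = [(ch, unicodedata.category(ch)) for ch in name]
    return name.isidentifier(), categories


for candidate in ["愛", "λ", "♥"]:
    ok, cats = describe_identifier_candidate(candidate)
    print(candidate, ok, cats)
# 愛 True [('愛', 'Lo')]   Lo = Letter, other
# λ True [('λ', 'Ll')]    Ll = Letter, lowercase
# ♥ False [('♥', 'So')]   So = Symbol, other
```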
Here's a page you can test yourself: javascript Unicode support test.
For text processing, the most beautiful lang with respect to Unicode is Emacs Lisp. In elisp, you don't have to declare any of the Unicode or encoding stuff. You simply write code to process strings or files, without even having to know what encoding they use. Emacs the environment takes care of all that.
Emacs Lisp also supports Unicode in var/function names. For example:
    (defun β ()
      "Inserts stuff"
      (interactive)
      (let ((α "♥ 愛 ☯"))
        (insert α)))
(To try the above in Emacs: paste it into an empty file, select it, then call eval-region to make Emacs eval it. Now you can press 【Alt+x】, then type β (just copy and paste it), and it'll insert “♥ 愛 ☯”.)
(See: Emacs and Unicode Tips ◇ Emacs Lisp Basics.)
Java supports Unicode fully, including use in var/class/method names. Example:
    class 方 {
        String 北 = "north";
        double π = 3.14159;
    }

    class UnicodeTest {
        public static void main(String[] arg) {
            方 x1 = new 方();
            System.out.println( x1.北 );
            System.out.println( x1.π );
        }
    }
Detail: Java Tutorial: Unicode in Java.
Mathematica (Ɱ) supports Unicode extensively: in variable names, in function names, and in operators you define yourself, where the operator is a Unicode symbol.
Technically, Ɱ source code is ASCII. Characters in Unicode or in Ɱ's own set of math symbols are represented by markup, much like HTML entities. However, Ɱ's editor (the Front End) displays them rendered, and there's an elaborate system for the user to input math symbols.
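To make the "ASCII source, rendered display" idea concrete: in Ɱ the markup for α looks like \[Alpha]. The sketch below is only a loose analogy in Python 3, not Mathematica's mechanism. It shows ASCII-only spellings of the same character, plus a round trip between rendered text and an ASCII-only form. The helper names to_ascii_source and from_ascii_source are hypothetical.

```python
import html

# Three ASCII-only spellings of the same character, U+03B1 (α).
from_name = "\N{GREEK SMALL LETTER ALPHA}"  # Python named escape
from_code = "\u03b1"                         # Python code-point escape
from_entity = html.unescape("&alpha;")       # HTML entity


def to_ascii_source(text):
    """Rewrite `text` using only ASCII, escaping other chars as \\uXXXX."""
    return text.encode("unicode_escape").decode("ascii")


def from_ascii_source(escaped):
    """Inverse of to_ascii_source: turn the escapes back into characters."""
    return escaped.encode("ascii").decode("unicode_escape")


print(from_name == from_code == from_entity)         # True
print(to_ascii_source("α = 5"))                      # \u03b1 = 5
print(from_ascii_source(to_ascii_source("α = 5")))   # α = 5
```

The stored form stays plain ASCII, and an editor is free to show the rendered character, which is the division of labor the paragraph above describes for the Front End.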
LSL also supports Unicode in function or variable names. Example:
    string aασ∑♥ = "var with Unicode char in name";

    string tασ∑♥() {
        return "function with Unicode char in name";
    }

    default {
        state_entry() {
            llSay(0, "Hello, Avatar!");
        }
        touch_start(integer num_detected) {
            llSay(0, (string) tασ∑♥() + "; " + (string) aασ∑♥);
        }
    }
See: Xah's Linden Scripting Language (LSL) Tutorial ◇ Linden Scripting Language (LSL) Unicode Support.
See: Programing Style: Variable Naming: English Words Considered Harmful.