Elisp: Write a Major Mode for Syntax Coloring

By Xah Lee.

Problem

You are writing a major mode for a new language, and you want the language's keywords syntax colored.

Suppose your language source code looks like this:

```Sin[x]^2 + Cos[y]^2 == 1
Pi^2/6 == Sum[1/x^2,{x,1,Infinity}]```

You want the words Sin, Cos, and Sum colored as functions, and Pi and Infinity colored as constants.

Solution

Save the following in a file.

```;; a simple major mode, mymath-mode

(setq mymath-fontlock
      '(("Sin\\|Cos\\|Sum" . 'font-lock-function-name-face)
        ("Pi\\|Infinity" . 'font-lock-constant-face)))

(define-derived-mode mymath-mode fundamental-mode "mymath"
  "major mode for editing mymath language code."
  (setq font-lock-defaults '(mymath-fontlock)))```

Now, copy and paste the above code into a buffer, then Alt+x `eval-buffer`.

Now, type the following code into a new buffer:

```Sin[x]^2 + Cos[y]^2 == 1
Pi^2/6 == Sum[1/x^2,{x,1,Infinity}]```

Now, Alt+x `mymath-mode`. You will see the words colored.

How Does it Work

The string `"Sin\\|Cos\\|Sum"` is a regular expression that matches Sin, Cos, or Sum.
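To see what that regex matches, here is a minimal sketch you can evaluate. The function name `mymath-demo-match` is made up for this example. In a Lisp string, `\\|` is the regex "or" operator written with an escaped backslash. Note that the regex has no word boundaries, so it also matches inside a longer word such as "Sine".

```elisp
;; Hypothetical helper, for illustration only.
(defun mymath-demo-match (text)
  "Return the index where the function-name regex first matches TEXT, or nil."
  (let ((case-fold-search nil))   ; match case-sensitively, as font-lock does by default
    (string-match "Sin\\|Cos\\|Sum" text)))

(mymath-demo-match "Sin[x]^2")     ; 0
(mymath-demo-match "1 + Cos[y]")   ; 4
(mymath-demo-match "Tan[x]")       ; nil
(mymath-demo-match "Sine")         ; 0, because the regex has no word boundary
```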

The `font-lock-function-name-face` is a predefined face. [see Elisp: Font Face]

The line `define-derived-mode` defines a major mode named `mymath-mode`, based on Fundamental Mode, with the display name “mymath”.

The line `(setq font-lock-defaults '(mymath-fontlock))` sets up the syntax highlighting, using the Font Lock Mode API. The first element of `font-lock-defaults` names the variable that holds the coloring rules.
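If you want to check the coloring from code rather than by eye, the following is one possible sketch. It repeats the definitions above under made-up names (`mymath-demo-fontlock`, `mymath-demo-mode`, `mymath-demo-face-at`) so that it can be evaluated on its own. It fontifies a temporary buffer with `font-lock-ensure`, then reads the `face` text property at a given position.

```elisp
;; Demo copies of the definitions above; the -demo names are hypothetical.
(defvar mymath-demo-fontlock
  '(("Sin\\|Cos\\|Sum" . 'font-lock-function-name-face)
    ("Pi\\|Infinity" . 'font-lock-constant-face))
  "Coloring rules for the demo mode.")

(define-derived-mode mymath-demo-mode fundamental-mode "mymath-demo"
  "Demo copy of mymath-mode."
  (setq font-lock-defaults '(mymath-demo-fontlock)))

(defun mymath-demo-face-at (text pos)
  "Return the face at buffer position POS after fontifying TEXT."
  (with-temp-buffer
    (insert text)
    (mymath-demo-mode)
    (font-lock-ensure)               ; force fontification of the whole buffer
    (get-text-property pos 'face)))

(mymath-demo-face-at "Sin[x]^2 + Pi" 1)    ; font-lock-function-name-face
(mymath-demo-face-at "Sin[x]^2 + Pi" 12)   ; font-lock-constant-face
(mymath-demo-face-at "Sin[x]^2 + Pi" 4)    ; nil, the bracket is not colored
```

Buffer positions start at 1, so position 1 is the S of Sin and position 12 is the P of Pi.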

Writing a Mode for a Language that Has Hundreds of Keywords

Typically, a language has hundreds of keywords, too many to write as a regex by hand. The Elisp function `regexp-opt` generates a regex from a list of keyword strings.
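As a small sketch of generating a regex from a keyword list, the following uses `regexp-opt` with the `'words` argument, which wraps the result in word boundaries. The variable names `xdemo-keywords` and `xdemo-keywords-regex` and the short keyword list are made up for this example.

```elisp
;; Hypothetical short keyword list, for illustration only.
(defvar xdemo-keywords '("if" "else" "for" "while")
  "Sample keywords.")

(defvar xdemo-keywords-regex (regexp-opt xdemo-keywords 'words)
  "Regex matching any of `xdemo-keywords' as a whole word.")

(let ((case-fold-search nil))
  (list
   (string-match xdemo-keywords-regex "x = 1; while (x) {}")   ; 7
   (string-match xdemo-keywords-regex "iffy elsewhere")))       ; nil
```

The second call returns nil because "if" and "else" appear only as parts of longer words.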

Suppose you are writing a mode for the Linden Scripting Language (LSL). LSL has about 553 keywords. First, here's a sample of LSL source code so you get some idea of how we want it colored.

```// sample LSL file

// Examples of variable declaration and assignment:
integer score = 0;
string mySay = "i ♥ you";
vector v = <3,4,5>;
list myList= [2,4,7,3];

// Example of defining a function.
integer sum(integer a, integer b)
{
    integer result = a + b;
    return result;
}

default
{
    state_entry()
    {
        llSay(0, mySay);
    }

    touch_start(integer total_number)
    {
        if (score == 1) {
            llSay(0, mySay);
        } else {
            llWhisper(0, "Ouch!");
        }
    }
}```

Each type of keyword uses a different color.

Here's the code.

```;;; xls-mode.el --- sample major mode for editing LSL. -*- coding: utf-8; lexical-binding: t; -*-

(defvar xls-keywords nil "lsl keywords")
(setq xls-keywords '("break" "default" "do" "else" "for" "if" "return" "state" "while"))

(defvar xls-types nil "lsl types")
(setq xls-types '("float" "integer" "key" "list" "rotation" "string" "vector"))

(defvar xls-constants nil "lsl constants")
(setq xls-constants '("ACTIVE" "AGENT" "ALL_SIDES" "ATTACH_BACK"))

(defvar xls-events nil "lsl events")
(setq xls-events '("state_entry" "touch_start" "attach"))

(defvar xls-functions nil "lsl functions")
(setq xls-functions '("llSay" "llWhisper"))

(defvar xls-fontlock nil "list for font-lock-defaults")
(setq xls-fontlock
      (let (xkeywords-regex xtypes-regex xconstants-regex xevents-regex xfunctions-regex)

        ;; generate regex for each category of keywords
        (setq xkeywords-regex (regexp-opt xls-keywords 'words))
        (setq xtypes-regex (regexp-opt xls-types 'words))
        (setq xconstants-regex (regexp-opt xls-constants 'words))
        (setq xevents-regex (regexp-opt xls-events 'words))
        (setq xfunctions-regex (regexp-opt xls-functions 'words))

        ;; note: order matters, because once text is colored, a later rule won't recolor it. In general, put longer words first
        (list (cons xtypes-regex 'font-lock-type-face)
              (cons xconstants-regex 'font-lock-constant-face)
              (cons xevents-regex 'font-lock-builtin-face)
              (cons xfunctions-regex 'font-lock-function-name-face)
              (cons xkeywords-regex 'font-lock-keyword-face))))

(define-derived-mode xls-mode c-mode "xls mode"
  "Major mode for editing Linden Scripting Language"

  ;; code for syntax highlighting
  (setq font-lock-defaults '((xls-fontlock))))

;; add the mode to the `features' list
(provide 'xls-mode)

;;; xls-mode.el ends here
```
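The note in the code about rule order can be checked with a small experiment. The sketch below is not part of xls-mode. All `xls-demo-` names are made up, and it assumes the standard syntax table, where the underscore is not a word character, so the keyword "state" also matches at the start of "state_entry". It fontifies the text "state_entry" with two rules in either order and returns the faces found on "state" and on "entry".

```elisp
(defvar xls-demo-rules nil
  "Hypothetical variable holding the font-lock rules under test.")

(defun xls-demo-faces (rules)
  "Fontify the text \"state_entry\" with RULES.
Return a list of the face on \"state\" and the face on \"entry\"."
  (let ((xls-demo-rules rules))
    (with-temp-buffer
      (insert "state_entry")
      ;; second element t means keywords only, no string or comment coloring
      (setq-local font-lock-defaults '(xls-demo-rules t))
      (font-lock-ensure)
      (list (get-text-property 1 'face)
            (get-text-property 7 'face)))))

(defvar xls-demo-event-rule
  (cons (regexp-opt '("state_entry") 'words) 'font-lock-builtin-face))

(defvar xls-demo-keyword-rule
  (cons (regexp-opt '("state") 'words) 'font-lock-keyword-face))

;; longer word first: the whole event name gets one face
(xls-demo-faces (list xls-demo-event-rule xls-demo-keyword-rule))
;; (font-lock-builtin-face font-lock-builtin-face)

;; shorter word first: "state" is colored as a keyword, and the event
;; rule then leaves the already colored match alone
(xls-demo-faces (list xls-demo-keyword-rule xls-demo-event-rule))
;; (font-lock-keyword-face nil)
```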

Now, Alt+x `eval-buffer`. [see Evaluate Emacs Lisp Code]

Open the LSL sample file given above, then Alt+x `xls-mode`. Each category of keyword is now shown in its own color.

The line:

`(provide 'xls-mode)`

adds the symbol `xls-mode` to the list held in the variable `features`. [see Elisp: provide, require, features]
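A quick way to see the effect of `provide`, using a made-up feature name `xls-demo-feature` so that it does not interfere with the real mode:

```elisp
(provide 'xls-demo-feature)                     ; hypothetical feature name

(featurep 'xls-demo-feature)                    ; t
(if (memq 'xls-demo-feature features) t nil)    ; t, the symbol is in the list

;; Because the feature is already present, require returns right away
;; without looking for a file to load.
(require 'xls-demo-feature)
```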

Font Lock Mode Basics

For many languages, syntax coloring is not based on a fixed set of strings. For example, in XML, you have the pattern `<xyz>something</xyz>`, where xyz can be anything.
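As a rough sketch of how such a pattern could be colored, a font-lock rule can name a regex group instead of the whole match. The rule below colors only the tag name, which is group 1. The names `xtag-demo-fontlock` and `xtag-demo-face-at` and the regex itself are assumptions made for this example, not a complete XML matcher.

```elisp
(defvar xtag-demo-fontlock
  '(("</?\\([[:alpha:]][[:alnum:]_-]*\\)>" 1 'font-lock-function-name-face))
  "Hypothetical rule: color the name inside <name> or </name>.")

(defun xtag-demo-face-at (text pos)
  "Return the face at buffer position POS after fontifying TEXT."
  (with-temp-buffer
    (insert text)
    ;; second element t means keywords only, no string or comment coloring
    (setq-local font-lock-defaults '(xtag-demo-fontlock t))
    (font-lock-ensure)
    (get-text-property pos 'face)))

(xtag-demo-face-at "<xyz>something</xyz>" 2)    ; font-lock-function-name-face
(xtag-demo-face-at "<xyz>something</xyz>" 6)    ; nil, the text between tags
(xtag-demo-face-at "<xyz>something</xyz>" 17)   ; font-lock-function-name-face
```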

For details, see Elisp: Font Lock Mode