Elisp: Write a Major Mode for Syntax Coloring

By Xah Lee.

Problem

You are writing a major mode for a new language. You want the keywords of the language colored.

mymath mode 2024-05-11

Suppose your language source code looks like this:

Sin[x]^2 + Cos[y]^2 == 1
Pi^2/6 == Sum[1/x^2,{x,1,Infinity}]

You want the words Sin, Cos, and Sum colored as functions, and Pi and Infinity colored as constants.

Solution

Save the following in a file.

;; a simple major mode, mymath-mode

(setq mymath-fontlock
      '(("Sin\\|Cos\\|Sum" . 'font-lock-function-name-face)
        ("Pi\\|Infinity" . 'font-lock-constant-face)))

(define-derived-mode mymath-mode fundamental-mode "mymath"
  "major mode for editing mymath language code."
  (setq font-lock-defaults '(mymath-fontlock)))

Now, open that file in Emacs (or paste the code into a buffer), then Alt+x eval-buffer.

Now, type the following code into a new buffer:

Sin[x]^2 + Cos[y]^2 == 1
Pi^2/6 == Sum[1/x^2,{x,1,Infinity}]

Now, Alt+x mymath-mode. You see the words colored.
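
If you prefer to check the coloring from code rather than by eye, one possible approach is to fontify a temporary buffer and read the face text property. This is a minimal sketch, not part of the original mode: the names mymath-demo-fontlock, mymath-demo-mode, and mymath-demo-face-at are hypothetical copies and helpers made up for this example, and it assumes Emacs 25 or later for font-lock-ensure.

```elisp
;; Hypothetical copies of the definitions above, renamed so this
;; snippet stands alone and does not touch the real mymath-mode.
(defvar mymath-demo-fontlock
  '(("Sin\\|Cos\\|Sum" . 'font-lock-function-name-face)
    ("Pi\\|Infinity" . 'font-lock-constant-face)))

(define-derived-mode mymath-demo-mode fundamental-mode "mymath-demo"
  "Demo copy of mymath-mode."
  (setq font-lock-defaults '(mymath-demo-fontlock)))

;; Hypothetical helper: fontify TEXT and report the face at POS.
(defun mymath-demo-face-at (text pos)
  "Return the face at buffer position POS after fontifying TEXT."
  (with-temp-buffer
    (insert text)
    (mymath-demo-mode)
    (font-lock-ensure) ; force fontification of the whole buffer
    (get-text-property pos 'face)))

;; Buffer positions start at 1.
(mymath-demo-face-at "Sin[x]^2 + Pi" 1)  ; the S of Sin
(mymath-demo-face-at "Sin[x]^2 + Pi" 12) ; the P of Pi
(mymath-demo-face-at "Sin[x]^2 + Pi" 4)  ; the [ , not a keyword
```

The first call should return font-lock-function-name-face, the second font-lock-constant-face, and the third nil.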

How Does it Work

The string "Sin\\|Cos\\|Sum" is a Regular Expression. It matches any one of Sin, Cos, or Sum.

The font-lock-function-name-face is a predefined face. [see Elisp: Font Face]

The define-derived-mode form defines a Major Mode named “mymath-mode”, based on Fundamental Mode, with display name “mymath”.

The line (setq font-lock-defaults '(mymath-fontlock)) sets up the syntax highlighting, using the Font Lock Mode API.
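
To see what the regex matches, you can try it with string-match. The sketch below also shows a caveat: the plain alternation matches inside longer names too. The variable names are hypothetical, and the stricter regex with symbol boundaries is a suggested variation, not part of the mode above.

```elisp
;; The regex from mymath-fontlock, bound to a hypothetical name.
(defvar mymath-demo-function-regex "Sin\\|Cos\\|Sum")

;; string-match returns the index where the match starts, or nil.
(string-match mymath-demo-function-regex "Sin[x]^2")  ; 0
(string-match mymath-demo-function-regex "x + y")     ; nil

;; Caveat: the match can start inside a longer name.
(string-match mymath-demo-function-regex "ArcSin[x]") ; 3

;; A stricter variant: \\_< and \\_> match only at symbol boundaries,
;; so only whole names match.
(defvar mymath-demo-function-regex-strict
  "\\_<\\(?:Sin\\|Cos\\|Sum\\)\\_>")

(string-match mymath-demo-function-regex-strict "ArcSin[x]") ; nil
(string-match mymath-demo-function-regex-strict "Sin[x]^2")  ; 0
```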

Writing a Mode for a Language that Has Hundreds of Keywords

Typically, a language has hundreds of keywords. Writing one regex for them by hand is tedious and error-prone. The function regexp-opt generates an efficient regex from a list of strings.
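
Here is a small sketch of regexp-opt on the three function names from the earlier example. The variable names are hypothetical. The exact text of the generated regex can differ between Emacs versions, so the example checks what it matches instead of printing it.

```elisp
;; Hypothetical names for this demo.
(defvar regexp-opt-demo-names '("Sin" "Cos" "Sum"))

;; The second argument 'words wraps the regex in word boundaries,
;; so it matches whole words only.
(defvar regexp-opt-demo-regex
  (regexp-opt regexp-opt-demo-names 'words))

(defun regexp-opt-demo-find (text)
  "Return the first word in TEXT matched by the demo regex, or nil."
  (when (string-match regexp-opt-demo-regex text)
    (match-string 0 text)))

(regexp-opt-demo-find "1 + Cos[y]") ; "Cos"
(regexp-opt-demo-find "Summary")    ; nil, Sum is only part of a word
```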

Suppose you are writing a mode for the Linden Scripting Language (LSL). LSL has about 553 keywords. First, here's a sample of LSL source code so you get some idea of how we want it colored.

// sample LSL file

// Examples of variable declaration and assignment:
integer score = 0;
string mySay = "i ♥ you";
vector v = <3,4,5>;
list myList= [2,4,7,3];

// Example of defining a function.
// built-in function's names start with “ll” (Linden Library).
integer sum(integer a, integer b)
{
    integer result = a + b;
    return result;
}

default
{
    state_entry()
    {
        llSay(0, mySay);
    }

    touch_start(integer total_number)
    {
        if (score == 1) {
            llSay(0, mySay);
        } else {
            llWhisper(0, "Ouch!");
        }
    }
}

Each type of keyword uses a different color.

Here's the code.

;;; xls-mode.el --- sample major mode for editing LSL. -*- coding: utf-8; lexical-binding: t; -*-

(defvar xls-keywords nil "lsl keywords")
(setq xls-keywords '("break" "default" "do" "else" "for" "if" "return" "state" "while"))

(defvar xls-types nil "lsl types")
(setq xls-types '("float" "integer" "key" "list" "rotation" "string" "vector"))

(defvar xls-constants nil "lsl constants")
(setq xls-constants '("ACTIVE" "AGENT" "ALL_SIDES" "ATTACH_BACK"))

(defvar xls-events nil "lsl events")
(setq xls-events '("state_entry" "touch_start" "attach"))

(defvar xls-functions nil "lsl functions")
(setq xls-functions '("llWhisper" "llSay" "llAddToLandBanList"))

(defvar xls-fontlock nil "list for font-lock-defaults")
(setq xls-fontlock
      (let (xkeywords-regex xtypes-regex xconstants-regex xevents-regex xfunctions-regex)

        ;; generate regex for each category of keywords
        (setq xkeywords-regex (regexp-opt xls-keywords 'words))
        (setq xtypes-regex (regexp-opt xls-types 'words))
        (setq xconstants-regex (regexp-opt xls-constants 'words))
        (setq xevents-regex (regexp-opt xls-events 'words))
        (setq xfunctions-regex (regexp-opt xls-functions 'words))

        ;; note: order matters, because once colored, that part won't change. In general, put longer words first
        (list (cons xtypes-regex 'font-lock-type-face)
              (cons xconstants-regex 'font-lock-constant-face)
              (cons xevents-regex 'font-lock-builtin-face)
              (cons xfunctions-regex 'font-lock-function-name-face)
              (cons xkeywords-regex 'font-lock-keyword-face))))

;;;###autoload
(define-derived-mode xls-mode c-mode "xls mode"
  "Major mode for editing Linden Scripting Language"

  ;; code for syntax highlighting
  (setq font-lock-defaults '((xls-fontlock))))

(add-to-list 'auto-mode-alist '("\\.lsl\\'" . xls-mode))

;; add the mode to the `features' list
(provide 'xls-mode)

;;; xls-mode.el ends here

Now, Alt+x eval-buffer. [see Evaluate Emacs Lisp Code]

Open the LSL language sample file given above, then Alt+x xls-mode. Here's the result:

xls mode 2024-05-11
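
The comment in the code says that order matters. One case where it shows: the keyword “state” is a prefix of the event name “state_entry”. The sketch below, with a hypothetical helper xls-demo-match, assumes the underscore has symbol syntax (as in the standard syntax table). Under that assumption, a regex built with 'words can match “state” inside “state_entry”, which is why the events regex is listed before the keywords regex. Passing 'symbols to regexp-opt is a possible alternative that avoids the partial match.

```elisp
;; Hypothetical helper for this demo.
(defun xls-demo-match (regex text)
  "Return the part of TEXT matched by REGEX, or nil.
Matching uses a syntax table where underscore is a symbol constituent."
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?_ "_" table)
    (with-syntax-table table
      (when (string-match regex text)
        (match-string 0 text)))))

;; Word boundaries: the underscore ends the word, so "state" matches.
(xls-demo-match (regexp-opt '("state") 'words) "state_entry()")
;; "state"

;; Symbol boundaries: "state_entry" is one symbol, so no match.
(xls-demo-match (regexp-opt '("state") 'symbols) "state_entry()")
;; nil

;; The full event name still matches as a whole symbol.
(xls-demo-match (regexp-opt '("state_entry") 'symbols) "state_entry()")
;; "state_entry"
```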

The line:

(provide 'xls-mode)

adds the symbol xls-mode to the list stored in the variable features. That lets (require 'xls-mode) know the feature is already loaded. [see Elisp: provide, require, features]
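
A short sketch of how provide, featurep, and require relate. The feature name xls-demo-feature is hypothetical, chosen so the example does not depend on any file.

```elisp
;; Register a hypothetical feature.
(provide 'xls-demo-feature)

;; featurep tests whether a feature has been provided.
(featurep 'xls-demo-feature) ; t

;; The symbol is now a member of the list in the variable features.
(memq 'xls-demo-feature features) ; non-nil

;; require loads a file only if the feature is not yet provided.
;; Here it is provided, so no file is searched for.
(require 'xls-demo-feature) ; returns xls-demo-feature
```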

Font Lock Mode Basics

For many languages, the text to color is not a fixed set of strings. For example, in XML, tags follow the pattern <xyz>something</xyz>, where xyz can be any name. In such cases you write a regex for the pattern instead of listing words.

For details, see Elisp: Font Lock Mode
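
As a rough illustration of coloring a pattern instead of a fixed word, here is a sketch that colors only the tag name in XML-like text. It uses the font-lock keyword form (REGEX (SUBEXP FACE)), which applies the face to one parenthesized group of the match. The names xmltag-demo-fontlock, xmltag-demo-mode, and xmltag-demo-face-at are hypothetical, and the tag-name regex is a simplification, not a full XML name grammar.

```elisp
;; Group 1 captures the tag name after < or </ .
(defvar xmltag-demo-fontlock
  '(("</?\\([A-Za-z][-A-Za-z0-9_:.]*\\)"
     (1 'font-lock-function-name-face))))

(define-derived-mode xmltag-demo-mode fundamental-mode "xmltag-demo"
  "Demo mode that colors tag names."
  (setq font-lock-defaults '(xmltag-demo-fontlock)))

;; Hypothetical helper: fontify TEXT and report the face at POS.
(defun xmltag-demo-face-at (text pos)
  "Return the face at buffer position POS after fontifying TEXT."
  (with-temp-buffer
    (insert text)
    (xmltag-demo-mode)
    (font-lock-ensure)
    (get-text-property pos 'face)))

(xmltag-demo-face-at "<xyz>something</xyz>" 2)  ; x in the opening tag
(xmltag-demo-face-at "<xyz>something</xyz>" 17) ; x in the closing tag
(xmltag-demo-face-at "<xyz>something</xyz>" 8)  ; inside "something"
```

The first two calls should return font-lock-function-name-face, and the third nil, because only group 1 of the match gets the face.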
