Clojure Instaparse Parser Tutorial: Hide Tokens
Instaparse can hide tokens or entire grammar rules in the output. This is useful for parts of the input that carry no meaning, such as the spaces in 1 + 2.
To hide something, wrap it in angle brackets <…> in the grammar.
Here's an example without hiding:
(ns example.core
  (:require [instaparse.core :as insta]))

(def qq
  (insta/parser
   "S = DIGIT (SPACES '+' SPACES DIGIT)*;
    SPACES = ' '*;
    DIGIT = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9';"))

(qq "1")
;; output
;; [:S [:DIGIT "1"]]

(qq "3+4")
;; output
;; [:S [:DIGIT "3"] [:SPACES] "+" [:SPACES] [:DIGIT "4"]]

(qq "3 +4")
;; output
;; [:S [:DIGIT "3"] [:SPACES " "] "+" [:SPACES] [:DIGIT "4"]]

(qq "3 + 4")
;; output
;; [:S [:DIGIT "3"] [:SPACES " "] "+" [:SPACES " "] [:DIGIT "4"]]

(qq "3+4+2")
;; output
;; [:S [:DIGIT "3"] [:SPACES] "+" [:SPACES] [:DIGIT "4"] [:SPACES] "+" [:SPACES] [:DIGIT "2"]]
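Here's an example with hiding. The angle brackets go around the rule name on the left-hand side, which hides the :SPACES tag but keeps the matched text: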
(def q2
  (insta/parser
   "S = DIGIT (SPACES '+' SPACES DIGIT)*;
    <SPACES> = ' '*;
    DIGIT = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9';"))

(q2 "3+4")
;; output
;; [:S [:DIGIT "3"] "+" [:DIGIT "4"]]

(q2 "3 +4")
;; output
;; [:S [:DIGIT "3"] " " "+" [:DIGIT "4"]]

(q2 "3 + 4")
;; output
;; [:S [:DIGIT "3"] " " "+" " " [:DIGIT "4"]]
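Hiding can also be applied to an element on the right-hand side of a rule, whether a rule reference or a token (terminal symbol). Everything that element matched is then dropped from the output: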
(def q3
  (insta/parser
   "S = DIGIT (<SPACES> '+' <SPACES> DIGIT)*;
    SPACES = ' '*;
    DIGIT = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9';"))

(q3 "3+4")
;; output
;; [:S [:DIGIT "3"] "+" [:DIGIT "4"]]

(q3 "3 +4")
;; output
;; [:S [:DIGIT "3"] "+" [:DIGIT "4"]]

(q3 "3 + 4")
;; output
;; [:S [:DIGIT "3"] "+" [:DIGIT "4"]]
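One practical consequence of hiding whitespace is that inputs differing only in spacing produce the same tree, so code that walks the tree needs no whitespace handling. The following is a minimal sketch of that idea. The namespace example.hide-spaces and the helper sum-digits are hypothetical names used only for illustration, and the helper assumes the parse succeeded.

```clojure
(ns example.hide-spaces
  (:require [instaparse.core :as insta]))

;; Same grammar as q3: SPACES is hidden where it is used.
(def q3
  (insta/parser
   "S = DIGIT (<SPACES> '+' <SPACES> DIGIT)*;
    SPACES = ' '*;
    DIGIT = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9';"))

;; Spacing differences no longer show up in the parse tree.
(= (q3 "3+4") (q3 "3 + 4"))
;; => true

;; Hypothetical helper (not part of Instaparse): adds up the DIGIT nodes.
;; Assumes `tree` is a successful parse result of the grammar above.
(defn sum-digits [tree]
  (->> (rest tree)                            ; drop the :S tag
       (remove string?)                       ; drop the "+" tokens
       (map (fn [[_ d]] (Long/parseLong d)))  ; [:DIGIT "3"] -> 3
       (reduce +)))

(sum-digits (q3 "3 + 4 +2"))
;; => 9
```

Had the spaces been left in the tree, sum-digits would also have to skip the [:SPACES …] nodes.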
Example 2
Here's an example without hiding:
(def h1
  (insta/parser
   "A = '(' B ')';
    B = ('x' | 'y')*"))

(h1 "(xyx)")
;; output
;; [:A "(" [:B "x" "y" "x"] ")"]
Here's an example with hiding:
(def h2
  (insta/parser
   "A = <'('> B <')'>;
    B = ('x' | 'y')*"))

(h2 "(xyx)")
;; output
;; [:A [:B "x" "y" "x"]]
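The two kinds of hiding can be combined in one grammar. The sketch below hides the parentheses on the right-hand side and also hides the B tag on the left-hand side, so the matched letters are spliced directly into the parent node. The names example.hide-both and h3 are made up for this illustration.

```clojure
(ns example.hide-both
  (:require [instaparse.core :as insta]))

;; <'('> and <')'> drop the parentheses from the output.
;; <B> on the left-hand side drops only the :B tag, keeping its content.
(def h3
  (insta/parser
   "A = <'('> B <')'>;
    <B> = ('x' | 'y')*"))

(h3 "(xyx)")
;; output
;; [:A "x" "y" "x"]
```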
Unhide Parse Tree Elements
You can unhide parse tree elements by passing one of the following options when calling the parser:
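:unhide :content
- unhide content that was hidden with <…> on the right-hand side of a rule.
:unhide :tags
- unhide rule tags that were hidden with <…> on the left-hand side of a rule.
:unhide :all
- unhide both content and tags.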
(def p4
  (insta/parser
   "S = DIGIT (<SPACES> '+' <SPACES> DIGIT)*;
    <SPACES> = ' '*;
    DIGIT = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9';"))

(p4 "3 +4")
;; output
;; [:S [:DIGIT "3"] "+" [:DIGIT "4"]]

(p4 "3 +4" :unhide :content)
;; output
;; [:S [:DIGIT "3"] " " "+" [:DIGIT "4"]]

(p4 "3 +4" :unhide :tags)
;; output
;; [:S [:DIGIT "3"] "+" [:DIGIT "4"]]

(p4 "3 +4" :unhide :all)
;; output
;; [:S [:DIGIT "3"] [:SPACES " "] "+" [:SPACES] [:DIGIT "4"]]
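The same options can be passed through insta/parse, which takes the parser as its first argument. The sketch below shows that form, plus a small debugging helper that returns the normal and the fully unhidden tree side by side. The namespace example.unhide and the helper debug-parse are hypothetical and not part of Instaparse.

```clojure
(ns example.unhide
  (:require [instaparse.core :as insta]))

;; Same grammar as p4: SPACES is hidden on both sides.
(def p4
  (insta/parser
   "S = DIGIT (<SPACES> '+' <SPACES> DIGIT)*;
    <SPACES> = ' '*;
    DIGIT = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9';"))

;; insta/parse accepts the same keyword options as calling the parser directly.
(= (insta/parse p4 "3 +4" :unhide :all)
   (p4 "3 +4" :unhide :all))
;; => true

;; Hypothetical helper: pairs the normal tree with the fully unhidden one.
(defn debug-parse [parser text]
  {:hidden   (parser text)
   :unhidden (parser text :unhide :all)})

(debug-parse p4 "3 +4")
```

Because :unhide is given per call, the grammar itself stays unchanged while debugging.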