Formal Definition of Systematic Grammar

By Xah Lee. Date: . Last updated: .

Twitter extempore! On formally defining the concept of “systematic grammar”.

  1. over the past 2 decades, i've had an idea about a computer language with a systematic/simple grammar.
  2. Now i know how to define it formally.
  3. Wolfram Language is one specific example of what i'm trying to describe: the idea of a “systematic” grammar.
  4. on the surface, Wolfram Language code is not at all “regular” in the normal English sense of “regular”.
  5. if you look at it, it is extremely complex, perhaps more so than perl.
  6. however, it's systematic. Very regular. But, how on earth can you define this “regularity” or “systematic” or “structured” formally?
  7. the answer: a language whose context-free grammar spec, written in BNF, is itself a linear/regular language.
  8. if you have studied parsers, then you know what i mean.
  9. otherwise, basically it means the language's grammar of grammar is regular!
  10. now, let's dive a bit into the concepts of “simple” grammar and “systematic” grammar.
  11. simple grammar is, for example, idealized LISP, XML, APL, TCL. Look at their code. The syntax is simple.
  12. first, an aside: why do i say “idealized” lisp syntax? because lisp syntax is not really regular in the English sense. Nor is XML.
  13. in lisp, you'd think everything is of the form x or (a b c …). but actually, NOT!
  14. eg in lisp, you have (a . b), '(a b), `(a ,@ b ,c) ;comment. These are all irregular!
  15. in XML, you'd think everything is <f x=b …>…</f>, but there's also <?xml version="1.0" encoding="UTF-8"?>, which breaks regularity.
  16. XML also has <!-- comment --> and <![CDATA[…]]>, which break regularity.
  17. so, idealized LISP is “regular”. For example, in idealized LISP, all's x or nested (x …).
  18. idealized XML is “regular”. For example, in idealized XML, all's <f…>…</f>.
  19. now another aside. In computer science, the term “regular grammar” has a very specific technical meaning, not “regular” in the English sense.
  20. you can look up what “regular grammar” means in computer science. Basically, it means what a classic regex can match. Or, strings sans nesting.
  21. as soon as your lang has parens/brackets that can nest to any depth, its grammar is not regular in the comp sci sense.
  22. the comp sci jargon of “regular grammar” and “regular language” is annoying, because it's a misnomer/misleading.
  23. anyway, what we are talking about here is “regular” in the English sense. Similar to “uniform”, as in no exceptions. Also, close to “simple”.
  24. now, let's talk about the concept of “simple/regular” grammar. For example, idealized lisp syntax, or XML, or APL.
  25. simple grammars are great, but they may not be flexible. eg in lisp, you can't write 3+4/5. When an expression is long, putting it in lisp form is a pain to read.
  26. so, how to keep syntax “regular/simple”, yet flexible? here comes the idea of “systematic” grammar.
  27. let's illustrate an example of systematic grammar, with English!
  28. in english, let's start with subject verb object, for example, i love u, i drink coffee, i go home.
  29. now, create a set of words that can go before a noun. We call these words adjectives. Same with adverbs for verbs.
  30. now, for every noun, we can add “s” to the end to mean “more of it”, and NO EXCEPTIONS.
  31. eg, we have ass=1 ass, asss=more than 1 ass. mouse/mouses, man/mans, wife/wifes
  32. the same way, we SYSTEMATICALLY add inflections and other features, but with NO EXCEPTIONS. You see, just a few simple rules, but sentences can become flexible.
  33. this is what i mean by “systematic” grammar. Given such a lang, the code may not look simple, but the grammar is simple.
  34. and the way to formally define this “systematic grammar” is: a grammar whose grammar is regular in the comp sci sense.
  35. lisp, xml, APL, TCL have somewhat simple grammars, but as far as i know only Wolfram Language has a systematic grammar.
  36. while perl, ruby, C, etc have grammars that are neither simple nor regular in any sense. They are ad hoc grammars. Some come close to simple, eg ruby.
  37. don't confuse “simple” with “familiar”. familiarity comes with exposure. Chinese 中文 is familiar/easy to me, but is not simple.
  38. C syntax and its derivatives, for example Java, C++, JS, are familiar and easy because C made the syntax popular, with wide exposure.
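
The 3+4/5 point in the thread can be sketched in Python. This is only one possible reading of “few simple rules, no exceptions”, not the formal definition itself and not how the Wolfram Language parser actually works. The operator table, the head names (Plus, Subtract, Times, Divide), and the function names are all assumptions made up for illustration. The idea: every infix operator is one row in a flat table, and a single rule turns `x op y` into the uniform nested form `Head[x, y]`.

```python
# Hypothetical operator table: symbol -> (precedence, head name).
# Head names are invented for illustration; they are NOT claimed to be
# the Wolfram Language's actual internal forms.
OPERATORS = {
    "+": (1, "Plus"),
    "-": (1, "Subtract"),
    "*": (2, "Times"),
    "/": (2, "Divide"),
}


def tokenize(text):
    """Split text into names/numbers, operator symbols, and parentheses."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in OPERATORS or ch in "()":
            tokens.append(ch)
            i += 1
        elif ch.isalnum():
            j = i
            while j < len(text) and text[j].isalnum():
                j += 1
            tokens.append(text[i:j])
            i = j
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return tokens


def parse_atom(tokens, pos):
    """Parse a name/number or a parenthesized expression."""
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        expr, pos = parse(tokens, pos + 1, 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("missing closing parenthesis")
        return expr, pos + 1
    if tok in OPERATORS or tok == ")":
        raise ValueError(f"unexpected token {tok!r}")
    return tok, pos + 1


def parse(tokens, pos=0, min_prec=1):
    """Precedence climbing: one rule handles every operator in the table.

    Returns (expression string in Head[...] form, next token position).
    """
    left, pos = parse_atom(tokens, pos)
    while pos < len(tokens) and tokens[pos] in OPERATORS:
        prec, head = OPERATORS[tokens[pos]]
        if prec < min_prec:
            break
        # prec + 1 makes every operator left-associative.
        right, pos = parse(tokens, pos + 1, prec + 1)
        left = f"{head}[{left}, {right}]"
    return left, pos


def to_uniform(text):
    """Convert an infix expression to the uniform nested form."""
    tokens = tokenize(text)
    expr, pos = parse(tokens)
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return expr


print(to_uniform("3+4/5"))    # Plus[3, Divide[4, 5]]
print(to_uniform("(3+4)/5"))  # Divide[Plus[3, 4], 5]
```

In this sketch, adding another left-associative infix operator means adding one row to `OPERATORS`. The parsing rule itself never changes, which is the “no exceptions” flavor the English plural example is getting at: the surface code can look busy, while the description of the syntax stays a flat, uniform list.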

Systematic Grammar