What's the Difference Between BNF, EBNF, ABNF?

By Xah Lee.

warning: work in progress.

BNF is the original and the simplest. It is mostly used in academic papers and other theoretical contexts, for communicating with humans (as opposed to being read by a compiler or parser).

EBNF means Extended BNF. There's not one single EBNF, but many.

ABNF (Augmented BNF) is a rather different format from BNF, but is more standardized (it is specified in RFC 5234 and used in IETF protocol specifications). It is harder to read, but is the one most used in parsers.

In terms of power, they are all equivalent.

They differ only in syntax. For example, in traditional BNF, the separator between the left-hand side and the right-hand side is ::=, but books often use →. In EBNF and ABNF it's =.

Another example: in traditional BNF, nonterminals are written with angle brackets around them, such as <EXPR>, and terminals are just plain characters.

In ABNF, nonterminals are plain, and terminals are bracketed with double quotes, like this "+".

In BNF, the symbol for alternatives is a vertical line |. In ABNF, the symbol for alternatives is a slash /.
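
The notational differences above are mechanical enough to apply with a few lines of code. Here is a minimal Python sketch that rewrites one BNF rule in ABNF-style notation. The function name bnf_to_abnf is made up for illustration, and the sketch assumes the BNF input already writes its terminals in double quotes and its nonterminals as <name>.

```python
import re

def bnf_to_abnf(rule: str) -> str:
    """Rewrite one BNF rule in ABNF-style notation.

    Assumption: terminals are double-quoted in the input and
    nonterminals are written as <name>.
    """
    # Split into quoted terminals, <nonterminals>, operators, and anything else.
    tokens = re.findall(r'"[^"]*"|<[^<>]+>|::=|\||\S+', rule)
    out = []
    for tok in tokens:
        if tok == "::=":
            out.append("=")          # lhs/rhs separator
        elif tok == "|":
            out.append("/")          # alternatives
        elif tok.startswith("<") and tok.endswith(">"):
            out.append(tok[1:-1])    # nonterminals lose their angle brackets
        else:
            out.append(tok)          # quoted terminals stay as they are
    return " ".join(out)

print(bnf_to_abnf('<expr> ::= <term> | <expr> "+" <term>'))
# prints: expr = term / expr "+" term
```

Quoted terminals are matched first, so a terminal such as "|" is left alone and is not mistaken for the alternatives symbol.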

EBNF and ABNF also feature shortcut syntax, such as specifying 0 or more repetitions of a nonterminal or terminal. To translate such a rule to BNF, you need to introduce extra rules and nonterminals.

In general, BNF notation is good for teaching, explanation, and theoretical discussion. It is simple. EBNF and especially ABNF are more often used to actually specify a grammar that is read by a parser.

example BNF:

postal-address ::= name-part street-address zip-part
name-part ::= personal-part last-name opt-suffix-part EOL | personal-part name-part
personal-part ::= first-name | initial "."
street-address ::= house-num street-name opt-apt-num EOL
zip-part ::= town-name "," state-code ZIP-code EOL
opt-suffix-part ::= "Sr." | "Jr." | roman-numeral | ""
opt-apt-num ::= apt-num | ""

note: this example is incomplete. For example, first-name and last-name are not defined.

example from Backus–Naur Form

Extended Backus Naur Form

There are several variants of “Extended Backus Naur Form”. Here's an example describing a simplified Pascal syntax:

(* a simple program syntax in EBNF − Wikipedia *)
program = 'PROGRAM', white space, identifier, white space,
           'BEGIN', white space,
           { assignment, ";", white space },
           'END.' ;
identifier = alphabetic character, { alphabetic character | digit } ;
number = [ "-" ], digit, { digit } ;
string = '"' , { all characters - '"' }, '"' ;
assignment = identifier , ":=" , ( number | identifier | string ) ;
alphabetic character = "A" | "B" | "C" | "D" | "E" | "F" | "G"
                     | "H" | "I" | "J" | "K" | "L" | "M" | "N"
                     | "O" | "P" | "Q" | "R" | "S" | "T" | "U"
                     | "V" | "W" | "X" | "Y" | "Z" ;
digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;
white space = ? white space characters ? ;
all characters = ? all visible characters ? ;

here's Pascal code described by it:

PROGRAM DEMO1
 BEGIN
   A:=3;
   B:=45;
   H:=-100023;
   C:=A;
   D123:=B34A;
   BABOON:=GIRAFFE;
   TEXT:="Hello world!";
 END.
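
Because this grammar has no recursive rules, it can be checked with a single regular expression. The Python sketch below transcribes each rule into a regex fragment and tests the program above against it. Two things are assumptions, since the grammar leaves them as special sequences: "white space" is read as one or more whitespace characters, and "all characters" is read as anything except a double quote or a newline. The name is_valid_program is made up for illustration.

```python
import re

WS = r"\s+"                   # white space (assumption: one or more whitespace chars)
IDENT = r"[A-Z][A-Z0-9]*"     # identifier = alphabetic character, { alphabetic character | digit }
NUMBER = r"-?[0-9]+"          # number = [ "-" ], digit, { digit }
STRING = r'"[^"\n]*"'         # string (assumption: any char except '"' and newline)
ASSIGN = rf"{IDENT}:=(?:{NUMBER}|{IDENT}|{STRING})"

# program = 'PROGRAM', white space, identifier, white space,
#           'BEGIN', white space, { assignment, ";", white space }, 'END.'
PROGRAM = re.compile(rf"PROGRAM{WS}{IDENT}{WS}BEGIN{WS}(?:{ASSIGN};{WS})*END\.")

def is_valid_program(source: str) -> bool:
    """Return True if the whole text matches the program rule."""
    return PROGRAM.fullmatch(source) is not None

sample = '''PROGRAM DEMO1
 BEGIN
   A:=3;
   B:=45;
   H:=-100023;
   C:=A;
   D123:=B34A;
   BABOON:=GIRAFFE;
   TEXT:="Hello world!";
 END.'''

print(is_valid_program(sample))                        # True
print(is_valid_program("PROGRAM X BEGIN a:=1; END."))  # False: lowercase identifier
```

This only works because no rule refers to itself. A grammar with nested expressions would need a real parser.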

Usage             Notation
definition        =
concatenation     ,
termination       ;
alternation       |
option            [ ... ]
repetition        { ... }
grouping          ( ... )
terminal string   " ... "
terminal string   ' ... '
comment           (* ... *)
special sequence  ? ... ?
exception         -

note:

Terminals are enclosed in a pair of double quotes " or single quotes '. Everything else is a nonterminal (except the special symbols).

some characters and their meanings:

Advantages over BNF

Any grammar defined in EBNF can also be represented in BNF though representations in the latter are generally lengthier. e.g., options and repetitions cannot be directly expressed in BNF and require the use of an intermediate rule or alternative production defined to be either nothing or the optional production for option, or either the repeated production of itself, recursively, for repetition. The same constructs can still be used in EBNF.
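
The translation described above can be sketched in Python. The data layout here is an assumption made for illustration: a grammar is a dict from nonterminal name to a list of alternatives, each alternative is a list of items, and an item is either a symbol (a string) or a tuple ("opt", items) or ("rep", items) standing for [ ... ] or { ... }. The helper-rule names such as opt-1 are generated and hypothetical.

```python
from itertools import count

def desugar(rules):
    """Expand EBNF option/repetition items into plain BNF rules."""
    out = {}
    fresh = count(1)

    def expand(items):
        result = []
        for item in items:
            if isinstance(item, str):
                result.append(item)
                continue
            kind, body = item
            if kind not in ("opt", "rep"):
                raise ValueError(f"unknown item kind: {kind}")
            name = f"{kind}-{next(fresh)}"   # generated helper nonterminal
            inner = expand(body)
            if kind == "opt":
                # [ x ]  becomes  helper ::= x | ""
                out[name] = [inner, []]
            else:
                # { x }  becomes  helper ::= x helper | ""
                out[name] = [inner + [name], []]
            result.append(name)
        return result

    for lhs, alternatives in rules.items():
        out[lhs] = [expand(alt) for alt in alternatives]
    return out

def show(rules):
    """Format rules as BNF text, writing the empty alternative as ""."""
    lines = []
    for lhs, alternatives in rules.items():
        rhs = " | ".join(" ".join(alt) if alt else '""' for alt in alternatives)
        lines.append(f"{lhs} ::= {rhs}")
    return "\n".join(lines)

# number = [ "-" ], digit, { digit } ;
ebnf = {"number": [[("opt", ['"-"']), "digit", ("rep", ["digit"])]]}
print(show(desugar(ebnf)))
# prints:
# opt-1 ::= "-" | ""
# rep-2 ::= digit rep-2 | ""
# number ::= opt-1 digit rep-2
```

One EBNF rule became three BNF rules, which is the "generally lengthier" cost mentioned above.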

The BNF uses the symbols (<, >, |, ::=) for itself, but does not include quotes around terminal strings. This prevents these characters from being used in the languages, and requires a special symbol for the empty string. In EBNF, terminals are strictly enclosed within quotation marks (“…” or ‘…’). The angle brackets (“<…>”) for nonterminals can be omitted.

BNF syntax can only represent a rule in one line, whereas in EBNF a terminating character, the semicolon, marks the end of a rule.

Furthermore, EBNF includes mechanisms for enhancements, defining the number of repetitions, excluding alternatives, comments, etc.

from Extended Backus–Naur Form

note: the majority of so-called EBNF variants are sloppy, meant for communicating with humans. They have lots of ambiguities and are not well defined.

Augmented Backus–Naur Form

this is the worst lot. ABNF is not really human friendly. Example:

telephone-uri        = "tel:" telephone-subscriber
telephone-subscriber = global-number / local-number
global-number        = global-number-digits *par
local-number         = local-number-digits *par context *par
par                  = parameter / extension / isdn-subaddress
isdn-subaddress      = ";isub=" 1*uric
extension            = ";ext=" 1*phonedigit
context              = ";phone-context=" descriptor
descriptor           = domainname / global-number-digits
global-number-digits = "+" *phonedigit DIGIT *phonedigit
local-number-digits  = *phonedigit-hex (HEXDIG / "*" / "#") *phonedigit-hex
domainname           = *( domainlabel "." ) toplabel [ "." ]
domainlabel          = alphanum
                       / alphanum *( alphanum / "-" ) alphanum
toplabel             = ALPHA / ALPHA *( alphanum / "-" ) alphanum
parameter            = ";" pname ["=" pvalue ]
pname                = 1*( alphanum / "-" )
pvalue               = 1*paramchar
paramchar            = param-unreserved / unreserved / pct-encoded
unreserved           = alphanum / mark
mark                 = "-" / "_" / "." / "!" / "~" / "*" /
                       "'" / "(" / ")"
pct-encoded          = "%" HEXDIG HEXDIG
param-unreserved     = "[" / "]" / "/" / ":" / "&" / "+" / "$"
phonedigit           = DIGIT / [ visual-separator ]
phonedigit-hex       = HEXDIG / "*" / "#" / [ visual-separator ]
visual-separator     = "-" / "." / "(" / ")"
alphanum             = ALPHA / DIGIT
reserved             = ";" / "/" / "?" / ":" / "@" / "&" /
                       "=" / "+" / "$" / ","
uric                 = reserved / unreserved / pct-encoded

Another example, a postal address:

postal-address   = name-part street zip-part

name-part        = *(personal-part SP) last-name [SP suffix] CRLF
name-part        =/ personal-part CRLF

personal-part    = first-name / (initial ".")
first-name       = *ALPHA
initial          = ALPHA
last-name        = *ALPHA
suffix           = ("Jr." / "Sr." / 1*("I" / "V" / "X"))

street           = [apt SP] house-num SP street-name CRLF
apt              = 1*4DIGIT
house-num        = 1*8(DIGIT / ALPHA)
street-name      = 1*VCHAR

zip-part         = town-name "," SP state 1*2SP zip-code CRLF
town-name        = 1*(ALPHA / SP)
state            = 2ALPHA
zip-code         = 5DIGIT ["-" 4DIGIT]
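
The number in front of an element, as in 5DIGIT or 1*4DIGIT, is ABNF's repetition prefix: n means exactly n, a*b means at least a and at most b, and a missing bound defaults to 0 or to unlimited. Here is a small Python sketch that maps such a prefix to a regex quantifier and uses it on two of the rules above. The function name repeat_to_regex is made up for illustration, and the sketch assumes the prefix is well formed.

```python
import re

def repeat_to_regex(prefix: str) -> str:
    """Translate an ABNF repetition prefix into a regex quantifier.

    "" means once, "5" -> {5}, "1*4" -> {1,4}, "1*" -> {1,}, "*" -> {0,}
    Assumption: the prefix is well formed (digits with at most one "*").
    """
    if prefix == "":
        return ""
    if "*" not in prefix:
        return "{" + prefix + "}"
    low, high = prefix.split("*")
    return "{" + (low or "0") + "," + high + "}"

DIGIT = "[0-9]"

# zip-code = 5DIGIT ["-" 4DIGIT]
zip_code = re.compile(
    DIGIT + repeat_to_regex("5") + "(?:-" + DIGIT + repeat_to_regex("4") + ")?"
)

# apt = 1*4DIGIT
apt = re.compile(DIGIT + repeat_to_regex("1*4"))

print(zip_code.pattern)                              # [0-9]{5}(?:-[0-9]{4})?
print(bool(zip_code.fullmatch("90210-1234")))        # True
print(bool(apt.fullmatch("12345")))                  # False: more than 4 digits
```

The square brackets in the zip-code rule are ABNF's optional group, which is why they turn into (?: ... )? in the regex.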

see also https://github.com/Engelberg/instaparse/blob/master/docs/ABNF.md, by Mark Engelberg.

[see Clojure Instaparse Parser Tutorial]
