A context-free grammar is defined by the 4-
tuple G = (V, \Sigma, R, S), where • is a finite set; each element v\in V is called
a nonterminal character or a
variable. Each variable represents a different type of phrase or clause in the sentence. Variables are also sometimes called syntactic categories. Each variable defines a sub-language of the language defined by . • is a finite set of
terminals, disjoint from , which make up the actual content of the sentence. The set of terminals is the alphabet of the language defined by the grammar . • is a finite
relation in V\times(V\cup\Sigma)^{*}, where the asterisk represents the
Kleene star operation. The members of are called the
(rewrite) rules or
productions of the grammar (also commonly symbolized by a ). • is the start variable (or start symbol), used to represent the whole sentence (or program). It must be an element of .
Production rule notation A
production rule in is formalized mathematically as a pair (\alpha, \beta)\in R, where \alpha \in V is a nonterminal and \beta \in (V\cup\Sigma)^{*} is a
string of variables and/or terminals; rather than using
ordered pair notation, production rules are usually written using an arrow operator with \alpha as its left hand side and as its right hand side: \alpha\rightarrow\beta. It is allowed for to be the
empty string, and in this case it is customary to denote it by . The form \alpha\rightarrow\varepsilon is called an ε-production. It is common to list all right-hand sides for the same left-hand side on the same line, using | (the
vertical bar) to separate them. Rules \alpha\rightarrow \beta_1 and \alpha\rightarrow\beta_2 can hence be written as \alpha\rightarrow\beta_1\mid\beta_2. In this case, \beta_1 and \beta_2 are called the first and second alternative, respectively.
Rule application For any strings u, v\in (V\cup\Sigma)^{*}, we say directly yields , written as u\Rightarrow v\,, if \exists (\alpha, \beta)\in R with \alpha \in V and u_{1}, u_{2}\in (V\cup\Sigma)^{*} such that u\,=u_{1}\alpha u_{2} and v\,=u_{1}\beta u_{2}. Thus, is a result of applying the rule (\alpha, \beta) to .
Repetitive rule application For any strings u, v\in (V\cup\Sigma)^{*}, we say
yields or is
derived from if there is a positive integer and strings u_{1}, \ldots, u_{k}\in (V\cup\Sigma)^{*} such that u = u_{1} \Rightarrow u_{2} \Rightarrow \cdots \Rightarrow u_{k} = v. This relation is denoted u ~\stackrel{*}{\Rightarrow}~ v, or u\Rightarrow\Rightarrow v in some textbooks. If k\geq 2, the relation u ~\stackrel{+}{\Rightarrow}~ v holds. In other words, (\stackrel{*}{\Rightarrow}) and (\stackrel{+}{\Rightarrow}) are the
reflexive transitive closure (allowing a string to yield itself) and the
transitive closure (requiring at least one step) of (\Rightarrow), respectively.
Context-free language The language of a grammar G = (V, \Sigma, R, S) is the set : L(G) = \{ w\in\Sigma^{*} : S ~\stackrel{*}{\Rightarrow}~ w\} of all terminal-symbol strings derivable from the start symbol. A language is said to be a context-free language (CFL), if there exists a CFG , such that L=L(G).
Non-deterministic pushdown automata recognize exactly the context-free languages. == Examples ==