==Modern definition by Hopcroft and Ullman==

In contemporary publications following Hopcroft and Ullman (1979), an indexed grammar is formally defined as a 5-tuple G = ⟨N, T, F, P, S⟩ where
• N is a set of variables or nonterminal symbols,
• T is a set ("alphabet") of terminal symbols,
• F is a set of so-called index symbols, or indices,
• S ∈ N is the start symbol, and
• P is a finite set of productions.

In productions as well as in derivations of indexed grammars, a string ("stack") σ ∈ F* of index symbols is attached to every nonterminal symbol A ∈ N, denoted by A[σ]. Terminal symbols may not be followed by index stacks. For an index stack σ ∈ F* and a string α ∈ (N ∪ T)* of nonterminal and terminal symbols, α[σ] denotes the result of attaching [σ] to every nonterminal in α; for example, if α equals a B C d E with a, d ∈ T terminal symbols and B, C, E ∈ N nonterminal symbols, then α[σ] denotes a B[σ] C[σ] d E[σ]. Using this notation, each production in P has to be of one of the forms
• A[σ] → α[σ],
• A[σ] → B[fσ], or
• A[fσ] → α[σ],
where
A, B ∈ N are nonterminal symbols, f ∈ F is an index, σ ∈ F* is a string of index symbols, and α ∈ (N ∪ T)* is a string of nonterminal and terminal symbols. Some authors write ".." instead of "σ" for the index stack in production rules; the rules of type 1, 2, and 3 then read A[..] → α[..], A[..] → B[f..], and A[f..] → α[..], respectively.

Derivations are similar to those in a
context-free grammar except for the index stack attached to each nonterminal symbol. When a production such as A[σ] → B[σ] C[σ] is applied, the index stack of A is copied to both B and C. Moreover, a rule can push an index symbol onto the stack, or pop its "topmost" (i.e., leftmost) index symbol. Formally, the relation ⇒ ("direct derivation") is defined on the set (N[F*] ∪ T)* of "sentential forms" as follows:
• If A[σ] → α[σ] is a production of type 1, then β A[φ] γ ⇒ β α[φ] γ, using the above definition. That is, the rule's left-hand side's index stack φ is copied to each nonterminal of the right-hand side.
• If A[σ] → B[fσ] is a production of type 2, then β A[φ] γ ⇒ β B[fφ] γ. That is, the right-hand side's index stack is obtained from the left-hand side's stack φ by pushing f onto it.
• If A[fσ] → α[σ] is a production of type 3, then β A[fφ] γ ⇒ β α[φ] γ, using again the definition of α[σ]. That is, the first index f is popped from the left-hand side's stack, which is then distributed to each nonterminal of the right-hand side.
As usual, the derivation relation ⇒* is defined as the reflexive transitive closure of the direct derivation ⇒. The language L(G) = { w ∈ T* : S ⇒* w } is the set of all strings of terminal symbols derivable from the start symbol.
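The copy, push, and pop behaviors of the three production types can be traced mechanically. The following is a minimal sketch, not taken from Hopcroft and Ullman; the encoding (a sentential form as a list in which each nonterminal carries its stack as a tuple, leftmost element on top) and all function names are invented for illustration.

```python
# Sketch of the three derivation steps of an indexed grammar
# (hypothetical encoding; not from the cited sources).
# A sentential form is a list: a terminal is a plain string, a
# nonterminal is a pair (name, stack) with the stack a tuple of
# indices whose leftmost element is the "topmost" symbol.

def attach(alpha, stack, nonterminals):
    """Compute alpha[stack]: attach the stack to every nonterminal in alpha."""
    return [(s, stack) if s in nonterminals else s for s in alpha]

def apply_type1(form, i, alpha, nonterminals):
    """A[sigma] -> alpha[sigma]: copy A's stack to every nonterminal of alpha."""
    _, stack = form[i]
    return form[:i] + attach(alpha, stack, nonterminals) + form[i+1:]

def apply_type2(form, i, b, f):
    """A[sigma] -> B[f sigma]: push index f onto the stack."""
    _, stack = form[i]
    return form[:i] + [(b, (f,) + stack)] + form[i+1:]

def apply_type3(form, i, f, alpha, nonterminals):
    """A[f sigma] -> alpha[sigma]: pop f, distribute the remaining stack."""
    _, stack = form[i]
    assert stack and stack[0] == f, "rule applies only if f is topmost"
    return form[:i] + attach(alpha, stack[1:], nonterminals) + form[i+1:]

N = {"A", "B", "C"}
form = [("A", ("f", "g"))]                       # A[fg]
form = apply_type1(form, 0, ["B", "a", "C"], N)  # B[fg] a C[fg]
form = apply_type2(form, 0, "B", "h")            # B[hfg] a C[fg]
form = apply_type3(form, 2, "f", ["c", "C"], N)  # B[hfg] a c C[g]
print(form)
```

Note how type 1 copies the whole stack to each new nonterminal, while types 2 and 3 only ever touch the topmost index.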
==Original definition by Aho==

Historically, the concept of indexed grammars was first introduced by Alfred Aho (1968) using a different formalism. Aho defined an indexed grammar to be a 5-tuple (N, T, F, P, S) where
• N is a finite alphabet of variables or nonterminal symbols,
• T is a finite alphabet of terminal symbols,
• F ⊆ 2^(N × (N ∪ T)*) is the finite set of so-called flags (each flag is itself a set of so-called index productions),
• P ⊆ N × (NF* ∪ T)* is the finite set of productions, and
• S ∈ N is the start symbol.
Direct derivations were as follows:
• A production
p = (A → X1η1...Xkηk) from P matches a nonterminal A ∈ N followed by its (possibly empty) string of flags ζ ∈ F*. In context, γ Aζ δ, via p, derives to γ X1θ1...Xkθk δ, where θi = ηiζ if Xi was a nonterminal and the empty word otherwise. The old flags of A are therefore copied to each new nonterminal produced by p. Each such production can be simulated by appropriate productions of type 1 and 2 in the Hopcroft/Ullman formalism.
• An index production
p = (A → X1...Xk) ∈ f matches Afζ (the flag f it comes from must match the first symbol following the nonterminal A) and copies the remaining index string ζ to each new nonterminal: γ Afζ δ derives to γ X1θ1...Xkθk δ, where θi is the empty word when Xi is a terminal and ζ when it is a nonterminal. Each such production corresponds to a production of type 3 in the Hopcroft/Ullman formalism.

This formalism is used e.g. by Hayashi (1973, pp. 65–66).

==Examples==
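A standard illustration of the modern formalism is a grammar for { aⁿbⁿcⁿ : n ≥ 1 }, where the index stack counts n. The sketch below traces one derivation in such a grammar; the concrete productions listed in the comments and all names in the code are an illustrative reconstruction, not taken from the sources above.

```python
# Sketch: deriving a^n b^n c^n in an indexed grammar (illustrative
# reconstruction; productions assumed, textbook-style):
#   S[σ] -> T[gσ]       T[σ] -> T[fσ]      T[σ] -> A[σ] B[σ] C[σ]
#   A[fσ] -> a A[σ]     B[fσ] -> b B[σ]    C[fσ] -> c C[σ]
#   A[gσ] -> a          B[gσ] -> b         C[gσ] -> c
# Index stacks are strings; the leftmost character is the top.

def derive(n):
    """Follow one derivation of a^n b^n c^n step by step."""
    stack = "g"                 # S => T[g]        (type 2: push g)
    for _ in range(n - 1):
        stack = "f" + stack     # T[σ] => T[fσ]    (type 2: push f, n-1 times)
    # T[σ] => A[σ] B[σ] C[σ]: type 1 copies the stack to all three.
    word = []
    for terminal in "abc":      # expand A, then B, then C in turn
        s = stack
        while s[0] == "f":      # X[fσ] => x X[σ]  (type 3: pop f, emit x)
            word.append(terminal)
            s = s[1:]
        word.append(terminal)   # X[gσ] => x       (type 3: pop g, finish)
    return "".join(word)

print(derive(2))  # aabbcc
```

Because the stack built while rewriting T is copied unchanged to A, B, and C, all three are forced to emit the same number of terminals, which is exactly what a context-free grammar cannot enforce.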