The earliest Schorre metacompilers, META I and META II, were developed by D. Val Schorre at UCLA. Other Schorre based metacompilers followed. Each adding improvements to language analysis and/or code generation. In programming it is common to use the programming language name to refer to both the compiler and the programming language, the context distinguishing the meaning. A C++ program is compiled using a C++ compiler. That also applies in the following. For example, META II is both the compiler and the language. The metalanguages in the Schorre line of metacompilers are functional programming languages that use top down grammar analyzing syntax equations having embedded output transformation constructs. A syntax equation: = ; is a compiled
test function returning
success or
failure. is the function name. is a form of logical expression consisting of tests that may be grouped, have alternates, and output productions. A
test is like a
bool in other languages,
success being
true and
failure being
false. Defining a programming language analytically top down is natural. For example, a program could be defined as: program = $declaration; Defining a program as a sequence of zero or more declaration(s). In the Schorre META
X languages there is a driving rule. The program rule above is an example of a driving rule. The program rule is a
test function that calls declaration, a
test rule, that returns
success or
failure. The $ loop operator repeatedly calling declaration until
failure is returned. The $ operator is always successful, even when there are zero declaration. Above program would always return success. (In CWIC a long fail can bypass declaration. A long-fail is part of the backtracking system of CWIC) The character sets of these early compilers were limited. The character
/ was used for the alternant (or) operator. "A or B" is written as A / B. Parentheses ( ) are used for grouping. A (B / C) Describes a construct of A followed by B or C. As a Boolean expression it would be A
and (B
or C) A sequence X Y has an implied X
and Y meaning.
( ) are grouping and
/ the
or operator. The order of evaluation is always left to right as an input character sequence is being specified by the ordering of the tests. Special operator words whose first character is a "." are used for clarity. .EMPTY is used as the last alternate when no previous alternant need be present. X (A / B / .EMPTY) Indicates that X is optionally followed by A
or B. This is a specific characteristic of these metalanguages being programming languages. Backtracking is avoided by the above. Other compiler constructor systems may have declared the three possible sequences and left it up to the parser to figure it out. The characteristics of the metaprogramming metalanguages above are common to all Schorre metacompilers and those derived from them.
META I META I was a hand compiled metacompiler used to compile META II. Little else is known of META I except that the initial compilation of META II produced nearly identical code to that of the hand coded META I compiler.
META II Each rule consists optionally of tests, operators, and output productions. A rule attempts to match some part of the input program source character stream returning success or failure. On success the input is advanced over matched characters. On failure the input is not advanced. Output productions produced a form of assembly code directly from a syntax rule.
META III META III is an evolution of META II, developed by Frederik W Schneider and Glen D Johnson. Matched identifiers, strings, digits, etc are pushed into a operand stack, and new operators are added to manipulate it as needed. An explicit FIFO queue, when used together with the operand stack, enables intricate manipulations of the operands, such as conversion from depth-first to breath-first. There is a explicit symbol table, together with operators to edit and query it. Finally, there is a concept of register tracking indented to facilitate generation of optimal register manipulation sequences.
TREE-META TREE-META introduced tree building operators
: and
[] moving the output production transforms to unparsed rules. The tree building operators were used in the grammar rules directly transforming the input into an
abstract syntax tree. Unparse rules are also test functions that matched tree patterns. Unparse rules are called from a grammar rule when an abstract syntax tree is to be transformed into output code. The building of an abstract syntax tree and unparse rules allowed local optimizations to be performed by analyzing the parse tree. Moving of output productions to the unparse rules made a clear separation of grammar analysis and code production. This made the programming easier to read and understand.
CWIC In 1968–1970, Erwin Book, Dewey Val Schorre, and Steven J. Sherman developed CWIC. (Compiler for Writing and Implementing Compilers) at
System Development Corporation Charles Babbage Institute Center for the History of Information Technology (Box 12, folder 21), CWIC is a compiler development system composed of three special-purpose, domain specific, languages, each intended to permit the description of certain aspects of translation in a straight forward manner. The syntax language is used to describe the recognition of source text and the construction from it to an intermediate
tree structure. The generator language is used to describe the transformation of the tree into appropriate object language. The syntax language follows Dewey Val Schorre's previous line of metacompilers. It most resembles TREE-META having
tree building operators in the syntax language. The unparse rules of TREE-META are extended to work with the object based generator language based on
LISP 2. CWIC includes three languages: •
Syntax: Transforms the source program input, into list structures using grammar transformation formula. A parsed expression structure is passed to a generator by placement of a generator call in a rule. A
tree is represented by a list whose first element is a node object. The language has operators, '
, specifically for making lists. The colon : operator is used to create node objects. :ADD creates an ADD node. The exclamation !' operator combines a number of parsed entries with a node to make a tree. Trees created by syntax rules are passed to generator functions, returning success or failure. The syntax language is very close to TREE-META. Both use a colon to create a node. CWIC's tree building exclamation ! functions the same as TREE-META's []. •
Generator: a named series of transforming rules, each consisting of an unparse, pattern matching, rule. and an output production written in a LISP 2 like language. the translation was to IBM 360 binary machine code. Other facilities of the generator language generalized output. •
MOL-360: an independent
mid level implementation language for the IBM System/360 family of computers developed in 1968 and used for writing the underlying support library.
Generators language Generators Language had semantics similar to
Lisp. The parse
tree was thought of as a recursive list. The general form of a Generator Language function is: function-name(first-unparse_rule) => first-production_code_generator (second-unparse_rule) => second-production_code_generator (third-unparse_rule) => third-production_code_generator ... The code to process a given
tree included the features of a general purpose programming language, plus a form: <stuff>, which would emit (stuff) onto the output file. A generator call may be used in the unparse_rule. The generator is passed the element of unparse_rule pattern in which it is placed and its return values are listed in (). For example: expr_gen(ADD[expr_gen(x),expr_gen(y)]) => <AR + (x*16)+y;> releasereg(y); return x; (SUB[expr_gen(x),expr_gen(y)])=> <SR + (x*16)+y;> releasereg(y); return x; (MUL[expr_gen(x),expr_gen(y)])=> . . . (x)=> r1 = getreg(); load(r1, x); return r1; ... That is, if the parse
tree looks like (ADD[,]), expr_gen(x) would be called with and return x. A variable in the unparse rule is a local variable that can be used in the production_code_generator. expr_gen(y) is called with and returns y. Here is a generator call in an unparse rule is passed the element in the position it occupies. Hopefully in the above x and y will be registers on return. The last transforms is intended to load an atomic into a register and return the register. The first production would be used to generate the 360 "AR" (Add Register) instruction with the appropriate values in general registers. The above example is only a part of a generator. Every generator expression evaluates to a value that con then be further processed. The last transform could just as well have been written as: (x)=> return load(getreg(), x); In this case load returns its first parameter, the register returned by getreg(). the functions load and getreg are other CWIC generators. ==== CWIC addressed
domain-specific languages before the term
domain-specific language existed ==== From the authors of CWIC: "A metacompiler assists the task of compiler-building by automating its non creative aspects, those aspects that are the same regardless of the language which the produced compiler is to translate. This makes possible the design of languages which are appropriate to the specification of a particular problem. It reduces the cost of producing processors for such languages to a point where it becomes economically feasible to begin the solution of a problem with language design." ==Examples==