The type system can be formally described by
syntax rules that fix a language for the expressions, types, etc. The presentation here of such a syntax is not too formal, in that it is written down not to study the
surface grammar, but rather the
depth grammar, and leaves some syntactical details open. This form of presentation is usual. Building on this,
typing rules are used to define how expressions and types are related. As before, the form used is a bit liberal.
Syntax The expressions to be typed are exactly those of the
lambda calculus extended with a let-expression as shown in the adjacent table. Parentheses can be used to disambiguate an expression. The application is left-binding and binds stronger than abstraction or the let-in construct. Types are syntactically split into two groups, monotypes and polytypes.
Monotypes Monotypes always designate a particular type. Monotypes \tau are syntactically represented as
terms. Examples of monotypes include
type constants like \mathtt{int} or \mathtt{string}, and parametric types like \mathtt{Map\ (Set\ string)\ int}. The latter types are examples of
applications of
type functions, for example, from the set \{ \mathtt{Map^2,\ Set^1,\ string^0,\ int^0},\ \rightarrow^2 \} , where the superscript indicates the number of type parameters. The complete set of type functions C is arbitrary in HM, except that it
must contain at least \rightarrow^2, the type of functions. It is often written in
infix notation for convenience. For example, a function mapping integers to strings has type \mathtt{int}\rightarrow \mathtt{string}. Again, parentheses can be used to disambiguate a type expression. The application binds stronger than the infix arrow, which is right-binding. Type variables are admitted as monotypes. Monotypes are not to be confused with monomorphic types, which exclude variables and allow only ground terms. Two monotypes are equal if they have identical terms.
Polytypes Polytypes (or
type schemes) are types containing variables bound by zero or more for-all quantifiers, e.g. \forall\alpha.\alpha\rightarrow\alpha. A function with polytype \forall\alpha.\alpha\rightarrow\alpha can map
any value of the same type to itself, and the
identity function is a value for this type. As another example, \forall\alpha.(\mathtt{Set}\ \alpha)\rightarrow \mathtt{int} is the type of a function mapping all finite sets to integers. A function which returns the
cardinality of a set would be a value of this type. Quantifiers can only appear top level. For instance, a type \forall\alpha.\alpha\rightarrow\forall\alpha.\alpha is excluded by the syntax of types. Also monotypes are included in the polytypes, thus a type has the general form \forall\alpha_1\dots\forall\alpha_n.\tau, where n\geq0 and \tau is a monotype. Equality of polytypes is up to reordering the quantification and renaming the quantified variables (\alpha-conversion). Further, quantified variables not occurring in the monotype can be dropped.
Context and typing To meaningfully bring together the still disjoint parts (syntax expressions and types) a third part is needed: context. Syntactically, a context is a list of pairs x:\sigma, called
assignments,
assumptions or
bindings, each pair stating that value variable x_ihas type \sigma_i. All three parts combined give a
typing judgment of the form \Gamma\ \vdash\ e:\sigma, stating that under assumptions \Gamma, the expression e has type \sigma.
Free type variables In a type \forall\alpha_1\dots\forall\alpha_n.\tau, the symbol \forall is the quantifier binding the type variables \alpha_i in the monotype \tau. The variables \alpha_i are called
quantified and any occurrence of a quantified
type variable in \tau is called
bound and all unbound type variables in \tau are called
free. Additionally to the quantification \forall in polytypes, type variables can also be bound by occurring in the context, but with the inverse effect on the right hand side of the \vdash. Such variables then behave like type constants there. Finally, a type variable may legally occur unbound in a typing, in which case they are implicitly all-quantified. The presence of both bound and unbound type variables is a bit uncommon in programming languages. Often, all type variables are implicitly treated all-quantified. For instance, one does not have clauses with free variables in
Prolog. Likewise in Haskell, where all type variables implicitly occur quantified, i.e. a Haskell type a -> a means \forall\alpha.\alpha\rightarrow\alpha here. Related and also very uncommon is the binding effect of the right hand side \sigma of the assignments. Typically, the mixture of both bound and unbound type variables originate from the use of free variables in an expression. The
constant function K = \lambda x.\lambda y. x provides an example. It has the monotype \alpha\rightarrow\beta\rightarrow\alpha. One can force polymorphism by \mathbf{let}\ k = \lambda x. (\mathbf{let}\ f = \lambda y.x\ \mathbf{in}\ f)\ \mathbf{in}\ k. Herein, f has the type \forall \gamma.\gamma\rightarrow\alpha. The free monotype variable \alpha originates from the type of the variable x bound in the surrounding scope. k has the type \forall\alpha\forall\beta.\alpha\rightarrow\beta\rightarrow\alpha. One could imagine the free type variable \alpha in the type of f be bound by the \forall\alpha in the type of k. But such a scoping cannot be expressed in HM. Rather, the binding is realized by the context.
Type order Polymorphism means that one and the same expression can have (perhaps infinitely) many types. But in this type system, these types are not completely unrelated, but rather orchestrated by the parametric polymorphism. As an example, the identity \lambda x . x can have \forall \alpha . \alpha \rightarrow \alpha as its type as well as \texttt{string} \rightarrow \texttt{string} or \texttt{int} \rightarrow \texttt{int} and many others, but not \texttt{int} \rightarrow \texttt{string}. The most general type for this function is \forall \alpha . \alpha\rightarrow \alpha, while the others are more specific and can be derived from the general one by consistently replacing another type for the
type parameter, i.e. the quantified variable \alpha. The counter-example fails because the replacement is not consistent. The consistent replacement can be made formal by applying a
substitution S = \left\{\ a_i \mapsto \tau_i,\ \dots\ \right\} to the term of a type \tau, written S\tau. As the example suggests, substitution is not only strongly related to an order, that expresses that a type is more or less special, but also with the all-quantification which allows the substitution to be applied. Formally, in HM, a type \sigma' is more general than \sigma, formally \sigma' \sqsubseteq \sigma, if some quantified variable in \sigma' is consistently substituted such that one gains \sigma as shown in the side bar. This order is part of the type definition of the type system. In our previous example, applying the substitution S = \left\{\alpha \mapsto \texttt{string} \right\} would result in \forall \alpha . \alpha \rightarrow \alpha \sqsubseteq \texttt{string} \rightarrow \texttt{string}. While substituting a monomorphic (ground) type for a quantified variable is straight forward, substituting a polytype has some pitfalls caused by the presence of free variables. Most particularly, unbound variables must not be replaced. They are treated as constants here. Additionally, quantifications can only occur top-level. Substituting a parametric type, one has to lift its quantifiers. The table on the right makes the rule precise. Alternatively, consider an equivalent notation for the polytypes without quantifiers in which quantified variables are represented by a different set of symbols. In such a notation, the specialization reduces to plain consistent replacement of such variables. The relation \sqsubseteq is a
partial order and \forall \alpha . \alpha is its smallest element.
Principal type While specialization of a type scheme is one use of the order, it plays a crucial second role in the type system. Type inference with polymorphism faces the challenge of summarizing all possible types an expression may have. The order guarantees that such a summary exists as the most general type of the expression.
Substitution in typings The type order defined above can be extended to typings because the implied all-quantification of typings enables consistent replacement: : \Gamma \vdash e : \sigma \quad\Longrightarrow\quad S\Gamma \vdash e : S\sigma Contrary to the specialisation rule, this is not part of the definition, but like the implicit all-quantification rather a consequence of the type rules defined next. Free type variables in a typing serve as placeholders for possible refinement. The binding effect of the environment to free type variables on the right hand side of \vdash that prohibits their substitution in the specialisation rule is again that a replacement has to be consistent and would need to include the whole typing. This article will discuss four different rule sets: :# \vdash_D declarative system :# \vdash_S syntactical system :# \vdash_J algorithm J :# \vdash_W algorithm W
Deductive system The syntax of HM is carried forward to the syntax of the
inference rules that form the body of the
formal system, by using the typings as
judgments. Each of the rules define what conclusion could be drawn from what premises. Additionally to the judgments, some extra conditions introduced above might be used as premises, too. A proof using the rules is a sequence of judgments such that all premises are listed before a conclusion. The examples below show a possible format of proofs. From left to right, each line shows the conclusion, the [\mathtt{Name}] of the rule applied and the premises, either by referring to an earlier line (number) if the premise is a judgment or by making the predicate explicit.
Typing rules :
See also Typing rules The side box shows the deduction rules of the HM type system. One can roughly divide the rules into two groups: The first four rules [\mathtt{Var}] (variable or function access), [\mathtt{App}] (
application, i.e. function call with one parameter), [\mathtt{Abs}] (
abstraction, i.e. function declaration) and [\mathtt{Let}] (variable declaration) are centered around the syntax, presenting one rule for each of the expression forms. Their meaning is obvious at the first glance, as they decompose each expression, prove their sub-expressions and finally combine the individual types found in the premises to the type in the conclusion. The second group is formed by the remaining two rules [\mathtt{Inst}] and [\mathtt{Gen}]. They handle specialization and generalization of types. While the rule [\mathtt{Inst}] should be clear from the section on specialization
above, [\mathtt{Gen}] complements the former, working in the opposite direction. It allows generalization, i.e. to quantify monotype variables not bound in the context. The following two examples exercise the rule system in action. Since both the expression and the type are given, they are a type-checking use of the rules.
Example: A proof for \Gamma \vdash_D id(n):int where \Gamma = id:\forall \alpha . \alpha\rightarrow\alpha,\ n:int, could be written : \begin{array}{llll} 1:&\Gamma \vdash_D id : \forall\alpha.\alpha \rightarrow \alpha &[\mathtt{Var}]& (id : \forall\alpha.\alpha \rightarrow \alpha \in \Gamma) \\ 2:&\Gamma \vdash_D id : int \rightarrow int & [\mathtt{Inst}]&(1),\ (\forall\alpha.\alpha \rightarrow \alpha \sqsubseteq int\rightarrow int)\\ 3:&\Gamma \vdash_D n : int&[\mathtt{Var}]&(n : int \in \Gamma)\\ 4:&\Gamma \vdash_D id(n) : int&[\mathtt{App}]& (2),\ (3)\\ \end{array}
Example: To demonstrate generalization, \vdash_D\ \textbf{let}\, id = \lambda x . x\ \textbf{in}\ id\, :\, \forall\alpha.\alpha\rightarrow\alpha is shown below: : \begin{array}{llll} 1: & x:\alpha \vdash_D x : \alpha & [\mathtt{Var}] & (x:\alpha \in \left\{x:\alpha\right\})\\ 2: & \vdash_D \lambda x.x : \alpha\rightarrow\alpha & [\mathtt{Abs}] & (1)\\ 3: & id:\alpha\rightarrow\alpha \vdash_D id : \alpha\rightarrow\alpha & [\mathtt{Var}] & (id:\alpha\rightarrow\alpha \in \left\{id : \alpha\rightarrow\alpha\right\})\\ 4: & \vdash_D \textbf{let}\, id = \lambda x . x\ \textbf{in}\ id\, :\,\alpha\rightarrow\alpha & [\mathtt{Let}] & (2),\ (3)\\ 5: & \vdash_D \textbf{let}\, id = \lambda x . x\ \textbf{in}\ id\, :\,\forall\alpha.\alpha\rightarrow\alpha & [\mathtt{Gen}] & (4),\ (\alpha \not\in free(\epsilon))\\ \end{array}
Let-polymorphism Not visible immediately, the rule set encodes a regulation under which circumstances a type might be generalized or not by a slightly varying use of mono- and polytypes in the rules [\mathtt{Abs}] and [\mathtt{Let}]. Remember that \sigma and \tau denote poly- and monotypes respectively. In rule [\mathtt{Abs}], the value variable of the parameter of the function \lambda x.e is added to the context with a monomorphic type through the premise \Gamma,\ x:\tau \vdash_D e:\tau', while in the rule [\mathtt{Let}], the variable enters the environment in polymorphic form \Gamma,\ x:\sigma \vdash_D e_1:\tau. Though in both cases the presence of x in the context prevents the use of the generalisation rule for any free variable in the assignment, this regulation forces the type of parameter x in a \lambda-expression to remain monomorphic, while in a let-expression, the variable could be introduced polymorphic, making specializations possible. As a consequence of this regulation, \lambda f.(f\, \textrm{true}, f\, \textrm{0}) cannot be typed, since the parameter f is in a monomorphic position, while \textbf{let}\ f = \lambda x . x\, \textbf{in}\, (f\, \textrm{true}, f\, \textrm{0}) has type (bool, int), because f has been introduced in a let-expression and is treated polymorphic therefore.
Generalization rule The generalisation rule is also worth a closer look. Here, the all-quantification implicit in the premise \Gamma \vdash_D e : \sigma is simply moved to the right hand side of \vdash_D in the conclusion, bound by an explicit universal quantifier. This is possible, since \alpha does not occur free in the context. Again, while this makes the generalization rule plausible, it is not really a consequence. On the contrary, the generalization rule is part of the definition of HM's type system and the implicit all-quantification a consequence. == An inference algorithm ==