Program representation GP evolves computer programs, traditionally represented in memory as
tree structures. Trees can be easily evaluated in a recursive manner. Every internal node has an operator function and every terminal node has an operand, making mathematical expressions easy to evolve and evaluate. Thus traditionally GP favors the use of
programming languages that naturally embody tree structures (for example,
Lisp; other
functional programming languages are also suitable). Non-tree representations have been suggested and successfully implemented, such as
linear genetic programming, which perhaps suits the more traditional
imperative languages. The commercial GP software
Discipulus uses automatic induction of binary machine code ("AIM") to achieve better performance.
μGP uses
directed multigraphs to generate programs that fully exploit the syntax of a given
assembly language.
Multi expression programming uses
three-address code for encoding solutions. Other program representations on which significant research and development have been conducted include programs for stack-based virtual machines, and sequences of integers that are mapped to arbitrary programming languages via grammars.
Cartesian genetic programming is another form of GP, which uses a graph representation instead of the usual tree based representation to encode computer programs. Most representations have structurally noneffective code (
introns). Such non-coding genes may seem to be useless because they have no effect on the performance of any one individual. However, they alter the probabilities of generating different offspring under the variation operators, and thus alter the individual's
variational properties. Experiments seem to show faster convergence when using program representations that allow such non-coding genes, compared to program representations that do not have any non-coding genes. Instantiations may have both trees with introns and those without; the latter are called canonical trees. Special canonical crossover operators are introduced that maintain the canonical structure of parents in their children.
Initialisation The methods for creation of the initial population include: •
Grow creates the individuals sequentially. Every GP tree is created starting from the root, creating functional nodes with children as well as terminal nodes up to a certain depth. •
Full is similar to the
Grow. The difference is that all brunches in a tree are of same predetermined depth. •
Ramped half-and-half creates a population consisting of md-1 parts and a maximum depth of md for its trees. The first part has a maximum depth of 2, second of 3 and so on up to the md-1-th part with maximum depth md. Half of every part is created by
Grow, while the other part is created by
Full.
Selection Selection is a process whereby certain individuals are selected from the current generation that would serve as parents for the next generation. The individuals are selected probabilistically such that the better performing individuals have a higher chance of getting selected. and others have been demonstrated to perform better for many GP problems. Elitism, which involves seeding the next generation with the best individual (or best
n individuals) from the current generation, is a technique sometimes employed to avoid regression.
Crossover In genetic programming two fit individuals are chosen from the population to be parents for one or two children. In tree genetic programming, these parents are represented as inverted lisp like trees, with their root nodes at the top. In subtree crossover in each parent a subtree is randomly chosen. (Highlighted with yellow in the animation.) In the root donating parent (in the animation on the left) the chosen subtree is removed and replaced with a copy of the randomly chosen subtree from the other parent, to give a new child tree. Sometimes two child crossover is used, in which case the removed subtree (in the animation on the left) is not simply deleted but is copied to a copy of the second parent (here on the right) replacing (in the copy) its randomly chosen subtree. Thus this type of subtree crossover takes two fit trees and generates two child trees. The tree-based approach in genetic programming also shares structural and procedural similarities with earlier knowledge-based and topology-oriented crossover methods. Specifically, analogous crossover and homologous crossover, both implemented in robot trajectory planning, exhibit a resemblance to subtree operations in tree GP. These crossover mechanisms were described in the context of heuristic optimisation strategies in robotics.
Replication Some individuals selected according to fitness criteria do not participate in crossover, but are copied into the next generation, akin to asexual reproduction in the natural world. They may be further subject to mutation.
Mutation There are many types of mutation in genetic programming. They start from a fit syntactically correct parent and aim to randomly create a syntactically correct child. In the animation a subtree is randomly chosen (highlighted by yellow). It is removed and replaced by a randomly generated subtree. Other mutation operators select a leaf (external node) of the tree and replace it with a randomly chosen leaf. Another mutation is to select at random a function (internal node) and replace it with another function with the same arity (number of inputs). Hoist mutation randomly chooses a subtree and replaces it with a subtree within itself. Thus hoist mutation is guaranteed to make the child smaller. Leaf and same arity function replacement ensure the child is the same size as the parent. Whereas subtree mutation (in the animation) may, depending upon the function and terminal sets, have a bias to either increase or decrease the tree size. Other subtree based mutations try to carefully control the size of the replacement subtree and thus the size of the child tree. Similarly there are many types of linear genetic programming mutation, each of which tries to ensure the mutated child is still syntactically correct. ==Applications==