Data representation As
numerical software is highly efficient for approximate
numerical computation, it is common, in computer algebra, to emphasize
exact computation with exactly represented data. Such an exact representation implies that, even when the size of the output is small, the intermediate data generated during a computation may grow in an unpredictable way. This behavior is called
expression swell. To alleviate this problem, various methods are used in the representation of the data, as well as in the algorithms that manipulate them.
Numbers The usual number systems used in
numerical computation are
floating point numbers and
integers of a fixed, bounded size. Neither of these is convenient for computer algebra, due to expression swell. Therefore, the basic numbers used in computer algebra are the integers of the mathematicians, commonly represented by an unbounded signed sequence of
digits in some
base of numeration, usually the largest base allowed by the
machine word. These integers allow one to define the
rational numbers, which are
irreducible fractions of two integers. Programming an efficient implementation of the arithmetic operations is a hard task. Therefore, most free
computer algebra systems, and some commercial ones such as
Mathematica and
Maple, use the
GMP library, which is thus a
de facto standard.
Expressions tree, from a 1985 Master's Thesis Except for
numbers and
variables, every
mathematical expression may be viewed as the symbol of an operator followed by a
sequence of operands. In computer-algebra software, the expressions are usually represented in this way. This representation is very flexible, and many things that seem not to be mathematical expressions at first glance, may be represented and manipulated as such. For example, an
equation may be considered as an expression with "=" as its leading operator, and a matrix may be represented as an expression with "matrix" as an operator and its rows as operands. Even programs may be considered and represented as expressions with operator "procedure" and, at least, two operands, the list of parameters and the body, which is itself an expression with "body" as an operator and a sequence of instructions as operands. Conversely, any mathematical expression may be viewed as a program. For example, the expression may be viewed as a program for the addition, with and as parameters. Executing this program consists of
evaluating the expression for given values of and ; if they are not given any values, then the result of the evaluation is simply its input. This process of delayed evaluation is fundamental in computer algebra. For example, the operator "=" of equation is also, in most computer algebra systems, the name of the program of the equality test: normally, the evaluation of an equation results in an equation, but, when an equality test is needed, either explicitly asked by the user through an "evaluation to a Boolean" command, or automatically started by the system in the case of a test inside a program, then the evaluation to a Boolean result is executed. As the size of the operands of an expression is unpredictable and may change during a working session, the sequence of the operands is usually represented as a sequence of either
pointers (like in
Macsyma) or entries in a
hash table (like in
Maple).
Simplification The raw application of the basic rules of
differentiation with respect to on the expression gives the result : x\cdot a^{x-1}\cdot 0 + a^x\cdot \left (1\cdot \log a + x\cdot \frac{0}{a} \right). A simpler expression than this is generally desired, and simplification is needed when working with general expressions. This simplification is normally done through
rewriting rules. There are several classes of rewriting rules to be considered. The simplest are rules that always reduce the size of the expression, like or . They are systematically applied in computer algebra systems. A difficulty occurs with
associative operations like addition and multiplication. The standard way to deal with associativity is to consider that addition and multiplication have an arbitrary number of operands; that is, that is represented as . Thus and are both simplified to , which is displayed . In the case of expressions such as , the simplest way is to systematically rewrite , , as, respectively, , , . In other words, in the internal representation of the expressions, there is no subtraction nor division nor unary minus, outside the representation of the numbers. Another difficulty occurs with the
commutativity of addition and multiplication. The problem is to quickly recognize the
like terms in order to combine or cancel them. Testing every pair of terms is costly with very long sums and products. To address this,
Macsyma sorts the operands of sums and products into an order that places like terms in consecutive places, allowing easy detection. In
Maple, a
hash function is designed for generating collisions when like terms are entered, allowing them to be combined as soon as they are introduced. This allows subexpressions that appear several times in a computation to be immediately recognized and stored only once. This saves memory and speeds up computation by avoiding repetition of the same operations on identical expressions. Some rewriting rules sometimes increase and sometimes decrease the size of the expressions to which they are applied. This is the case for the
distributive law or
trigonometric identities. For example, the distributive law allows rewriting (x+1)^4 \rightarrow x^4+4x^3+6x^2+4x+1 and (x-1)(x^4+x^3+x^2+x+1) \rightarrow x^5-1. As there is no way to make a good general choice of applying or not such a rewriting rule, such rewriting is done only when explicitly invoked by the user. For the distributive law, the computer function that applies this rewriting rule is typically called "expand". The reverse rewriting rule, called "factor", requires a non-trivial algorithm, which is thus a key function in computer algebra systems (see
Polynomial factorization). == Mathematical aspects ==