:''This article's examples are written in Common Lisp (though most are also valid in Scheme).''
=== Symbolic expressions (S-expressions) ===
Lisp is an expression-oriented language. Unlike most other languages, no distinction is made between "expressions" and "statements"; all code and data are written as expressions. When an expression is evaluated, it produces a value (possibly multiple values), which can then be embedded into other expressions. Each value can be any data type.

McCarthy's 1958 paper introduced two types of syntax: symbolic expressions (S-expressions, sexps), which mirror the internal representation of code and data; and meta expressions (M-expressions), which express functions of S-expressions. M-expressions never found favor, and almost all Lisps today use S-expressions to manipulate both code and data.

The use of parentheses is Lisp's most immediately obvious difference from other programming language families. As a result, students have long given Lisp nicknames such as Lost In Stupid Parentheses or Lots of Irritating Superfluous Parentheses. However, the S-expression syntax is also responsible for much of Lisp's power: the syntax is simple and consistent, which facilitates manipulation by computer. Nor is the syntax limited to traditional parentheses notation; it can be extended to include alternative notations. For example, XMLisp is a Common Lisp extension that employs the metaobject protocol to integrate S-expressions with the Extensible Markup Language (XML).

The reliance on expressions gives the language great flexibility. Because Lisp functions are written as lists, they can be processed exactly like data. This allows easy writing of programs which manipulate other programs (metaprogramming). Many Lisp dialects exploit this feature using macro systems, which enable extension of the language almost without limit.
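The claim that code can be processed exactly like data can be made concrete with a minimal sketch. It assumes a Common Lisp system; the variable names *form* and *product-form* are illustrative and do not come from the text above.

```lisp
;; A quoted list is plain data: it can be taken apart like any other list.
(defparameter *form* '(+ 1 2 3 4))

(first *form*)                  ; → +          (a symbol)
(rest *form*)                   ; → (1 2 3 4)  (a list of numbers)

;; Build a new program from the old one by swapping the operator.
(defparameter *product-form* (cons '* (rest *form*)))

;; EVAL runs either list as code.
(eval *form*)                   ; → 10
(eval *product-form*)           ; → 24
```

Macros, discussed later in this section, apply the same kind of list manipulation before the code is compiled rather than at run time.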
=== Lists ===
A Lisp list is written with its elements separated by whitespace and surrounded by parentheses. For example, (1 2 foo) is a list whose elements are the three atoms 1, 2, and foo. These values are implicitly typed: they are respectively two integers and a Lisp-specific data type called a "symbol", and do not have to be declared as such.

The empty list () is also represented as the special atom nil. This is the only entity in Lisp which is both an atom and a list.

Expressions are written as lists, using prefix notation. The first element in the list is the name of a function, the name of a macro, a lambda expression or the name of a "special operator" (see below). The remaining elements are the arguments. For example, the function list returns its arguments as a list, so the expression (list 1 2 (quote foo)) evaluates to the list (1 2 foo). The "quote" before the foo in the preceding example is a "special operator" which returns its argument without evaluating it. Any unquoted expressions are recursively evaluated before the enclosing expression is evaluated. For example, (list 1 2 (list 3 4)) evaluates to the list (1 2 (3 4)). The third argument is a list; lists can be nested.
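The evaluation rules for lists described above can be tried directly. This is a small sketch assuming a standard Common Lisp reader, which upcases symbol names, so foo is printed as FOO.

```lisp
;; LIST evaluates its arguments, then collects them into a fresh list.
(list 1 2 (quote foo))        ; → (1 2 FOO)
(list 1 2 'foo)               ; same result; 'foo abbreviates (quote foo)

;; Unquoted subexpressions are evaluated first, so lists nest naturally.
(list 1 2 (list 3 4))         ; → (1 2 (3 4))

;; The empty list and the atom NIL are the same object,
;; which is both a list and an atom.
(eq '() nil)                  ; → T
(listp nil)                   ; → T
(atom nil)                    ; → T
```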
=== Operators ===
Arithmetic operators are treated similarly. The expression (+ 1 2 3 4) evaluates to 10. The equivalent under infix notation would be "1 + 2 + 3 + 4".

Lisp has no notion of operators as implemented in ALGOL-derived languages. Arithmetic operators in Lisp are variadic (or n-ary) functions, able to take any number of arguments. A C-style '++' increment operator is sometimes implemented under the name incf, giving the syntax (incf x), which is equivalent to (setq x (+ x 1)) and returns the new value of x.

"Special operators" (sometimes called "special forms") provide Lisp's control structure. For example, the special operator if takes three arguments. If the first argument is non-nil, it evaluates to the second argument; otherwise, it evaluates to the third argument. Thus, the expression (if nil (list 1 2 "foo") (list 3 4 "bar")) evaluates to (3 4 "bar"). Of course, this would be more useful if a non-trivial expression had been substituted in place of nil.

Lisp also provides the logical operators and, or and not. The and and or operators do short-circuit evaluation and return their first nil and first non-nil argument respectively. For example, (or (and "zero" nil "never") "James" 'task 'time) evaluates to "James".
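The operators discussed above can be exercised with a short sketch in Common Lisp. The variable *counter* is a hypothetical name introduced only for this illustration, and the calls to error exist only to show that short-circuited arguments are never evaluated.

```lisp
;; Arithmetic functions are variadic.
(+ 1 2 3 4)                   ; → 10
(+)                           ; → 0, the additive identity
(* 2 3 4)                     ; → 24

;; INCF updates a place and returns the new value.
(defparameter *counter* 5)
(incf *counter*)              ; → 6

;; IF evaluates only the branch it selects.
(if nil (list 1 2 "foo") (list 3 4 "bar"))          ; → (3 4 "bar")

;; AND stops at the first NIL, OR at the first non-NIL value,
;; so the ERROR forms below are never reached.
(and 1 nil (error "never reached"))                 ; → NIL
(or nil "James" (error "never reached"))            ; → "James"
(or (and "zero" nil "never") "James" 'task 'time)   ; → "James"
```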
=== Lambda expressions and function definition ===
Another special operator, lambda, is used to bind variables to values which are then evaluated within an expression. This operator is also used to create functions: the arguments to lambda are a list of arguments, and the expression or expressions to which the function evaluates (the returned value is the value of the last expression that is evaluated). The expression (lambda (arg) (+ arg 1)) evaluates to a function that, when applied, takes one argument, binds it to arg and returns the number one greater than that argument. Lambda expressions are treated no differently from named functions; they are invoked the same way. Therefore, the expression ((lambda (arg) (+ arg 1)) 5) evaluates to 6. This is a function application: the anonymous function is executed by passing it the value 5.

Named functions are created by storing a lambda expression in a symbol using the defun macro. The expression (defun foo (a b c d) (+ a b c d)) defines a new function named foo in the global environment. For a function f with parameter a and body b..., such a definition is conceptually similar to the expression (setf (fdefinition 'f) #'(lambda (a) (block f b...))), where setf is a macro used to set the value of its first argument, here (fdefinition 'f), to a new function object; fdefinition is the global function definition for the function named f; and #' is an abbreviation for the function special operator, which returns a function object.
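The relationship between lambda, defun and fdefinition can be sketched as follows. The function name add-one is a hypothetical example chosen for this illustration; the fdefinition form only approximates what defun does, and real implementations typically record additional information.

```lisp
;; An anonymous function applied directly.
((lambda (arg) (+ arg 1)) 5)              ; → 6

;; The same function object called through FUNCALL.
(funcall (lambda (arg) (+ arg 1)) 5)      ; → 6

;; DEFUN gives a function a global name.
(defun foo (a b c d) (+ a b c d))
(foo 1 2 3 4)                             ; → 10

;; Roughly what DEFUN does under the hood: install a function object
;; as the global definition of a name.
(setf (fdefinition 'add-one)
      #'(lambda (arg) (block add-one (+ arg 1))))
(funcall 'add-one 41)                     ; → 42
```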
=== Atoms ===
In the original LISP there were two fundamental data types: atoms and lists. A list was a finite ordered sequence of elements, where each element was either an atom or a list, and an atom was a number or a symbol. A symbol was essentially a unique named item, written as an alphanumeric string in source code, and used either as a variable name or as a data item in symbolic processing. For example, the list (FOO (BAR 1) 2) contains three elements: the symbol FOO, the list (BAR 1), and the number 2.

The essential difference between atoms and lists was that atoms were immutable and unique. Two atoms that appeared in different places in source code but were written in exactly the same way represented the same object, whereas each list was a separate object that could be altered independently of other lists and could be distinguished from other lists by comparison operators.

As more data types were introduced in later Lisp dialects, and programming styles evolved, the concept of an atom lost importance. Many dialects still retained the predicate atom for legacy compatibility, defining it true for any object which is not a cons.
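The modern meaning of the atom predicate, and the identity distinction between symbols and lists, can be checked with a few calls. This is a sketch in Common Lisp; other dialects may name the comparison predicates differently.

```lisp
;; ATOM is true for anything that is not a cons.
(atom 'foo)                     ; → T
(atom 42)                       ; → T
(atom "a string")               ; → T
(atom nil)                      ; → T   (the empty list is not a cons)
(atom '(1 2))                   ; → NIL

;; Two occurrences of the same symbol name denote the same object...
(eq 'foo 'foo)                  ; → T

;; ...whereas two freshly built lists are distinct objects,
;; even when their contents are structurally equal.
(eq (list 1 2) (list 1 2))      ; → NIL
(equal (list 1 2) (list 1 2))   ; → T
```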
=== Conses and lists ===
A Lisp list is implemented as a singly linked list. Each cell of this list is called a cons (in Scheme, a pair) and is composed of two pointers, called the car and cdr. These are respectively equivalent to the data and next fields discussed in the article linked list.

Of the many data structures that can be built out of cons cells, one of the most basic is called a proper list. A proper list is either the special nil (empty list) symbol, or a cons in which the car points to a datum (which may be another cons structure, such as a list), and the cdr points to another proper list.

If a given cons is taken to be the head of a linked list, then its car points to the first element of the list, and its cdr points to the rest of the list. For this reason, the car and cdr functions are also called first and rest when referring to conses which are part of a linked list (rather than, say, a tree).

Thus, a Lisp list is not an atomic object, as an instance of a container class in C++ or Java would be. A list is nothing more than an aggregate of linked conses. A variable that refers to a given list is simply a pointer to the first cons in the list. Traversal of a list can be done by cdring down the list, that is, taking successive cdrs to visit each cons of the list, or by using any of several higher-order functions to map a function over a list.

Because conses and lists are so universal in Lisp systems, it is a common misconception that they are Lisp's only data structures. In fact, all but the most simplistic Lisps have other data structures, such as vectors (arrays), hash tables, structures, and so forth.
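The cons-cell view of a list, and the two traversal styles mentioned above, can be sketched in Common Lisp. The names *numbers* and sum-by-cdring are hypothetical and introduced only for this example.

```lisp
;; A proper list built cell by cell; equivalent to (list 42 69 613).
(defparameter *numbers* (cons 42 (cons 69 (cons 613 nil))))

(car *numbers*)                 ; → 42
(cdr *numbers*)                 ; → (69 613)
(first *numbers*)               ; → 42, same as CAR
(rest *numbers*)                ; → (69 613), same as CDR

;; "Cdring down" the list: follow successive cdrs until NIL is reached.
(defun sum-by-cdring (cells)
  (let ((total 0))
    (do ((cell cells (cdr cell)))
        ((null cell) total)
      (incf total (car cell)))))

(sum-by-cdring *numbers*)       ; → 724

;; The same traversal expressed with a higher-order function.
(reduce #'+ *numbers*)          ; → 724
```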
=== S-expressions represent lists ===
Figure: diagram for the list (42 69 613).

Parenthesized S-expressions represent linked list structures. There are several ways to represent the same list as an S-expression. A cons can be written in dotted-pair notation as (a . b), where a is the car and b the cdr. A longer proper list might be written (a . (b . (c . (d . nil)))) in dotted-pair notation. This is conventionally abbreviated as (a b c d) in list notation. An improper list may be written in a combination of the two, as (a b c . d) for the list of three conses whose last cdr is d (i.e., the list (a . (b . (c . d))) in fully specified form).
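That these notations denote the same structure can be confirmed with the reader itself. The following is a small sketch in Common Lisp; the symbols a through d are placeholders.

```lisp
;; A single cons prints in dotted-pair notation.
(cons 'a 'b)                                        ; → (A . B)

;; Fully dotted and abbreviated forms read as the same proper list.
(equal '(a . (b . (c . (d . nil)))) '(a b c d))     ; → T

;; An improper list: three conses whose final cdr is D rather than NIL.
(equal '(a b c . d) '(a . (b . (c . d))))           ; → T
(cdddr '(a b c . d))                                ; → D
(cdddr '(a b c))                                    ; → NIL
```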
=== List-processing procedures ===
Lisp provides many built-in procedures for accessing and controlling lists. Lists can be created directly with the list procedure, which takes any number of arguments and returns the list of these arguments.

 (list 1 2 'a 3)
 ;Output: (1 2 a 3)

 (list 1 '(2 3) 4)
 ;Output: (1 (2 3) 4)

Because of the way that lists are constructed from cons pairs, the cons procedure can be used to add an element to the front of a list. The cons procedure is asymmetric in how it handles list arguments, because of how lists are constructed.

 (cons 1 '(2 3))
 ;Output: (1 2 3)

 (cons '(1 2) '(3 4))
 ;Output: ((1 2) 3 4)

The append procedure appends two (or more) lists to one another. Because Lisp lists are linked lists, appending two lists has asymptotic time complexity O(n), where n is the length of the first list.

 (append '(1 2) '(3 4))
 ;Output: (1 2 3 4)

 (append '(1 2 3) '() '(a) '(5 6))
 ;Output: (1 2 3 a 5 6)
=== Shared structure ===
Lisp lists, being simple linked lists, can share structure with one another. That is to say, two lists can have the same tail, or final sequence of conses. For instance, after the execution of the following Common Lisp code:

 (setf foo (list 'a 'b 'c))
 (setf bar (cons 'x (cdr foo)))

the lists foo and bar are (a b c) and (x b c) respectively. However, the tail (b c) is the same structure in both lists. It is not a copy; the cons cells pointing to b and c are in the same memory locations for both lists.

Sharing structure rather than copying can give a dramatic performance improvement. However, this technique can interact in undesired ways with functions that alter lists passed to them as arguments. Altering one list, such as by replacing the c with a goose, will affect the other:

 (setf (third foo) 'goose)

This changes foo to (a b goose), but thereby also changes bar to (x b goose), a possibly unexpected result. This can be a source of bugs, and functions which alter their arguments are documented as destructive for this very reason.

Aficionados of functional programming avoid destructive functions. In the Scheme dialect, which favors the functional style, the names of destructive functions are marked with a cautionary exclamation point, or "bang", such as set-car! (read set car bang), which replaces the car of a cons. In the Common Lisp dialect, destructive functions are commonplace; the equivalent of set-car! is named rplaca for "replace car". This function is rarely seen, however, as Common Lisp includes a special facility, setf, to make it easier to define and use destructive functions. A frequent style in Common Lisp is to write code functionally (without destructive calls) when prototyping, then to add destructive calls as an optimization where it is safe to do so.
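The sharing described above can be observed with eq, and avoided with copy-list. This is a minimal sketch in Common Lisp; it uses defparameter with the conventional *earmuff* names instead of the bare setf in the text, and *baz* is an extra hypothetical variable.

```lisp
(defparameter *foo* (list 'a 'b 'c))
(defparameter *bar* (cons 'x (cdr *foo*)))

;; The tails are the very same cons cells, not copies.
(eq (cdr *foo*) (cdr *bar*))        ; → T

;; A destructive update through one list is visible through the other.
(setf (third *foo*) 'goose)
*foo*                               ; → (A B GOOSE)
*bar*                               ; → (X B GOOSE)

;; RPLACA is the older spelling of the same kind of update.
(rplaca (cdr *foo*) 'duck)
*bar*                               ; → (X DUCK GOOSE)

;; COPY-LIST allocates fresh conses, so later changes do not propagate.
(defparameter *baz* (cons 'x (copy-list (cdr *foo*))))
(setf (third *foo*) 'swan)
*bar*                               ; → (X DUCK SWAN)
*baz*                               ; → (X DUCK GOOSE)
```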
=== Self-evaluating forms and quoting ===
Lisp evaluates expressions which are entered by the user. Symbols and lists evaluate to some other (usually, simpler) expression: for instance, a symbol evaluates to the value of the variable it names, and a list such as (+ 2 3) evaluates to 5. However, most other forms evaluate to themselves: entering a number such as 5 into Lisp returns 5.

Any expression can also be marked to prevent it from being evaluated (as is necessary for symbols and lists). This is the role of the quote special operator, or its abbreviation ' (one quotation mark). For instance, entering a symbol such as foo usually returns the value of the corresponding variable (or an error, if there is no such variable). To refer to the literal symbol, enter (quote foo) or, usually, 'foo.

Both Common Lisp and Scheme also support the backquote operator (termed quasiquote in Scheme), entered with the ` character (backtick). This is almost the same as the plain quote, except it allows expressions to be evaluated and their values interpolated into a quoted list with the comma (,) unquote and comma-at (,@) splice operators. If a variable holds a list, unquoting it inside a backquoted list inserts that list as a single element, while splicing it inserts its elements individually. The backquote is most often used in defining macro expansions.

Self-evaluating forms and quoted forms are Lisp's equivalent of literals. It may be possible to modify the values of (mutable) literals in program code. For instance, if a function returns a quoted form, and the code that calls the function modifies the form, this may alter the behavior of the function on subsequent invocations.

 (defun should-be-constant ()
   '(one two three))

 (let ((stuff (should-be-constant)))
   (setf (third stuff) 'bizarre))   ; bad!

 (should-be-constant)   ; may return (one two bizarre)

Modifying a quoted form like this is generally considered bad style, and is defined by ANSI Common Lisp as erroneous (resulting in "undefined" behavior in compiled files, because the file-compiler can coalesce similar constants, put them in write-protected memory, etc.).

Lisp's formalization of quotation has been noted by Douglas Hofstadter (in Gödel, Escher, Bach) and others as an example of the philosophical idea of self-reference.
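The quoting rules in this section can be illustrated with a short sketch in Common Lisp. The names *items* and fresh-list are hypothetical; fresh-list shows one common way to sidestep the literal-modification problem, by building a new list on every call.

```lisp
;; Lists are evaluated as calls unless quoted.
(+ 2 3)                         ; → 5
'(+ 2 3)                        ; → (+ 2 3), the list itself
(quote foo)                     ; → FOO
'foo                            ; → FOO

;; Backquote with unquote (,) and splice (,@).
(defparameter *items* '(bar baz))
`(foo ,*items*)                 ; → (FOO (BAR BAZ))
`(foo ,@*items*)                ; → (FOO BAR BAZ)
`(sum ,(+ 1 2))                 ; → (SUM 3)

;; Returning a freshly consed list instead of a quoted literal:
;; callers may modify their copy without affecting later calls.
(defun fresh-list ()
  (list 'one 'two 'three))

(let ((stuff (fresh-list)))
  (setf (third stuff) 'bizarre))
(fresh-list)                    ; → (ONE TWO THREE)
```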
=== Scope and closure ===
The Lisp family splits over the use of dynamic or static (a.k.a. lexical) scope. Clojure, Common Lisp and Scheme make use of static scoping by default, while newLISP, Picolisp and the embedded languages in Emacs and AutoCAD use dynamic scoping. Since version 24.1, Emacs uses both dynamic and lexical scoping.
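Common Lisp offers both kinds of scope, which makes it convenient for contrasting them. In the sketch below, make-counter, *depth* and current-depth are hypothetical names; variables introduced by let are lexical by default, while those declared with defvar are special, that is, dynamically scoped.

```lisp
;; Lexical scope: the returned function closes over its own COUNT binding.
(defun make-counter ()
  (let ((count 0))
    (lambda () (incf count))))

(defparameter *counter-a* (make-counter))
(funcall *counter-a*)                   ; → 1
(funcall *counter-a*)                   ; → 2
(funcall (make-counter))                ; → 1, a separate binding

;; Dynamic scope: DEFVAR declares *DEPTH* special, so a LET rebinding
;; is visible to functions called while the LET is active.
(defvar *depth* 0)

(defun current-depth ()
  *depth*)

(let ((*depth* 10))
  (current-depth))                      ; → 10
(current-depth)                         ; → 0 once the LET has exited
```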
=== List structure of program code; exploitation by macros and compilers ===
A fundamental distinction between Lisp and other languages is that in Lisp, the textual representation of a program is simply a human-readable description of the same internal data structures (linked lists, symbols, numbers, characters, etc.) as would be used by the underlying Lisp system.

Lisp uses this to implement a very powerful macro system. Like other macro languages such as the one defined by the C preprocessor (the macro preprocessor for the C, Objective-C and C++ programming languages), a macro returns code that can then be compiled. However, unlike C preprocessor macros, the macros are Lisp functions and so can exploit the full power of Lisp.

Further, because Lisp code has the same structure as lists, macros can be built with any of the list-processing functions in the language. In short, anything that Lisp can do to a data structure, Lisp macros can do to code. In contrast, in most other languages, the parser's output is purely internal to the language implementation and cannot be manipulated by the programmer.

This feature makes it easy to develop efficient languages within languages. For example, the Common Lisp Object System can be implemented cleanly as a language extension using macros. This means that if an application needs a different inheritance mechanism, it can use a different object system. This is in stark contrast to most other languages; for example, Java does not support multiple inheritance of classes and there is no reasonable way to add it.

In simplistic Lisp implementations, this list structure is directly interpreted to run the program; a function is literally a piece of list structure which is traversed by the interpreter in executing it. However, most substantial Lisp systems also include a compiler. The compiler translates list structure into machine code or bytecode for execution. This code can run as fast as code compiled in conventional languages such as C.

Macros expand before the compilation step, and thus offer some interesting options. If a program needs a precomputed table, then a macro might create the table at compile time, so the compiler need only output the table and need not call code to create the table at run time. Some Lisp implementations even have a mechanism, eval-when, that allows code to be present during compile time (when a macro would need it), but not present in the emitted module.
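Both points above, macros as list-building functions and tables computed at expansion time, can be sketched in Common Lisp. The macros my-unless and squares-table are hypothetical examples, not standard operators, and squares-table assumes its argument is a literal integer.

```lisp
;; A macro is a function from code (lists) to code (lists).
;; MY-UNLESS rewrites itself into an IF form using backquote.
(defmacro my-unless (condition &body body)
  `(if ,condition nil (progn ,@body)))

;; The expansion can be inspected as ordinary list data.
(macroexpand-1 '(my-unless (> 1 2) 42))
;; → (IF (> 1 2) NIL (PROGN 42))

(my-unless (> 1 2) 42)          ; → 42
(my-unless (< 1 2) 42)          ; → NIL

;; A table computed while the macro expands: the LOOP runs at expansion
;; time, and only the finished list appears in the resulting code.
(defmacro squares-table (n)
  `(quote ,(loop for i below n collect (* i i))))

(macroexpand-1 '(squares-table 5))   ; → (QUOTE (0 1 4 9 16))
(squares-table 5)                    ; → (0 1 4 9 16)
```

Because the expansion of squares-table is a quoted constant, the resulting list should be treated as read-only, for the reasons given in the discussion of quoted forms.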
=== Evaluation and the read–eval–print loop ===
Lisp languages are often used with an interactive command line, which may be combined with an integrated development environment (IDE). The user types in expressions at the command line, or directs the IDE to transmit them to the Lisp system. Lisp reads the entered expressions, evaluates them, and prints the result. For this reason, the Lisp command line is called a read–eval–print loop (REPL).

The basic operation of the REPL is as follows. This is a simplistic description which omits many elements of a real Lisp, such as quoting and macros.

The read function accepts textual S-expressions as input, and parses them into an internal data structure. For instance, if you type the text (+ 1 2) at the prompt, read translates this into a linked list with three elements: the symbol +, the number 1, and the number 2. It so happens that this list is also a valid piece of Lisp code; that is, it can be evaluated. This is because the car of the list names a function, the addition operation.

A bare word such as foo will be read as a single symbol; 123 will be read as the number one hundred and twenty-three; "123" will be read as the string "123".

The eval function evaluates the data, returning zero or more other Lisp data as a result. Evaluation does not have to mean interpretation; some Lisp systems compile every expression to native machine code. It is simple, however, to describe evaluation as interpretation: to evaluate a list whose car names a function, eval first evaluates each of the arguments given in its cdr, then applies the function to the arguments. In this case, the function is addition, and applying it to the argument list (1 2) yields the answer 3. This is the result of the evaluation.

The symbol foo evaluates to the value of the symbol foo. Data like the string "123" evaluates to the same string. The list (quote (1 2 3)) evaluates to the list (1 2 3).

It is the job of the print function to represent output to the user. For a simple result such as 3 this is trivial. An expression which evaluated to a piece of list structure would require that print traverse the list and print it out as an S-expression.

To implement a Lisp REPL, it is necessary only to implement these three functions and an infinite-loop function. (Naturally, the implementation of eval will be complex, since it must also implement all special operators like if or lambda.) This done, a basic REPL is one line of code: (loop (print (eval (read)))).
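The read and eval steps can be observed separately by feeding text to the reader from a string. The sketch below assumes Common Lisp; eval-all-from-string and toy-repl are hypothetical helper names, and the first is a bounded stand-in for the loop that collects results instead of printing them.

```lisp
;; READ turns text into data without evaluating it.
(read-from-string "(+ 1 2)")                  ; → (+ 1 2), a three-element list
(read-from-string "foo")                      ; → FOO, a symbol
(read-from-string "123")                      ; → 123, an integer
(read-from-string "\"123\"")                  ; → "123", a string

;; EVAL takes the data structure and computes its value.
(eval (read-from-string "(+ 1 2)"))           ; → 3
(eval (read-from-string "(quote (1 2 3))"))   ; → (1 2 3)

;; Read every form in TEXT, evaluate it, and collect the results.
;; The keyword :EOF serves as the end-of-input marker, so TEXT must not
;; contain that keyword as a top-level form.
(defun eval-all-from-string (text)
  (with-input-from-string (in text)
    (loop for form = (read in nil :eof)
          until (eq form :eof)
          collect (eval form))))

(eval-all-from-string "(+ 1 2) \"123\" (quote (1 2 3))")
;; → (3 "123" (1 2 3))

;; The unbounded one-liner, wrapped in a function so that loading this
;; code does not start an endless loop.
(defun toy-repl ()
  (loop (print (eval (read)))))
```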
The Lisp REPL typically also provides input editing, an input history, error handling and an interface to the debugger.

Lisp is usually evaluated eagerly. In Common Lisp, arguments are evaluated in applicative order ('leftmost innermost'), while in Scheme the order in which arguments are evaluated is undefined, leaving room for optimization by a compiler.
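Common Lisp's left-to-right, innermost-first argument evaluation can be made visible by logging each argument as it is evaluated. This is a small sketch; *evaluation-log* and logged are hypothetical names, and the equivalent Scheme program would not be guaranteed to produce the same order.

```lisp
(defparameter *evaluation-log* '())

;; Record VALUE in the log, then return it unchanged.
(defun logged (value)
  (push value *evaluation-log*)
  value)

;; Arguments are evaluated left to right; nested calls finish
;; evaluating their own arguments before the outer call proceeds.
(list (logged :a) (list (logged :b) (logged :c)) (logged :d))
;; → (:A (:B :C) :D)

;; PUSH adds to the front, so reverse the log to see evaluation order.
(reverse *evaluation-log*)      ; → (:A :B :C :D)
```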
=== Control structures ===
Lisp originally had very few control structures, but many more were added during the language's evolution. (Lisp's original conditional operator, cond, is the precursor to later if-then-else structures.)

Programmers in the Scheme dialect often express loops using tail recursion. Scheme's prevalence in academic computer science has led some students to believe that tail recursion is the only, or the most common, way to write iterations in Lisp, but this is incorrect. All oft-seen Lisp dialects have imperative-style iteration constructs, from Scheme's do loop to Common Lisp's complex loop expressions. Moreover, the key issue that makes this an objective rather than subjective matter is that Scheme makes specific requirements for the handling of tail calls, and thus the reason that the use of tail recursion is generally encouraged for Scheme is that the practice is expressly supported by the language definition. By contrast, ANSI Common Lisp does not require the optimization commonly termed tail call elimination. Thus, discouraging tail-recursive style as a casual replacement for more traditional iteration constructs (such as do, dolist or loop) in Common Lisp is not just a matter of stylistic preference, but potentially one of efficiency (since an apparent tail call in Common Lisp may not compile as a simple jump) and program correctness (since tail recursion may increase stack use in Common Lisp, risking stack overflow).

Some Lisp control structures are special operators, equivalent to other languages' syntactic keywords. Expressions using these operators have the same surface appearance as function calls, but differ in that the arguments are not necessarily evaluated or, in the case of an iteration expression, may be evaluated more than once.

In contrast to most other major programming languages, Lisp allows implementing control structures using the language. Several control structures are implemented as Lisp macros, and can even be macro-expanded by the programmer who wants to know how they work.

Both Common Lisp and Scheme have operators for non-local control flow. The differences in these operators are some of the deepest differences between the two dialects. Scheme supports re-entrant continuations using the call/cc procedure, which allows a program to save (and later restore) a particular place in execution. Common Lisp does not support re-entrant continuations, but does support several ways of handling escape continuations.

Often, the same algorithm can be expressed in Lisp in either an imperative or a functional style. As noted above, Scheme tends to favor the functional style, using tail recursion and continuations to express control flow. However, imperative style is still quite possible. The style preferred by many Common Lisp programmers may seem more familiar to programmers used to structured languages such as C, while that preferred by Schemers more closely resembles pure-functional languages such as Haskell.

Because of Lisp's early heritage in list processing, it has a wide array of higher-order functions relating to iteration over sequences. In many cases where an explicit loop would be needed in other languages (like a for loop in C), in Lisp the same task can be accomplished with a higher-order function. (The same is true of many functional programming languages.)

A good example is a function which in Scheme is called map and in Common Lisp is called mapcar. Given a function and one or more lists, mapcar applies the function successively to the lists' elements in order, collecting the results in a new list:

 (mapcar #'+ '(1 2 3 4 5) '(10 20 30 40 50))

This applies the + function to each corresponding pair of list elements, yielding the result (11 22 33 44 55).

== Examples ==
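As a first example, the iteration styles discussed under control structures can be placed side by side. This is a sketch in Common Lisp; sum-to is a hypothetical function, and whether its tail call is compiled as a jump depends on the implementation and its optimization settings.

```lisp
;; The MAPCAR call from the text.
(mapcar #'+ '(1 2 3 4 5) '(10 20 30 40 50))     ; → (11 22 33 44 55)

;; Squaring a list three ways: higher-order function, LOOP, and DOLIST.
(mapcar (lambda (x) (* x x)) '(1 2 3 4))        ; → (1 4 9 16)

(loop for x in '(1 2 3 4) collect (* x x))      ; → (1 4 9 16)

(let ((result '()))
  (dolist (x '(1 2 3 4) (nreverse result))
    (push (* x x) result)))                     ; → (1 4 9 16)

;; A tail-recursive sum in the Scheme style. ANSI Common Lisp does not
;; require tail call elimination, so a very large N may exhaust the stack.
(defun sum-to (n &optional (accumulator 0))
  (if (zerop n)
      accumulator
      (sum-to (- n 1) (+ accumulator n))))

(sum-to 100)                                    ; → 5050
```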