Collocations are partly or fully fixed expressions that become established through repeated context-dependent use. Such terms as crystal clear, middle management, nuclear family, and cosmetic surgery are examples of collocated pairs of words. Collocations can be in a
syntactic relation (such as verb–object: make and decision), a lexical relation (such as antonymy), or no linguistically defined relation at all. Knowledge of collocations is vital for the competent use of a language: a
grammatically correct sentence will stand out as awkward if collocational preferences are violated. This makes collocation a common focus for language teaching.

Corpus linguists specify a key word in context (KWIC) and identify the words immediately surrounding it to illustrate the way words are used in practice. The processing of collocations involves a number of parameters, the most important of which is the
measure of association, which evaluates whether the co-occurrence is purely by chance or statistically significant. Because language is non-random, most collocations are classed as significant, and the association scores are simply used to rank the results. Commonly used measures of association include mutual information, t scores, and
log-likelihood.

Rather than select a single definition, Gledhill proposes that collocation involves at least three different perspectives: co-occurrence, a statistical view, which sees collocation as the recurrent appearance in a text of a node and its collocates; construction, which sees collocation either as a correlation between a lexeme and a lexical-grammatical pattern, or as a relation between a base and its collocative partners; and expression, a pragmatic view of collocation as a conventional unit of expression, regardless of form. These different perspectives contrast with the usual way of presenting collocation in phraseological studies. Traditionally speaking, collocation is explained in terms of all three perspectives at once, in a continuum:

==In dictionaries==