The form of a word that is chosen to serve as the lemma is usually the least
marked form, but there are exceptions, such as the use of the infinitive for verbs in some languages. For English, the citation form of a
noun is the
singular (and non-possessive) form:
mouse rather than
mice. For multiword lexemes that contain
possessive adjectives or
reflexive pronouns, the citation form uses a form of the
indefinite pronoun one: ''do one's best, perjure oneself''. In European languages with
grammatical gender, the citation form of regular adjectives and nouns is usually the masculine singular. If the language also has
cases, the citation form is often the masculine singular nominative. For many languages, the citation form of a
verb is the infinitive, as in French, German, Hindustani, and Spanish
. English verbs usually have an infinitive, whose bare form (without the particle to) is the least marked: for example, break is chosen over to break, breaks, broke, breaking, and broken. For defective verbs with no infinitive, the present tense is used: for example, must has only one form while shall has no infinitive, and both lemmas are their lexemes' present-tense forms. For
Latin, Ancient Greek, Modern Greek, and Bulgarian, the first-person singular present tense is traditionally used, but some modern dictionaries use the infinitive instead (except for Bulgarian, which lacks infinitives). For contracted verbs in Ancient Greek, an uncontracted first-person singular present tense is used to reveal the contract vowel: philéō for philō "I love" [implying affection], agapáō for agapō "I love" [implying regard].
Finnish dictionaries list verbs not under their root, but under the first infinitive, marked with
-(t)a,
-(t)ä. For
Japanese, the non-past (present and future) tense is used. For
Arabic, the third-person singular masculine of the past/perfect tense is the least-marked form and is used for entries in modern dictionaries. In older dictionaries, which are still commonly used, the triliteral root of the word, whether a verb or a noun, is used. This is similar to Hebrew, which also uses the third-person singular masculine perfect form, e.g., ברא bara "create", כפר kaphar "deny". Georgian uses the verbal noun, as does Bangla. For Korean, -da is attached to the stem. In
Tamil, an
agglutinative language, the verb stem (which is also the imperative form, the least marked one) is often cited, e.g., இரு. In Irish, words are highly inflected by case (genitive, nominative, dative and vocative) and also change form according to their place within a sentence because of initial mutations. The noun
cainteoir, the lemma for the noun meaning "speaker", has a variety of forms:
chainteoir,
gcainteoir,
cainteora,
chainteora,
cainteoirí,
chainteoirí and
gcainteoirí. Some phrases are cited in a sort of lemma:
Carthago delenda est (literally, "Carthage must be destroyed") is a common way of citing
Cato, but what he said was nearer to
censeo Carthaginem esse delendam ("I hold Carthage to be in need of destruction").

== Lexicography ==