An
intein is a segment of a
protein that is able to excise itself and join the remaining portions (the
exteins) with a
peptide bond during protein splicing. Inteins have also been called
protein introns, by analogy with (RNA)
introns.
Naming conventions The first part of an intein name is based on the
scientific name of the
organism in which it is found, and the second part is based on the name of the corresponding gene or extein. For example, the intein found in
Thermoplasma acidophilum and associated with Vacuolar ATPase subunit A (VMA) is called "Tac VMA". Normally, as in this example, just three letters suffice to specify the organism, but there are variations. For example, additional letters may be added to indicate a strain. If more than one intein is encoded in the corresponding gene, the inteins are given a numerical suffix starting from
5 to 3 or in order of their identification (for example, "Msm dnaB-1"). The segment of the gene that encodes the intein is usually given the same name as the intein, but to avoid confusion the name of the intein proper is usually capitalized (
e.g., Pfu RIR1-1), whereas the name of the corresponding gene segment is italicized (
e.g., Pfu
rir1-1). A different disambiguating convention is to place a lowercase "i" after the source protein name,
e.g. "Msm DnaBi1".
Types of inteins Inteins can be classified on many criteria. • Based on how they splice themselves out, they can be classified into
cis-splicing (which means that they splice themselves out) or
trans-splicing (which means they need outside help). Most studied inteins are cis-splicing. Split inteins (see below) usually involves two halves helping each other out, so they are
trans-splicing. • Based on whether they contain the endonuclease domain. Ones that have an endonuclease domain is called a "maxi-intein", otherwise a "mini-intein".
Full and mini inteins Inteins can contain a
homing endonuclease gene (HEG) domain in addition to the splicing domains. This domain is responsible for the spread of the intein by cleaving DNA at an intein-free
allele on the
homologous chromosome, triggering the
DNA double-stranded break repair (DSBR) system, which then repairs the break, thus copying the intein-coding DNA into a previously intein-free site. The HEG domain is not necessary for intein splicing, and so it can be lost, forming a
minimal, or
mini,
intein. Several studies have demonstrated the modular nature of inteins by adding or removing HEG domains and determining the activity of the new construct.
Split inteins Sometimes, the intein of the precursor protein comes from two genes. In this case, the intein is said to be a
split intein. For example, in
cyanobacteria,
DnaE, the catalytic subunit α of
DNA polymerase III, is encoded by two separate genes,
dnaE-n and
dnaE-c. The
dnaE-n product consists of an N-extein sequence followed by a 123-AA intein sequence, whereas the
dnaE-c product consists of a 36-AA intein sequence followed by a C-extein sequence.{{Cite journal | doi=10.1073/pnas.95.16.9226 | last1=Wu | first1=H. | last2=Hu | first2=Z. | last3=Liu | first3=X. Q. ==Applications in biotechnology==