By effect on structure The sequence of a gene can be altered in a number of ways. Gene mutations have varying effects on health depending on where they occur and whether they alter the function of essential proteins. Mutations in the structure of genes can be classified into several types.
Large-scale mutations Large-scale mutations in
chromosomal structure include: • Amplifications (or
gene duplications): repetition of a chromosomal segment, or the presence of an extra piece of a chromosome. A broken piece of a chromosome may become attached to a homologous or non-homologous chromosome, so that some of the genes are present in more than two doses. The result is multiple copies of the affected chromosomal regions, increasing the dosage of the genes located within them. •
Polyploidy, duplication of entire sets of chromosomes, potentially resulting in a separate breeding population and
speciation. • Deletions of large chromosomal regions, leading to loss of the genes within those regions. • Mutations whose effect is to juxtapose previously separate pieces of DNA, potentially bringing together separate genes to form functionally distinct
fusion genes (e.g.,
bcr-abl). • Large-scale changes to the structure of
chromosomes, called
chromosomal rearrangements, which can lead to a decrease of fitness but also to speciation in isolated, inbred populations. These include: •
Chromosomal translocations: interchange of genetic parts from nonhomologous chromosomes. •
Chromosomal inversions: reversing the orientation of a chromosomal segment. • Non-homologous
chromosomal crossover. • Interstitial deletions: an intra-chromosomal deletion that removes a segment of DNA from a single chromosome, thereby apposing previously distant genes. For example, cells isolated from a human
astrocytoma, a type of brain tumour, were found to have a chromosomal deletion removing sequences between the Fused in
Glioblastoma (FIG) gene and the receptor tyrosine kinase (ROS), producing a fusion protein (FIG-ROS). The abnormal FIG-ROS fusion protein has constitutively active kinase activity that causes
oncogenic transformation (a transformation from normal cells to cancer cells). •
Loss of heterozygosity: loss of one
allele, either by a deletion or a genetic recombination event, in an organism that previously had two different alleles.
Small-scale mutations Small-scale mutations affect a gene in one or a few nucleotides. (If only a single nucleotide is affected, they are called
point mutations.) Small-scale mutations include: •
Insertions add one or more extra nucleotides into the DNA. They are usually caused by
transposable elements, or errors during replication of repeating elements. Insertions in the coding region of a gene may alter
splicing of the
mRNA (
splice site mutation), or cause a shift in the
reading frame (
frameshift), both of which can significantly alter the
gene product. Insertions can be reversed by excision of the transposable element. •
Deletions remove one or more nucleotides from the DNA. Like insertions, these mutations can alter the reading frame of the gene. In general, they are irreversible: though exactly the same sequence might, in theory, be restored by an insertion, transposable elements able to revert a very short deletion (say 1–2 bases) in
any location either are highly unlikely to exist or do not exist at all. •
Substitution mutations, often caused by chemicals or malfunction of DNA replication, exchange a single nucleotide for another. These changes are classified as transitions or transversions. Most common is the transition that exchanges a purine for a purine (A ↔ G) or a
pyrimidine for a pyrimidine (C ↔ T). A transition can be caused by nitrous acid, base mispairing, or mutagenic base analogues such as BrdU. Less common is a transversion, which exchanges a purine for a pyrimidine or a pyrimidine for a purine (C/T ↔ A/G). An example of a transversion is the conversion of
adenine (A) into cytosine (C). Point mutations modify a single base pair of DNA, or a very small number of base pairs, within a gene. A point mutation can be reversed by another point mutation, in which the nucleotide is changed back to its original state (true reversion), or by second-site reversion (a complementary mutation elsewhere that results in regained gene functionality). As discussed
below, point mutations that occur within the protein
coding region of a gene may be classified as
synonymous or
nonsynonymous substitutions, the latter of which in turn can be divided into
missense or
nonsense mutations.
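The transition/transversion rule for substitutions can be sketched in a few lines of Python. This is only an illustration of the purine/pyrimidine rule described above; the function name `classify_substitution` is a hypothetical helper, not part of any library.

```python
# Illustrative sketch: classify a single-base DNA substitution.
# Purines are A and G; pyrimidines are C and T.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a ref -> alt base change."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate base are identical")
    # Same chemical class (purine<->purine or pyrimidine<->pyrimidine)
    # is a transition; a change of class is a transversion.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # transition
print(classify_substitution("A", "C"))  # transversion
```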
By impact on protein sequence [Figure: protein-coding gene. A mutation in the protein coding region (red) can result in a change in the amino acid sequence. Mutations in other areas of the gene can have diverse effects. Changes within regulatory sequences (yellow and blue) can affect transcriptional and translational regulation of gene expression.] The effect of a mutation on protein sequence depends in part on where in the genome it occurs, especially whether it is in a
coding or
non-coding region. Mutations in the non-coding
regulatory sequences of a gene, such as promoters, enhancers, and silencers, can alter levels of gene expression, but are less likely to alter the protein sequence. Mutations within
introns and in regions with no known biological function (e.g.
pseudogenes,
retrotransposons) are generally
neutral, having no effect on phenotype – though intron mutations could alter the protein product if they affect mRNA splicing. Mutations that occur in coding regions of the genome are more likely to alter the protein product, and can be categorized by their effect on amino acid sequence: • A
frameshift mutation is caused by insertion or deletion of a number of nucleotides that is not evenly divisible by three from a DNA sequence. Due to the triplet nature of gene expression by codons, the insertion or deletion can disrupt the reading frame, or the grouping of the codons, resulting in a completely different
translation from the original. The earlier in the sequence the deletion or insertion occurs, the more altered the protein produced is. (For example, the code CCU GAC UAC CUA codes for the amino acids proline, aspartic acid, tyrosine, and leucine. If the U in CCU was deleted, the resulting sequence would be CCG ACU ACC UAx, which would instead code for proline, threonine, threonine, and part of another amino acid or perhaps a
stop codon (where the x stands for the following nucleotide).) By contrast, any insertion or deletion that is evenly divisible by three is termed an
in-frame mutation. • A point substitution mutation results in a change in a single nucleotide and can be either synonymous or nonsynonymous. • A
synonymous substitution replaces a codon with another codon that codes for the same amino acid, so that the produced amino acid sequence is not modified. Synonymous mutations occur due to the
degenerate nature of the
genetic code. If this mutation does not result in any phenotypic effects, then it is called
silent, but not all synonymous substitutions are silent. (There can also be silent mutations in nucleotides outside of the coding regions, such as the introns, because the exact nucleotide sequence is not as crucial as it is in the coding regions, but these are not considered synonymous substitutions.) • A
nonsynonymous substitution replaces a codon with another codon that codes for a different amino acid, so that the produced amino acid sequence is modified. Nonsynonymous substitutions can be classified as nonsense or missense mutations: • A
missense mutation changes a nucleotide to cause substitution of a different amino acid. This in turn can render the resulting protein nonfunctional. Such mutations are responsible for diseases such as
Epidermolysis bullosa,
sickle-cell disease, and
SOD1-mediated
ALS. On the other hand, if a missense mutation occurs in an amino acid codon that results in the use of a different, but chemically similar, amino acid, then sometimes little or no change is rendered in the protein. For example, a change from AAA to AGA will encode
arginine, a chemically similar molecule to the intended
lysine. In this latter case the mutation will have little or no effect on phenotype and therefore be
neutral. • A
nonsense mutation is a point mutation in a sequence of DNA that results in a premature stop codon, or a
nonsense codon in the transcribed mRNA, and possibly a truncated, and often nonfunctional protein product. This sort of mutation has been linked to different diseases, such as
congenital adrenal hyperplasia. (See
Stop codon.)
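The categories above can be made concrete with a short Python sketch that translates mRNA with the standard genetic code and labels a change as frameshift, in-frame, synonymous, missense, or nonsense. It reuses the two examples from the text (the deletion in CCU GAC UAC CUA and the AAA to AGA change). The function names are hypothetical helpers chosen for illustration, and the sketch assumes the reference codon of a point substitution is not itself a stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order; '*' marks stop.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Translate complete codons from position 0, stopping at a stop codon."""
    protein = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def classify_indel(length: int) -> str:
    """An insertion/deletion not divisible by three shifts the reading frame."""
    return "in-frame" if length % 3 == 0 else "frameshift"


def classify_point_substitution(ref_codon: str, alt_codon: str) -> str:
    """Label a single-nucleotide change within one (non-stop) codon."""
    if sum(a != b for a, b in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("codons must differ at exactly one position")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == "*":
        raise ValueError("reference codon is a stop codon (not handled here)")
    if ref_aa == alt_aa:
        return "synonymous"
    return "nonsense" if alt_aa == "*" else "missense"


original = "CCUGACUACCUA"                # CCU GAC UAC CUA
shifted = original[:2] + original[3:]    # delete the U of CCU
print(translate(original))               # PDYL (Pro, Asp, Tyr, Leu)
print(translate(shifted))                # PTT  (Pro, Thr, Thr)
print(classify_indel(1))                 # frameshift
print(classify_point_substitution("AAA", "AGA"))  # missense (Lys -> Arg)
print(classify_point_substitution("UAC", "UAA"))  # nonsense (Tyr -> stop)
```

Whether a missense change such as lysine to arginine is actually neutral depends on the protein; the sketch only classifies the change at the sequence level.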
By effect on function A mutation is classified by its effect on function when it changes the way a mutated protein functions with its direct interactors. The interactors can be other proteins, molecules, nucleic acids, etc. Many mutations fall under this category; the specific nature of the change determines which of the types listed below applies. • Loss-of-function mutations, also called inactivating mutations, result in the gene product having less or no function (being partially or wholly inactivated). When the allele has a complete loss of function (
null allele), it is often called an
amorph or amorphic mutation in
Muller's morphs schema. Phenotypes associated with such mutations are most often
recessive. Exceptions are when the organism is
haploid, or when the reduced dosage of a normal gene product is not enough for a normal phenotype (this is called
haploinsufficiency). Examples of diseases caused by a loss-of-function mutation include
Gitelman syndrome and
cystic fibrosis. • Gain-of-function mutations, also called activating mutations, change the gene product such that its effect gets stronger (enhanced activation) or is even superseded by a different and abnormal function. When the new allele is created, a
heterozygote containing the newly created allele as well as the original will express the new allele; genetically, this means the mutations produce
dominant phenotypes. Several of Muller's morphs correspond to the gain of function, including hypermorph (increased gene expression) and neomorph (novel function). • Dominant negative mutations (also called antimorphic mutations) have an altered gene product that acts antagonistically to the wild-type allele. These mutations usually result in an altered molecular function (often inactive) and are characterized by a dominant or
semi-dominant phenotype. In humans, dominant negative mutations have been implicated in cancer (e.g., mutations in genes
p53,
ATM,
CEBPA, and
PPARgamma).
Marfan syndrome is caused by mutations in the
FBN1 gene, located on
chromosome 15, which encodes fibrillin-1, a
glycoprotein component of the
extracellular matrix. Marfan syndrome is also an example of dominant negative mutation and haploinsufficiency. • Lethal mutations result in rapid organismal death when occurring during development and cause significant reductions of life expectancy for developed organisms. An example of a disease that is caused by a dominant lethal mutation is
Huntington's disease. • Null mutations, also known as amorphic mutations, are a form of loss-of-function mutation that completely abolishes the gene's function. The mutation leads to a complete loss of function at the phenotypic level, and no functional gene product is formed.
Atopic eczema and dermatitis syndrome are common diseases caused by a null mutation of the gene that activates filaggrin. • Suppressor mutations are mutations that cause a double mutant to appear normal: the phenotypic effect of a different mutation is completely suppressed. There are two types of suppressor mutations:
intragenic and extragenic. Intragenic suppressor mutations occur in the gene where the first mutation occurs, while extragenic suppressor mutations occur in a gene that interacts with the product of the first mutation. A common disease that results from this type of mutation is
Alzheimer's disease. • Neomorphic mutations are a class of gain-of-function mutations, characterized by the synthesis of a new protein product. The altered gene product normally has a novel pattern of gene expression or a novel molecular function, so that the gene in which the mutation occurs undergoes a complete change in function. • A back mutation or reversion is a point mutation that restores the original sequence and hence the original phenotype.
By effect on fitness (harmful, beneficial, neutral mutations) In
genetics, it is sometimes useful to classify mutations as either harmful
or beneficial (or
neutral): • A harmful, or deleterious, mutation decreases the fitness of the organism. Many, but not all, mutations in
essential genes are harmful (if a mutation does not change the amino acid sequence in an essential protein, it is harmless in most cases). • A beneficial, or advantageous, mutation increases the fitness of the organism. Examples are mutations that lead to
antibiotic resistance in bacteria (which are beneficial for bacteria but usually not for humans). • A neutral mutation has no harmful or beneficial effect on the organism. Such mutations occur at a steady rate, forming the basis for the
molecular clock. In the
neutral theory of molecular evolution, neutral mutations provide genetic drift as the basis for most variation at the molecular level. In animals or plants, most mutations are neutral, given that the vast majority of their genomes is either non-coding or consists of repetitive sequences that have no obvious function ("
junk DNA").
Large-scale quantitative mutagenesis screens, in which thousands to millions of mutations are tested, invariably find that a larger fraction of mutations have harmful effects, but they always return a number of beneficial mutations as well. For instance, in a screen of all gene deletions in
E. coli, 80% of mutations were negative, but 20% were positive, even though many had a very small effect on growth (depending on condition). Gene
deletions involve removal of whole genes, so that point mutations almost always have a much smaller effect. In a similar screen in
Streptococcus pneumoniae, but this time with
transposon insertions, 76% of insertion mutants were classified as neutral, 16% had a significantly reduced fitness, but 6% were advantageous. This classification is obviously relative and somewhat artificial: a harmful mutation can quickly turn into a beneficial mutation when conditions change. Also, there is a gradient from harmful/beneficial to neutral, as many mutations may have small and mostly negligible effects but become relevant under certain conditions. In addition, many traits are determined by hundreds of genes (or loci), so that each locus has only a minor effect. For instance, human height is determined by hundreds of genetic variants ("mutations"), but each of them has a very minor effect on height, apart from the impact of
nutrition. Height (or size) itself may be more or less beneficial as the huge range of sizes in animal or plant groups shows.
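One way to make the three categories concrete is to compare a mutant's fitness with that of its unmutated ancestor, with a relative fitness of 1.0 meaning no change. The Python sketch below does this under stated assumptions: the function name, the neutrality tolerance of 0.01, and the fitness values are all hypothetical and chosen only to illustrate how the same mutation can change category when conditions change.

```python
def classify_fitness_effect(relative_fitness: float, tolerance: float = 0.01) -> str:
    """Classify a mutation by fitness relative to the ancestor (1.0 = unchanged).

    `tolerance` is an assumed cut-off for "no measurable effect"; real screens
    choose it from their own measurement error.
    """
    if relative_fitness < 0:
        raise ValueError("relative fitness cannot be negative")
    if abs(relative_fitness - 1.0) <= tolerance:
        return "neutral"
    return "beneficial" if relative_fitness > 1.0 else "harmful"


# Hypothetical measurements for one resistance mutation in two environments.
measurements = {"without antibiotic": 0.95, "with antibiotic": 1.30}
for condition, fitness in measurements.items():
    print(condition, "->", classify_fitness_effect(fitness))
```

The gradient between categories mentioned above shows up here as the choice of tolerance: widening it moves small-effect mutations into the neutral class.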
Distribution of fitness effects (DFE) Attempts have been made to infer the distribution of fitness effects (DFE) using
mutagenesis experiments and theoretical models applied to molecular sequence data. DFE, as used to determine the relative abundance of different types of mutations (i.e., strongly deleterious, nearly neutral or advantageous), is relevant to many evolutionary questions, such as the maintenance of
genetic variation, the rate of
genomic decay, the maintenance of
outcrossing sexual reproduction as opposed to
inbreeding and the evolution of
sex and
genetic recombination. The DFE can also be studied by tracking the skewness of the distribution of mutations with putatively severe effects as compared to the distribution of mutations with putatively mild or absent effect. In summary, the DFE plays an important role in predicting
evolutionary dynamics. A variety of approaches have been used to study the DFE, including theoretical, experimental and analytical methods. • Mutagenesis experiment: The direct method to investigate the DFE is to induce mutations and then measure the mutational fitness effects, which has already been done in viruses,
bacteria, yeast, and
Drosophila. For example, most studies of the DFE in viruses used
site-directed mutagenesis to create point mutations and measure relative fitness of each mutant. In
Escherichia coli, one study used
transposon mutagenesis to directly measure the fitness of a random insertion of a derivative of
Tn10. In yeast, a combined mutagenesis and
deep sequencing approach has been developed to generate high-quality systematic mutant libraries and measure fitness in high throughput. However, given that many mutations have effects too small to be detected and that mutagenesis experiments can detect only mutations of moderately large effect, DNA
sequence analysis can provide valuable information about these mutations. [Figure: distribution of fitness effects of mutations in a virus. In this experiment, random mutations were introduced into the virus by site-directed mutagenesis, and the
fitness of each mutant was compared with the ancestral type. A fitness of zero, less than one, one, or more than one indicates that a mutation is lethal, deleterious, neutral, or advantageous, respectively.] By examining DNA sequence differences within and between species, we are able to infer various characteristics of the DFE for neutral, deleterious and advantageous mutations. A later proposal by Hiroshi Akashi suggested a
bimodal model for the DFE, with modes centered around highly deleterious and neutral mutations. Both this model and the neutral theory of molecular evolution agree that the vast majority of novel mutations are neutral or deleterious and that advantageous mutations are rare, which has been supported by experimental results. One example is a study done on the DFE of random mutations in
vesicular stomatitis virus. Like neutral mutations, weakly selected advantageous mutations can be lost due to random genetic drift, but strongly selected advantageous mutations are more likely to be fixed. Knowing the DFE of advantageous mutations may lead to increased ability to predict the evolutionary dynamics. Theoretical work on the DFE for advantageous mutations has been done by
John H. Gillespie and
H. Allen Orr. They proposed that the distribution for advantageous mutations should be
exponential under a wide range of conditions, which, in general, has been supported by experimental studies, at least for strongly selected advantageous mutations. In general, it is accepted that the majority of mutations are neutral or deleterious, with advantageous mutations being rare; however, the proportion of types of mutations varies between species. This indicates two important points: first, the proportion of effectively neutral mutations is likely to vary between species, resulting from dependence on
effective population size; second, the average effect of deleterious mutations varies dramatically between species. Somatic mutations involve cells outside the dedicated reproductive group and are not usually transmitted to descendants. Diploid organisms (e.g., humans) contain two copies of each gene: a paternal and a maternal allele. Based on the occurrence of mutation on each chromosome, mutations may be classified into three types. A
wild type or homozygous non-mutated organism is one in which neither allele is mutated. • A heterozygous mutation is a mutation of only one allele. • A homozygous mutation is an identical mutation of both the paternal and maternal alleles. •
Compound heterozygous mutations, or a genetic compound, consist of two different mutations in the paternal and maternal alleles.
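The classification by allele can be written as a small decision rule. The Python sketch below is illustrative only: `classify_zygosity` is a hypothetical helper, each allele is represented either as `None` (not mutated) or as a string label for its mutation, and the labels used in the examples are arbitrary.

```python
from typing import Optional


def classify_zygosity(paternal: Optional[str], maternal: Optional[str]) -> str:
    """Classify a diploid locus from the mutation carried by each allele.

    None means the allele is not mutated; any string is a mutation label.
    """
    if paternal is None and maternal is None:
        return "wild type"
    if paternal is None or maternal is None:
        return "heterozygous"           # only one allele is mutated
    if paternal == maternal:
        return "homozygous"             # identical mutation on both alleles
    return "compound heterozygous"      # two different mutations


print(classify_zygosity(None, None))          # wild type
print(classify_zygosity("76A>T", None))       # heterozygous
print(classify_zygosity("76A>T", "76A>T"))    # homozygous
print(classify_zygosity("76A>T", "100G>C"))   # compound heterozygous
```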
Germline mutation A germline mutation in the reproductive cells of an individual gives rise to a
constitutional mutation in the offspring, that is, a mutation that is present in every cell. A constitutional mutation can also occur very soon after
fertilization, or continue from a previous constitutional mutation in a parent. A germline mutation can be passed down through subsequent generations of organisms. The distinction between germline and somatic mutations is important in animals that have a dedicated germline to produce reproductive cells. However, it is of little value in understanding the effects of mutations in plants, which lack a dedicated germline. The distinction is also blurred in those animals that
reproduce asexually through mechanisms such as
budding, because the cells that give rise to the daughter organisms also give rise to that organism's germline. A new germline mutation not inherited from either parent is called a
de novo mutation.
Somatic mutation A change in the genetic structure that is not inherited from a parent, and also not passed to offspring, is called a
somatic mutation. With plants, some somatic mutations can be propagated without the need for seed production, for example, by
grafting and stem cuttings. These types of mutations have led to new types of fruits, such as the "Delicious"
apple and the "Washington" navel
orange. Human and mouse
somatic cells have a mutation rate more than ten times higher than the
germline mutation rate for both species; mice have a higher rate of both somatic and germline mutations per
cell division than humans. The disparity in mutation rate between the germline and somatic tissues likely reflects the greater importance of
genome maintenance in the germline than in the soma.
Special classes •
A conditional mutation is a mutation that has a wild-type (or less severe) phenotype under certain "permissive" environmental conditions and a mutant phenotype under certain "restrictive" conditions. For example, a temperature-sensitive mutation can cause cell death at high temperature (restrictive condition), but might have no deleterious consequences at a lower temperature (permissive condition). These mutations are non-autonomous, as their manifestation depends upon the presence of certain conditions, as opposed to other mutations which appear autonomously. The permissive conditions may include
temperature, certain chemicals, or light. Conditional mutations have applications in research as they allow control over gene expression. This is especially useful for studying diseases in adults, by allowing expression after a certain period of growth and thus eliminating the deleterious effect of gene expression seen during stages of development in model organisms. Conditional mutations may also be used in genetic studies associated with ageing, as the expression can be changed after a certain time period in the organism's lifespan. Standard human sequence variant nomenclature should be used by researchers and
DNA diagnostic centers to generate unambiguous mutation descriptions. In principle, this nomenclature can also be used to describe mutations in other organisms. The nomenclature specifies the type of mutation and base or amino acid changes. • Nucleotide substitution (e.g., 76A>T) – The number is the position of the nucleotide from the 5' end; the first letter represents the wild-type nucleotide, and the second letter represents the nucleotide that replaced the wild type. In the given example, the adenine at the 76th position was replaced by a thymine. • If it becomes necessary to differentiate between mutations in
genomic DNA,
mitochondrial DNA, and
RNA, a simple convention is used. For example, if the 100th base of a nucleotide sequence mutated from G to C, then it would be written as g.100G>C if the mutation occurred in genomic DNA, m.100G>C if the mutation occurred in mitochondrial DNA, or r.100g>c if the mutation occurred in RNA. Note that, for mutations in RNA, the nucleotide code is written in lower case. • Amino acid substitution (e.g., D111E) – The first letter is the one letter
code of the wild-type amino acid, the number is the position of the amino acid from the
N-terminus, and the second letter is the one letter code of the amino acid present in the mutation. Nonsense mutations are represented with an X for the second amino acid (e.g. D111X). • Amino acid deletion (e.g., ΔF508) – The Greek letter Δ (
delta) indicates a deletion. The letter refers to the amino acid present in the wild type, and the number is the position from the N-terminus of the amino acid were it to be present as in the wild type.
== Mutation rates ==