Several models try to explain how new cellular functions of genes and their encoded protein products evolve through duplication and divergence. Although each model can explain certain aspects of the evolutionary process, the relative importance of each aspect is still unclear. This page presents only the theoretical models currently discussed in the literature. Review articles on this topic can be found at the bottom. In the following, a distinction is made between explanations for the short-term effects (preservation) of a gene duplication and its long-term outcomes.
==Preservation of gene duplicates==
Since a gene duplication occurs in only one cell, either in a single-celled organism or in the germ cell of a multicellular organism, its carrier (i.e. the organism) usually has to compete against other organisms that do not carry the duplication. If the duplication disrupts the normal functioning of an organism, the organism has reduced reproductive success (or low fitness) compared to its competitors and will most likely die out rapidly. If the duplication has no effect on fitness, it might be maintained in a certain proportion of a population. In some cases, the duplication of a particular gene might be immediately beneficial, providing its carrier with a fitness advantage.
===Dosage effect or gene amplification===
The so-called 'dosage' of a gene refers to the amount of mRNA transcripts, and subsequently translated protein molecules, produced from a gene per unit time and per cell. If the amount of gene product is below its optimal level, two kinds of mutations can increase dosage: increases in gene expression through promoter mutations, and increases in gene copy number through gene duplication. The more copies of the same (duplicated) gene a cell has in its genome, the more gene product can be produced simultaneously. Assuming that no regulatory feedback loops exist that automatically down-regulate gene expression, the amount of gene product (or gene dosage) will increase with each additional gene copy until some upper limit is reached or sufficient gene product is available. Furthermore, under positive selection for increased dosage, a duplicated gene could be immediately advantageous and quickly increase in frequency in a population. In this case, no further mutations would be necessary to preserve (or retain) the duplicates. However, such mutations could still occur at a later time, leading to genes with different functions (see below). Gene dosage effects after duplication can also be harmful to a cell, and the duplication might therefore be selected against. For instance, when the metabolic network within a cell is fine-tuned so that it can only tolerate a certain amount of a particular gene product, gene duplication would upset this balance.
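The relationship between copy number and dosage described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that every gene copy contributes the same amount of product, that there is no regulatory feedback, and that the cell tolerates product levels within a fixed window around an optimum. The function names and the parameters `per_copy_output`, `upper_limit`, `optimum` and `tolerance` are hypothetical and not taken from the literature.

```python
def gene_product(copies, per_copy_output=1.0, upper_limit=None):
    """Total gene product for a given copy number.

    Assumption: each copy contributes equally and no feedback loop
    down-regulates expression. An optional upper limit caps the total.
    """
    if copies < 0:
        raise ValueError("copy number cannot be negative")
    total = copies * per_copy_output
    if upper_limit is not None:
        total = min(total, upper_limit)
    return total


def dosage_effect(copies, optimum, tolerance, per_copy_output=1.0):
    """Classify the dosage relative to a hypothetical tolerated window."""
    product = gene_product(copies, per_copy_output)
    if product < optimum - tolerance:
        return "below optimum"       # duplication could be beneficial
    if product > optimum + tolerance:
        return "above tolerated range"  # duplication could be harmful
    return "within tolerated range"


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, gene_product(n), dosage_effect(n, optimum=2.0, tolerance=0.5))
```

With these illustrative numbers, a single copy leaves the cell below the optimum, so a duplication would be favoured, whereas a further jump to four copies overshoots the tolerated range and would be selected against.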
===Activity-reducing mutations===
When a gene duplication has no immediate fitness effect, the duplicate copy could still be retained if both copies accumulate mutations that, for instance, reduce the functional efficiency of the encoded proteins without abolishing the function altogether. In such a case, the molecular function (e.g. protein or enzyme activity) would still be available to the cell to at least the extent that was available before duplication, now provided by proteins expressed from two gene loci instead of one. However, the accidental loss of one gene copy might then be detrimental, since a single copy with reduced activity would almost certainly provide less activity than was available before duplication.
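The arithmetic behind this argument can be made explicit with a small sketch. It assumes that the activities of the individual copies simply add up, and it uses made-up activity values (1.0 for the ancestral gene, 0.6 for each mutated copy). The helper names are hypothetical.

```python
def total_activity(copy_activities):
    """Total activity, assuming the contributions of all copies add up."""
    return sum(copy_activities)


def loss_is_detrimental(copy_activities, required_activity):
    """Return True if losing any single copy would drop the total
    activity below the level the cell requires."""
    total = total_activity(copy_activities)
    return any(total - a < required_activity for a in copy_activities)


if __name__ == "__main__":
    required = 1.0                 # activity available before duplication
    fresh_duplicates = [1.0, 1.0]  # directly after duplication
    degraded_duplicates = [0.6, 0.6]  # after activity-reducing mutations

    # Directly after duplication, either copy can be lost without harm.
    print(loss_is_detrimental(fresh_duplicates, required))
    # After both copies have degraded, both are needed to reach 1.0.
    print(total_activity(degraded_duplicates) >= required)
    print(loss_is_detrimental(degraded_duplicates, required))
```

In this toy setting, the two degraded copies together still meet the pre-duplication level, but neither copy does so alone, which is why both would be retained.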
==Long-term fate of duplicated genes==
If a gene duplication is preserved, the most likely fate is that random mutations in one duplicate gene copy will eventually cause the gene to become non-functional. Such non-functional remnants of genes, with detectable sequence homology, can sometimes still be found in genomes and are called pseudogenes.
Functional divergence between the duplicate genes is another possible fate. There are several theoretical models that try to explain the mechanisms leading to divergence:
===Neofunctionalization===
The term neofunctionalization was coined by Force et al. 1999, but it refers to the general mechanism proposed by Ohno 1970.
===IAD model===
IAD stands for 'innovation, amplification, divergence'. The model aims to explain how a gene can evolve new functions while preserving its existing ones. Innovation, i.e. the establishment of a new molecular function, can occur via side-activities of genes and thus of their proteins; this is called enzyme promiscuity. For example, enzymes can sometimes catalyse more than one reaction, even though they are usually optimised for catalysing just one. Such promiscuous protein functions, if they provide an advantage to the host organism, can then be amplified with additional copies of the gene. Such rapid amplification is best known from bacteria, which often carry certain genes on smaller non-chromosomal DNA molecules (called plasmids) that are capable of rapid replication. Any gene on such a plasmid is also replicated, and the additional copies amplify the expression of the encoded proteins, and with it any promiscuous function. After several such copies have been made and passed on to descendant bacterial cells, a few of these copies might accumulate mutations that eventually lead to a side-activity becoming the main activity.

The IAD model has been tested in the laboratory using a bacterial enzyme with dual function as the starting point. This enzyme catalyses not only its original reaction but also a side reaction that can be carried out by another enzyme. By allowing bacteria carrying this enzyme to evolve for several generations under selection to improve both activities (original and side), it was shown that one ancestral bifunctional gene with poor activities (innovation) evolved first by gene amplification, which increased expression of the poor enzyme, and later accumulated beneficial mutations that improved one or both of the activities and that can be passed on to the next generation (divergence).

===EAC model===
The evolutionary process described by the EAC (escape from adaptive conflict) model actually begins before the gene duplication event. A singleton (not duplicated) gene evolves towards two beneficial functions simultaneously. This creates an "adaptive conflict" for the gene, since it is unlikely to execute each individual function with maximum efficiency.
The intermediate evolutionary result could be a multi-functional gene, and after a gene duplication its sub-functions could be carried out by specialised descendants of the gene. The result would be the same as under the DDC (duplication-degeneration-complementation) model: two functionally specialised genes (paralogs). In contrast to the DDC model, the EAC model puts more emphasis on the multi-functional pre-duplication state of the evolving genes and gives a slightly different explanation of why the duplicated multi-functional genes would benefit from additional specialisation after duplication (namely, the adaptive conflict of the multi-functional ancestor needs to be resolved). EAC assumes that positive selection drives evolution after gene duplication, whereas the DDC model only requires neutral ("undirected") evolution, i.e. degeneration and complementation.
==See also==