
Ancestral sequence reconstruction

Ancestral sequence reconstruction (ASR) – also known as ancestral gene/sequence reconstruction/resurrection – is a technique used in the study of molecular evolution. The method uses related sequences to reconstruct an "ancestral" gene from a multiple sequence alignment.

Principles
ASR is based on the observation that closely related species have similar DNA sequences (see Figure 2). For instance, if two species differ at one nucleotide, e.g. A in humans and G in chimpanzees, we can safely assume that their common ancestor had either A or G and that this nucleotide mutated in one of the lineages ("safely" because it is statistically very unlikely that a nucleotide would mutate and then mutate back again). How can we determine whether the ancestor had an A or a G? We look at one or more outgroups. If gorillas and orangutans both have an A, it is safe to assume that this was the ancestral nucleotide, and that the mutation A→G happened in the lineage leading to chimps (see Figures 1 and 2).

Experimental verification
Most ASR studies are conducted in vitro, and have revealed ancestral protein properties that seem to be evolutionarily desirable traits, such as increased thermostability, catalytic activity and catalytic promiscuity. These findings have been attributed both to artifacts of the ASR algorithms and to genuine reflections of ancient Earth's environment, so ASR research must often be complemented with extensive controls (usually alternate ASR experiments) to mitigate algorithmic error. Not all studied ASR proteins exhibit this so-called 'ancestral superiority'.

The nascent field of 'evolutionary biochemistry' has been bolstered by the recent increase in ASR studies that use ancestors to probe organismal fitness within certain cellular contexts, effectively testing ancestral proteins in vivo. Very few such in vivo studies have been conducted, because of inherent limitations: the lack of suitably ancient genomes into which these ancestors can be placed, the small repertoire of well-characterized laboratory model systems, and the inability to mimic ancient cellular environments.
Despite these obstacles, a 2015 paper offered preliminary insight into this avenue of research: the 'ancestral superiority' observed in vitro for a given protein was not recapitulated in vivo. ASR is one of the few ways to study the biochemistry of the Precambrian era of life (>541 Ma) and is hence often used in 'paleogenetics'; indeed, Zuckerkandl and Pauling originally intended ASR to be the starting point of a field they termed 'Paleobiochemistry'.
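The outgroup reasoning described at the start of this section can be sketched in code. The following is a minimal illustration using a simplified Fitch parsimony pass on a single alignment column. The tree shape, the function names and the tie-breaking rule are assumptions made for this example, not part of any particular ASR package.

```python
# Simplified Fitch parsimony for one alignment column (illustrative sketch).
# Trees are nested 2-tuples of leaf names; all names here are hypothetical.

def bottom_up(node, leaf_states, sets):
    """Post-order pass: compute the candidate state set for every node."""
    if isinstance(node, str):
        sets[node] = {leaf_states[node]}
    else:
        left, right = node
        a = bottom_up(left, leaf_states, sets)
        b = bottom_up(right, leaf_states, sets)
        # Intersection if the children agree on something, otherwise union.
        sets[node] = (a & b) or (a | b)
    return sets[node]

def top_down(node, parent_state, sets, assigned):
    """Pre-order pass: keep the parent's state whenever it is a candidate."""
    candidates = sets[node]
    if parent_state in candidates:
        state = parent_state
    else:
        state = sorted(candidates)[0]  # arbitrary but deterministic tie-break
    assigned[node] = state
    if not isinstance(node, str):
        for child in node:
            top_down(child, state, sets, assigned)

def reconstruct(tree, leaf_states):
    """Return one most-parsimonious state for every node in the tree."""
    sets, assigned = {}, {}
    bottom_up(tree, leaf_states, sets)
    top_down(tree, None, sets, assigned)
    return assigned

# The example from the text: human A, chimp G, outgroups A.
human_chimp = ("human", "chimp")
tree = ((human_chimp, "gorilla"), "orangutan")
site = {"human": "A", "chimp": "G", "gorilla": "A", "orangutan": "A"}
states = reconstruct(tree, site)
print(states[human_chimp])  # state inferred for the human-chimp ancestor
```

Without the outgroups, the candidate set at the human-chimp node would stay ambiguous between A and G. The outgroups resolve it.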
Methodology
Several related homologues of the protein of interest are first aligned in a multiple sequence alignment (MSA), then a 'phylogenetic tree' is constructed with inferred sequences at the nodes of the branches. It is these sequences that are the so-called 'ancestors'. Ancestral sequences are typically calculated by maximum likelihood (ML), although Bayesian methods are also used. Because the ancestors are inferred from a phylogeny, the topology and composition of the phylogeny play a major role in the output ASR sequences. These sequences are then compared, and often several (~10) are expressed and studied per phylogenetic node.

ASR does not claim to recreate the actual sequence of the ancient protein/DNA, but rather a sequence that is likely to be similar to the one that was at the node. This is not considered a shortcoming of ASR, as it fits the 'neutral network' model of protein evolution, whereby at evolutionary junctions (nodes) a population of genotypically different but phenotypically similar protein sequences existed in the organismal population of the time. Hence, ASR may generate one of the sequences in a node's neutral network: while it may not represent the genotype of the last common ancestor of the modern-day sequences, it likely represents the phenotype.

For the trend of increasing thermostability, one explanation is that ML ASR creates a consensus sequence of several different, parallel mechanisms that evolved to confer minor protein thermostability throughout the phylogeny, leading to an additive effect that results in 'superior' ancestral thermostability. Experimental validation, namely the expression of consensus sequences and parallel ASR via non-ML methods, is often required to rule out this explanation for a given experiment.
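To make the maximum-likelihood step concrete, the sketch below computes the posterior probability of each ancestral base at one site under the Jukes-Cantor (JC69) substitution model. It assumes a deliberately simplified case, which is an assumption of this example rather than how ASR software is structured: a single internal node whose neighbours are all observed leaves. The branch lengths are hypothetical. Real analyses propagate likelihoods across the whole tree and use richer nucleotide or amino acid models.

```python
import math

BASES = "ACGT"

def jc69_prob(from_base, to_base, t):
    """JC69 transition probability over a branch of length t
    (t in expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if from_base == to_base else 0.25 - 0.25 * e

def node_posterior(observed, branch_lengths):
    """Posterior over the ancestral base at a node directly connected
    to every observed leaf (simplified star-tree assumption)."""
    weights = {}
    for anc in BASES:
        like = 0.25  # uniform prior, the JC69 stationary frequency
        for leaf, base in observed.items():
            like *= jc69_prob(anc, base, branch_lengths[leaf])
        weights[anc] = like
    total = sum(weights.values())
    return {base: w / total for base, w in weights.items()}

# Hypothetical site and branch lengths, for illustration only.
observed = {"human": "A", "chimp": "G", "gorilla": "A"}
branch_lengths = {"human": 0.1, "chimp": 0.1, "gorilla": 0.1}

posterior = node_posterior(observed, branch_lengths)
best = max(posterior, key=posterior.get)
print(best, posterior)
```

The ML reconstruction takes the highest-posterior state at each site. The remaining probability mass is what makes alternative reconstructions of the same node worth testing.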
One other concern raised by ML methods is that the scoring matrices are derived from modern sequences, and the amino acid frequencies seen today may not be the same as in Precambrian biology, resulting in skewed sequence inference. Several studies have attempted to construct ancient scoring matrices via various methodologies and have compared the resultant sequences and their proteins' biophysical properties. While these modified matrices yield somewhat different ASR sequences, the observed biophysical properties did not seem to vary beyond experimental error.

Because of the 'holistic' nature of ASR and the complexity that arises when one considers all the possible sources of experimental error, the experimental community considers the ultimate measure of ASR reliability to be the comparison of several alternate ASR reconstructions of the same node and the identification of similar biophysical properties. While this approach does not offer a robust statistical or mathematical measure of reliability, it builds on the fundamental idea used in ASR that individual amino acid substitutions do not cause significant changes in a protein's biophysical properties, a tenet that must hold in order to overcome the effect of inference ambiguity.

Candidates for ASR are often selected based on the particular property of interest being studied, e.g. thermostability.
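One way to obtain an alternate reconstruction of the same node is to start from the per-site posterior probabilities and swap in the second-best state wherever it is plausible. The sketch below illustrates that idea. The 0.2 cutoff, the function name and the posterior values are assumptions chosen for this example, and the source does not prescribe this particular procedure.

```python
def ml_and_alternative(site_posteriors, threshold=0.2):
    """Build the ML sequence and one alternative sequence for a node.

    site_posteriors: list of dicts mapping residue -> posterior probability.
    At every site whose second-best residue has posterior >= threshold,
    the alternative sequence uses that second-best residue instead.
    """
    ml, alt, ambiguous_sites = [], [], []
    for index, posterior in enumerate(site_posteriors):
        ranked = sorted(posterior.items(), key=lambda kv: kv[1], reverse=True)
        best_residue = ranked[0][0]
        ml.append(best_residue)
        if len(ranked) > 1 and ranked[1][1] >= threshold:
            alt.append(ranked[1][0])
            ambiguous_sites.append(index)
        else:
            alt.append(best_residue)
    return "".join(ml), "".join(alt), ambiguous_sites

# Hypothetical per-site posteriors for a four-residue stretch.
posteriors = [
    {"M": 0.99, "L": 0.01},
    {"K": 0.55, "R": 0.45},
    {"V": 0.70, "I": 0.25, "L": 0.05},
    {"D": 0.90, "E": 0.10},
]

ml_seq, alt_seq, ambiguous = ml_and_alternative(posteriors)
print(ml_seq, alt_seq, ambiguous)
```

If the ML and alternative proteins show similar biophysical properties when expressed, the conclusions drawn for that node are less likely to depend on ambiguous sites.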
Resurrected proteins
There are many examples of ancestral proteins that have been computationally reconstructed, expressed in living cell lines, and, in many cases, purified and biochemically studied.
• The Thornton lab notably resurrected several ancestral hormone receptors (from about 500 Ma) and collaborated with the Stevens lab to resurrect ancient V-ATPase subunits from yeast (800 Ma).
• The Marqusee lab has recently published several studies concerning the evolutionary biophysical history of E. coli Ribonuclease H1.
Some other examples are ancestral visual pigments in vertebrates; enzymes in yeast that break down sugars (800 Ma); enzymes in bacteria that provide resistance to antibiotics (2 to 3 Ga); the ribonucleases involved in ruminant digestion; and the alcohol dehydrogenases (Adhs) involved in yeast fermentation (~85 Ma).

The 'age' of a reconstructed sequence is determined using a molecular clock model, and often several are employed. This dating technique is often calibrated using geological time-points (such as ancient ocean constituents or banded iron formations, BIFs). While these clocks offer the only method of inferring a very ancient protein's age, they have sweeping error margins and are difficult to defend against contrary data. For this reason, ASR 'age' should only be used as an indicative feature, and it is often bypassed altogether in favour of the number of substitutions between the ancestral and the modern sequences (the basis on which the clock is calculated).

Thioredoxin
One example is the reconstruction of thioredoxin enzymes from organisms up to 4 billion years old. Whereas the chemical activity of these reconstructed enzymes was remarkably similar to that of modern enzymes, their physical properties showed significantly elevated thermal and acidic stability. These results were interpreted as suggesting that ancient life may have evolved in oceans that were much hotter and more acidic than today.
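The substitution count mentioned above as an alternative to a clock-derived 'age' is straightforward to compute from an aligned ancestral and modern sequence. The sketch below counts differing positions and, for nucleotide data only, applies the standard Jukes-Cantor correction for multiple substitutions at the same site. The sequences and function names are invented for illustration.

```python
import math

def count_substitutions(ancestral, modern, gap="-"):
    """Count differing positions between two aligned sequences.

    Columns where either sequence has a gap are skipped.
    Returns (differences, compared_positions).
    """
    if len(ancestral) != len(modern):
        raise ValueError("sequences must be aligned to the same length")
    differences = compared = 0
    for a, m in zip(ancestral, modern):
        if a == gap or m == gap:
            continue
        compared += 1
        if a != m:
            differences += 1
    return differences, compared

def jc69_distance(p):
    """Jukes-Cantor corrected distance for a proportion p of differing
    nucleotide sites (only defined for p < 0.75)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# Hypothetical aligned nucleotide sequences.
ancestral_seq = "ATGGCCAAGT"
modern_seq = "ATGGCTAAGC"

diffs, compared = count_substitutions(ancestral_seq, modern_seq)
p = diffs / compared
print(diffs, compared, p, jc69_distance(p))
```

The corrected distance is always at least as large as the raw proportion, because some sites will have changed more than once along the branch.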
Significance
These experiments address various important questions in evolutionary biology: does evolution proceed in small steps or in large leaps; is evolution reversible; how does complexity evolve?

It has been shown that slight mutations in the amino acid sequence of hormone receptors cause an important change in their preferences for hormones. These changes represent huge steps in the evolution of the endocrine system. Thus very small changes at the molecular level may have enormous consequences. By studying the glucocorticoid receptor, the Thornton lab has also been able to show that evolution is irreversible. This receptor was changed into a cortisol receptor by seven mutations, but reversing these mutations did not give the original receptor back. This indicates that epistasis plays a major role in protein evolution, an observation that, in combination with several observed examples of parallel evolution, supports the neutral network model mentioned above.

These different experiments on receptors show that, during their evolution, proteins become greatly differentiated, and this explains how complexity may evolve. A closer look at the different ancestral hormone receptors and the various hormones shows that differences in the interactions between single amino acid residues and chemical groups of the hormones arise through very small but specific changes. Knowledge about these changes may, for example, lead to the synthesis of hormonal equivalents capable of mimicking or inhibiting the action of a hormone, which might open possibilities for new therapies.

Given that ASR has revealed a tendency towards ancient thermostability and enzymatic promiscuity, ASR is a valuable tool for protein engineers, who often desire these traits (its effects are sometimes greater than those of current, rationally led tools). ASR also promises to 'resurrect' phenotypically similar 'ancient organisms', which in turn would allow evolutionary biochemists to probe the story of life.
Proponents of ASR such as Benner state that, through these and other experiments, the end of the current century will see a level of understanding in biology analogous to the one that arose in classical chemistry in the last century.