
Repeated sequence (DNA)

Repeated sequences are short or long patterns that occur in multiple copies throughout the genome. In many organisms, a significant fraction of the genomic DNA is repetitive, with over two-thirds of the sequence consisting of repetitive elements in humans. Some of these repeated sequences are necessary for maintaining important genome structures such as telomeres or centromeres.

History
In the 1950s, Barbara McClintock first observed DNA transposition and illustrated the functions of the centromere and telomere at the Cold Spring Harbor Symposium. McClintock's work set the stage for the discovery of repeated sequences because transposition, centromere structure, and telomere structure all depend on repetitive elements, although this was not fully understood at the time. The term "repeated sequence" was first used by Roy John Britten and D. E. Kohne in 1968; through experiments on the reassociation of DNA, they found that repetitive DNA made up more than half of eukaryotic genomes. Although the repetitive DNA sequences were conserved and ubiquitous, their biological role was still unknown. In the 1990s, more research was conducted to elucidate the evolutionary dynamics of minisatellite and microsatellite repeats because of their importance in DNA-based forensics and molecular ecology. Dispersed DNA repeats were increasingly recognized as a potential source of genetic variation and regulation. The discovery of diseases linked to deleterious repetitive DNA stimulated further interest in this area of study. In the 2000s, data from full eukaryotic genome sequencing enabled the identification of different promoters, enhancers, and regulatory RNAs, all of which are encoded by repetitive regions. Today, the structural and regulatory roles of repetitive DNA sequences remain an active area of research.
Types and functions
Many repeat sequences are likely to be non-functional, decaying remnants of transposable elements; these have been labelled "junk" or "selfish" DNA. Nevertheless, some repeats are occasionally exapted for other functions.

Tandem repeats
Tandem repeats are repeated sequences which are directly adjacent to each other in the genome. Tandem repeats may vary in the number of nucleotides comprising the repeated sequence, as well as in the number of times the sequence repeats. When the repeating sequence is only 2–10 nucleotides long, the repeat is referred to as a short tandem repeat (STR) or microsatellite. When the repeating sequence is 10–60 nucleotides long, the repeat is referred to as a minisatellite. For minisatellites and microsatellites, the number of times the sequence repeats at a single locus can range from twice to hundreds of times.

Tandem repeats have a wide variety of biological functions in the genome. For example, minisatellites are often hotspots of meiotic homologous recombination in eukaryotic organisms. Recombination occurs when two homologous chromosomes align, break, and rejoin to swap pieces. It is important as a source of genetic diversity, as a mechanism for repairing damaged DNA, and as a necessary step in the appropriate segregation of chromosomes in meiosis. Tandem repeats at telomeres fold into highly organized G-quadruplex structures which protect the ends of chromosomal DNA from degradation. Repetitive elements are enriched in the middle of chromosomes as well. Centromeres are the highly compact regions of chromosomes which join sister chromatids together and also allow the mitotic spindle to attach and separate sister chromatids during cell division. Human centromeres are composed of a 171 base pair tandem repeat named the α-satellite repeat.

Some repetitive sequences, such as those with the structural roles discussed above, are necessary for proper biological functioning. Other tandem repeats have deleterious roles which drive diseases.
Many other tandem repeats, however, have unknown or poorly understood functions.

Interspersed repeats
Interspersed repeats are identical or similar DNA sequences which are found in different locations throughout the genome. Interspersed repeats are distinguished from tandem repeats in that the repeated sequences are not directly adjacent to each other but instead may be scattered among different chromosomes or far apart on the same chromosome. Most interspersed repeats are transposable elements (TEs), mobile sequences which can be "cut and pasted" or "copied and pasted" into different places in the genome. TEs were originally called "jumping genes" for their ability to move, yet this term is somewhat misleading as not all TEs are discrete genes. Transposable elements that are transcribed into RNA, reverse-transcribed into DNA, then reintegrated into the genome are called retrotransposons. Short interspersed nuclear elements (SINEs) are typically 100–300 base pairs and no longer than 600 base pairs. Since uncontrolled propagation of TEs could wreak havoc on the genome, many regulatory mechanisms have evolved to silence their spread, including DNA methylation, histone modifications, non-coding RNAs (ncRNAs) including small interfering RNA (siRNA), chromatin remodelers, histone variants, and other epigenetic factors. Furthermore, TEs contribute to regulating the expression of other genes by serving as distal enhancers and transcription factor binding sites. The prevalence of interspersed elements in the genome has prompted further research into their origins and functions. Some specific interspersed elements have been characterized, such as the Alu repeat and LINE1.

Intrachromosomal recombination
Homologous recombination between chromosomal repeated sequences in somatic cells of Nicotiana tabacum was found to be increased by exposure to mitomycin C, a bifunctional alkylating agent that crosslinks DNA strands.
This increase in recombination was attributed to increased intrachromosomal recombinational repair.
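The size classes given for tandem repeats earlier in this section can be illustrated with a short Python sketch. It is only a toy: the function names are hypothetical, the regular expression finds exact (uninterrupted) repeats only, and because the text assigns a unit length of exactly 10 nucleotides to both classes, the sketch arbitrarily counts it as a microsatellite.

```python
import re

def classify_repeat_unit(unit):
    """Label a repeat unit using the length ranges quoted in the text.

    Assumption: a unit of exactly 10 nt (which falls in both quoted
    ranges) is treated as a microsatellite here.
    """
    n = len(unit)
    if 2 <= n <= 10:
        return "microsatellite"
    if 10 < n <= 60:
        return "minisatellite"
    return "other"

# A unit of 2-60 bases (shortest unit tried first) followed by one or
# more exact copies of itself, i.e. at least two adjacent copies.
_TANDEM = re.compile(r"([ACGT]{2,60}?)\1+")

def find_tandem_repeats(sequence):
    """Return (start, unit, copies) for each exact tandem repeat found."""
    hits = []
    for match in _TANDEM.finditer(sequence.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits

# Made-up example locus: six copies of "CA" between short flanks.
example = "TTG" + "CA" * 6 + "GTT"
repeats = find_tandem_repeats(example)
labels = [classify_repeat_unit(unit) for _, unit, _ in repeats]
```

Real repeat annotation tools also tolerate mismatches and insertions within the repeat, which this exact-match sketch does not attempt.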
Evolutionary emergence of meiosis
The evolutionary origin of meiotic sexual reproduction is regarded as a long-standing evolutionary enigma. In prokaryotes, lateral gene transfer emerged as an early evolved form of sexual interaction. However, repeat sequences in prokaryotic DNA limit the effectiveness of lateral gene transfer at purging deleterious mutations, and also limit the accurate repair of DNA damage by homologous recombination. Colnoghi et al. proposed that such constraints on the beneficial effects of sexual interaction in prokaryotes favored the evolution of meiotic sex and thus the emergence of eukaryotes. They concluded that the transition to homologous pairing along linear chromosomes that occurs during meiosis was the crucial innovation in meiotic sexual reproduction, and that this innovation was instrumental in the evolutionary expansion of eukaryotic genomes that facilitated increased functional and morphological complexity.
Repeated sequences in human disease
In humans, some repeated DNA sequences are associated with diseases. Specifically, tandem repeat sequences underlie several human disease conditions, particularly trinucleotide repeat diseases such as Huntington's disease, fragile X syndrome, several spinocerebellar ataxias, myotonic dystrophy and Friedreich's ataxia. Trinucleotide repeat expansions in the germline over successive generations can lead to increasingly severe manifestations of the disease. These trinucleotide repeat expansions may occur through strand slippage during DNA replication or during DNA repair synthesis. Faulty repair of DNA damage in repeat sequences may cause further expansion of these sequences, thus setting up a vicious cycle of pathology.

Huntington's disease
The huntingtin protein plays a role in preventing apoptosis, otherwise known as cell death, and in the repair of oxidative DNA damage. In Huntington's disease, the expanded trinucleotide sequence CAG encodes a mutant huntingtin protein with an expanded polyglutamine domain. This domain causes the protein to form aggregates in nerve cells, preventing normal cellular function and resulting in neurodegeneration.

Fragile X syndrome
Fragile X syndrome is caused by the expansion of the DNA sequence CGG in the FMR1 gene on the X chromosome. This gene produces the RNA-binding protein FMRP. In fragile X syndrome, the expanded repeat makes the gene unstable and leads to silencing of FMR1. Because the gene resides on the X chromosome, females, who have two X chromosomes, are less affected than males, who have only one X chromosome and one Y chromosome: the second X chromosome can compensate for the silencing of the gene on the other.

Spinocerebellar ataxias
CAG trinucleotide repeat expansions underlie several types of spinocerebellar ataxia (SCA1, SCA2, SCA3, SCA6, SCA7, SCA12 and SCA17).
Similar to Huntington's disease, the polyglutamine tract created by this trinucleotide expansion causes aggregation of proteins, preventing normal cellular function and causing neurodegeneration.

Friedreich's ataxia
Friedreich's ataxia is a type of ataxia caused by an expanded GAA repeat sequence in the frataxin gene (FXN). The frataxin gene is responsible for producing the frataxin protein, a mitochondrial protein involved in energy production and cellular respiration. The expanded GAA sequence lies in the first intron and results in silencing of the gene, and therefore in a loss of frataxin function. The loss of a functional FXN gene leads to problems with mitochondrial functioning as a whole and can present phenotypically in patients as difficulty walking.

Myotonic dystrophy
Myotonic dystrophy is a disorder that presents as muscle weakness and consists of two main types: DM1 and DM2. Both types of myotonic dystrophy are due to expanded DNA sequences. In DM1 the expanded sequence is CTG, while in DM2 it is CCTG. These two sequences are found in different genes: the expanded sequence in DM1 is found in the DMPK gene and the expanded sequence in DM2 in the ZNF9 gene. Unlike the CAG expansion in Huntington's disease, these expanded repeats lie in non-coding regions of the genes and are not translated into protein. Instead, the repeat sequences in DM1 and DM2 have been linked to RNA toxicity.

Amyotrophic lateral sclerosis and frontotemporal dementia
Not all diseases caused by repeated DNA sequences are trinucleotide repeat diseases. Amyotrophic lateral sclerosis and frontotemporal dementia can be caused by hexanucleotide GGGGCC repeat sequences in the C9orf72 gene, which cause RNA toxicity that leads to neurodegeneration.
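Since these disorders are defined by how many times a motif is repeated at a locus, a minimal Python sketch of counting the longest uninterrupted run of a motif may help make the idea concrete. The function name, the motif table and the example sequence are hypothetical illustrations; the table lists only a few of the motifs named in this section, and the sketch makes no claim about which repeat counts are pathogenic.

```python
import re

# A few of the repeat motifs named in this section (illustrative only).
EXAMPLE_MOTIFS = {
    "Huntington's disease": "CAG",
    "Friedreich's ataxia": "GAA",
    "Myotonic dystrophy type 1": "CTG",
    "Myotonic dystrophy type 2": "CCTG",
    "ALS / frontotemporal dementia": "GGGGCC",
}

def longest_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif."""
    pattern = re.compile("(?:%s)+" % re.escape(motif))
    runs = [len(m.group(0)) // len(motif)
            for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)

# Made-up sequence: a run of 5 CAG units, an interrupting CAA,
# then a second run of 3 CAG units.
toy_sequence = "TT" + "CAG" * 5 + "CAA" + "CAG" * 3 + "GG"
cag_copies = longest_run(toy_sequence, EXAMPLE_MOTIFS["Huntington's disease"])
```

Counting only uninterrupted copies is a design choice for this sketch; an interrupted tract such as the one in the toy sequence is reported as its longest pure stretch.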
Biotechnology
Repetitive DNA is hard to sequence using next-generation sequencing techniques because sequence assembly from short reads cannot determine the length of a repetitive region that is longer than the reads themselves. This issue is particularly serious for microsatellites, which are made of short 1–6 bp repeat units. Although they are difficult to sequence, these short repeats have great value in DNA fingerprinting and evolutionary studies. Many researchers have historically left out repetitive sequences when analyzing and publishing whole-genome data due to technical limitations.

Bustos et al. proposed one method of sequencing long stretches of repetitive DNA. The method combines the use of a linear vector for stabilization with exonuclease III for the deletion of regions rich in continuing simple sequence repeats (SSRs). First, SSR-rich fragments are cloned into a linear vector that can stably incorporate tandem repeats of up to 30 kb. Expression of the repeats is prevented by the transcriptional terminators in the vector. The second step involves the use of exonuclease III. The enzyme removes nucleotides from the 3' end, which produces unidirectional deletions of the SSR fragments. Finally, the clones carrying deleted fragments are multiplied and analyzed with colony PCR. The sequence is then built by ordered sequencing of a set of clones containing different deletions.
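The difficulty of sizing a repeat from short reads can be seen in a small Python sketch. Everything in it is an assumption made for illustration: the flanking sequences, the CA repeat unit, the copy numbers and the read lengths are invented, and reads are idealized as error-free substrings covering every position. Under those assumptions, two alleles with different repeat lengths yield exactly the same set of reads whenever the reads are shorter than the repeat array, and differ only once reads are long enough to span the repeat and both flanks.

```python
def read_set(sequence, read_length):
    """Return every distinct substring of the given length.

    This idealizes sequencing as error-free reads starting at every
    possible position.
    """
    return {sequence[i:i + read_length]
            for i in range(len(sequence) - read_length + 1)}

def locus(copies, unit="CA", left="GATTACAGGT", right="TTGACCGATG"):
    """Build a toy locus: left flank + repeat array + right flank."""
    return left + unit * copies + right

short_allele = locus(20)  # 40 bp repeat array
long_allele = locus(30)   # 60 bp repeat array

# 30 bp reads never span either repeat array, so both alleles give
# the same collection of distinct reads.
reads_30_same = read_set(short_allele, 30) == read_set(long_allele, 30)

# 50 bp reads can span the 40 bp array together with both flanks,
# so the read sets now differ between the two alleles.
reads_50_same = read_set(short_allele, 50) == read_set(long_allele, 50)

print(reads_30_same, reads_50_same)
```

In practice read depth can hint at copy number, but the sketch shows why read content alone does not fix the length of a repeat longer than the reads.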