s shown as dotted lines. Each end of the double helix has an exposed
5′ phosphate on one strand and an exposed
3′ hydroxyl group (—OH) on the other. DNA is a long
polymer made from repeating units called
nucleotides. The structure of DNA is dynamic along its length, being capable of coiling into tight loops and other shapes. In all species it is composed of two helical chains, bound to each other by
hydrogen bonds. Both chains are coiled around the same axis, and have the same
pitch of 34 ångströms (3.4 nm). The pair of chains has a radius of 10 Å (1.0 nm). According to another study, when measured in a different solution, the DNA chain measured 22–26 Å (2.2–2.6 nm) wide, and one nucleotide unit measured 3.3 Å (0.33 nm) long. The buoyant density of most DNA is 1.7 g/cm³. DNA does not usually exist as a single strand, but instead as a pair of strands that are held tightly together. These two long strands coil around each other, in the shape of a
double helix. The nucleotide contains both a segment of the
backbone of the molecule (which holds the chain together) and a
nucleobase (which interacts with the other DNA strand in the helix). A nucleobase linked to a sugar is called a
nucleoside, and a base linked to a sugar and to one or more phosphate groups is called a
nucleotide. A
biopolymer comprising multiple linked nucleotides (as in DNA) is called a
polynucleotide. The backbone of the DNA strand is made from alternating
phosphate and
sugar groups. The sugar in DNA is
2-deoxyribose, which is a
pentose (five-
carbon) sugar. The sugars are joined by phosphate groups that form
phosphodiester bonds between the third and fifth carbon
atoms of adjacent sugar rings. These are known as the
3′-end (three prime end), and
5′-end (five prime end) carbons, the prime symbol being used to distinguish these carbon atoms from those of the base to which the deoxyribose forms a
glycosidic bond. The DNA double helix is stabilized primarily by two forces:
hydrogen bonds between nucleotides and
base-stacking interactions among
aromatic nucleobases. The four bases found in DNA are
adenine (A),
cytosine (C),
guanine (G) and
thymine (T). These four bases are attached to the sugar-phosphate to form the complete nucleotide, as shown for
adenosine monophosphate. Adenine pairs with thymine and guanine pairs with cytosine, forming A-T and G-C
base pairs.
=== Nucleobase classification ===
The nucleobases are classified into two types: the
purines, A and G, which are fused five- and six-membered
heterocyclic compounds, and the
pyrimidines, the six-membered rings C and T.
=== Non-canonical bases ===
Modified bases occur in DNA. The first of these recognized was
5-methylcytosine, which was found in the
genome of
Mycobacterium tuberculosis in 1925. The reason for the presence of these noncanonical bases in bacterial viruses (
bacteriophages) is to avoid the
restriction enzymes present in bacteria. This enzyme system acts at least in part as a molecular immune system protecting bacteria from infection by viruses. Modifications of the bases cytosine and adenine, the more common and modified DNA bases, play vital roles in the
epigenetic control of gene expression in plants and animals. A number of noncanonical bases are known to occur in DNA. Most of these are modifications of the canonical bases plus uracil.
• Modified Adenine
  • N6-carbamoyl-methyladenine
  • N6-methyladenine
• Modified Guanine
  • 7-Deazaguanine
  • 7-Methylguanine
• Modified Cytosine
  • N4-Methylcytosine
  • 5-Carboxylcytosine
  • 5-Formylcytosine
  • 5-Glycosylhydroxymethylcytosine
  • 5-Hydroxycytosine
  • 5-Methylcytosine
• Modified Thymidine
  • α-Glutamylthymidine
  • α-Putrescinylthymine
• Uracil and modifications
  • Base J
  • Uracil
  • 5-Dihydroxypentauracil
  • 5-Hydroxymethyldeoxyuracil
• Others
  • Deoxyarchaeosine
  • 2,6-Diaminopurine (2-Aminoadenine)
=== Grooves ===
Twin helical strands form the DNA backbone. Another double helix may be found tracing the spaces, or grooves, between the strands. These voids are adjacent to the base pairs and may provide a
binding site. As the strands are not symmetrically located with respect to each other, the grooves are unequally sized. The major groove is 22 Å (2.2 nm) wide, while the minor groove is 12 Å (1.2 nm) wide. Due to the larger width of the major groove, the edges of the bases are more accessible in the major groove than in the minor groove. As a result, proteins such as
transcription factors that can bind to specific sequences in double-stranded DNA usually make contact with the sides of the bases exposed in the major groove. This situation varies in unusual conformations of DNA within the cell
(see below), but the major and minor grooves are always named to reflect the differences in width that would be seen if the DNA was twisted back into the ordinary
B form.
=== Base pairing ===
(Figure: top, a G-C base pair with three hydrogen bonds; bottom, an A-T base pair with two hydrogen bonds. Non-covalent hydrogen bonds between the pairs are shown as dashed lines.)
In a DNA double helix, each type of nucleobase on one strand bonds with just one type of nucleobase on the other strand. This is called
complementary base pairing. Purines form
hydrogen bonds to pyrimidines, with adenine bonding only to thymine through two hydrogen bonds, and cytosine bonding only to guanine through three hydrogen bonds. This arrangement of two nucleotides binding together across the double helix (from six-membered ring to six-membered ring) is called a Watson-Crick base pair. DNA with high
GC-content is more stable than DNA with low GC-content. A
Hoogsteen base pair (hydrogen bonding the six-membered ring to the five-membered ring) is a rare variation of base-pairing. As hydrogen bonds are not
covalent, they can be broken and rejoined relatively easily. The two strands of DNA in a double helix can thus be pulled apart like a zipper, either by a mechanical force or high
temperature. As a result of this base pair complementarity, all the information in the double-stranded sequence of a DNA helix is duplicated on each strand, which is vital in DNA replication. This reversible and specific interaction between complementary base pairs is critical for all the functions of DNA in organisms. In biology, parts of the DNA double helix that need to separate easily, such as the
Pribnow box in some
promoters, tend to have a high AT content, making the strands easier to pull apart. In the laboratory, the strength of this interaction can be measured by finding the melting temperature
Tm necessary to break half of the hydrogen bonds. When all the base pairs in a DNA double helix melt, the strands separate and exist in solution as two entirely independent molecules. These single-stranded DNA molecules have no single common shape, but some conformations are more stable than others.
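The link between base composition and duplex stability described above can be illustrated with a short Python sketch. It counts Watson-Crick hydrogen bonds (two per A-T pair, three per G-C pair, as stated in the text) and computes GC-content. The melting-temperature estimate uses the Wallace rule of thumb for short oligonucleotides, which is an assumption brought in for illustration and is not taken from this article. The function names are hypothetical, and the estimate is only a rough guide, not a replacement for a measured Tm.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


def hydrogen_bonds(seq: str) -> int:
    """Count Watson-Crick hydrogen bonds in the duplex formed by seq
    and its complement: two per A-T pair, three per G-C pair."""
    bonds_per_base = {"A": 2, "T": 2, "G": 3, "C": 3}
    return sum(bonds_per_base[base] for base in seq.upper())


def wallace_tm_celsius(seq: str) -> int:
    """Rough Tm estimate for a short oligonucleotide using the Wallace
    rule of thumb (assumption, not from the article):
    2 degrees C per A or T plus 4 degrees C per G or C."""
    seq = seq.upper()
    at = sum(base in "AT" for base in seq)
    gc = sum(base in "GC" for base in seq)
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # TATAAT is the consensus Pribnow box; GCGCGC is a GC-only contrast.
    for oligo in ("TATAAT", "GCGCGC"):
        print(oligo, gc_content(oligo), hydrogen_bonds(oligo),
              wallace_tm_celsius(oligo))
```

For two sequences of equal length, the AT-only one has fewer hydrogen bonds and a lower estimated melting temperature, which matches the observation that AT-rich regions are easier to pull apart.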
=== Amount ===
(Figure: karyogram of a human. It shows 22 homologous chromosomes, both the female (XX) and male (XY) versions of the
sex chromosome (bottom right), as well as the
mitochondrial genome (to scale at bottom left). The blue scale to the left of each chromosome pair (and the mitochondrial genome) shows its length in terms of millions of DNA
base pairs.)
In humans, the total female
diploid nuclear genome per cell extends for 6.37 gigabase pairs (Gbp), is 208.23 cm long and weighs 6.51 picograms (pg). Male values are 6.27 Gbp, 205.00 cm, 6.41 pg. Human mitochondrial DNA forms closed circular molecules of 16,569 DNA base pairs, with each such molecule normally containing a full set of the mitochondrial genes. Each human mitochondrion contains, on average, approximately 5 such mtDNA molecules.
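As a consistency check on the genome figures above, the following Python sketch derives the length and mass per base pair implied by the female and male values. Only the six quoted numbers come from the text. The dictionary layout and function names are hypothetical, and the derived per-base-pair values are implied by these figures rather than independently measured.

```python
# Figures quoted in the text for the human diploid nuclear genome.
FEMALE = {"gbp": 6.37, "length_cm": 208.23, "mass_pg": 6.51}
MALE = {"gbp": 6.27, "length_cm": 205.00, "mass_pg": 6.41}

NM_PER_CM = 1e7  # 1 cm = 10,000,000 nm


def length_per_bp_nm(genome: dict) -> float:
    """Implied length of one base pair along the helix, in nanometres."""
    base_pairs = genome["gbp"] * 1e9
    return genome["length_cm"] * NM_PER_CM / base_pairs


def mass_per_gbp_pg(genome: dict) -> float:
    """Implied mass of one gigabase pair of DNA, in picograms."""
    return genome["mass_pg"] / genome["gbp"]


if __name__ == "__main__":
    for label, genome in (("female", FEMALE), ("male", MALE)):
        print(label,
              round(length_per_bp_nm(genome), 4), "nm per bp,",
              round(mass_per_gbp_pg(genome), 4), "pg per Gbp")
```

Both sets of figures imply roughly 0.327 nm per base pair and about 1.02 pg per gigabase pair, so the female and male values are mutually consistent.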
=== Sense and antisense ===
A
DNA sequence is called a "sense" sequence if it is the same as that of a
messenger RNA copy that is translated into protein. The sequence on the opposite strand is called the "antisense" sequence. Both sense and antisense sequences can exist on different parts of the same strand of DNA (i.e. both strands can contain both sense and antisense sequences). In both prokaryotes and eukaryotes, antisense RNA sequences are produced, but the functions of these RNAs are not entirely clear. One proposal is that antisense RNAs are involved in regulating
gene expression through RNA-RNA base pairing. A few DNA sequences in prokaryotes and eukaryotes, and more in
plasmids and
viruses, blur the distinction between sense and antisense strands by having
overlapping genes. In these cases, some DNA sequences do double duty, encoding one protein when read along one strand, and a second protein when read in the opposite direction along the other strand. In
bacteria, this overlap may be involved in the regulation of gene transcription, while in viruses, overlapping genes increase the amount of information that can be encoded within the small viral genome.
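The relationship between sense strand, antisense strand, and messenger RNA can be sketched in a few lines of Python. The sketch assumes the usual convention of writing every sequence in the 5′ to 3′ direction, so the antisense sequence is the reverse complement of the sense sequence. The function names and the example sequence are hypothetical.

```python
# Watson-Crick partners: A<->T and C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def mrna_from_sense(sense_dna: str) -> str:
    """The mRNA has the same sequence as the sense strand,
    with uracil (U) in place of thymine (T)."""
    return sense_dna.upper().replace("T", "U")


if __name__ == "__main__":
    sense = "ATGGCC"  # made-up example sequence
    print("sense:    ", sense)
    print("antisense:", reverse_complement(sense))
    print("mRNA:     ", mrna_from_sense(sense))
```

Taking the reverse complement twice returns the original sequence, which reflects the fact that each strand fully determines the other.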
=== Supercoiling ===
DNA can be twisted like a rope in a process called
DNA supercoiling. With DNA in its "relaxed" state, a strand usually circles the axis of the double helix once every 10.4 base pairs, but if the DNA is twisted the strands become more tightly or more loosely wound. If the DNA is twisted in the direction of the helix, this is positive supercoiling, and the bases are held more tightly together. If they are twisted in the opposite direction, this is negative supercoiling, and the bases come apart more easily. In nature, most DNA has slight negative supercoiling that is introduced by
enzymes called
topoisomerases. These enzymes are also needed to relieve the twisting stresses introduced into DNA strands during processes such as
transcription and
DNA replication.
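The degree of supercoiling can be put in numbers with a small Python sketch. The figure of 10.4 base pairs per turn comes from the text. The linking number and the superhelical density σ = (Lk − Lk0) / Lk0 are standard topological quantities that this section does not define, so they are assumptions introduced here for illustration. The function names and the 5,200-base-pair example are hypothetical.

```python
BP_PER_TURN_RELAXED = 10.4  # from the text: one turn every 10.4 base pairs


def relaxed_linking_number(n_bp: int) -> float:
    """Number of helical turns in a relaxed duplex of n_bp base pairs."""
    return n_bp / BP_PER_TURN_RELAXED


def superhelical_density(linking_number: float, n_bp: int) -> float:
    """sigma = (Lk - Lk0) / Lk0, where Lk0 is the relaxed linking number."""
    lk0 = relaxed_linking_number(n_bp)
    return (linking_number - lk0) / lk0


def classify(sigma: float, tolerance: float = 1e-9) -> str:
    """Negative sigma means underwound DNA (negative supercoiling),
    positive sigma means overwound DNA (positive supercoiling)."""
    if abs(sigma) < tolerance:
        return "relaxed"
    return "negative" if sigma < 0 else "positive"


if __name__ == "__main__":
    n_bp = 5200  # hypothetical circular DNA, about 500 turns when relaxed
    for lk in (470, 500, 530):
        sigma = superhelical_density(lk, n_bp)
        print(lk, round(sigma, 3), classify(sigma))
```

In this example, a molecule with 470 turns instead of the relaxed 500 is underwound by about 6 percent, which corresponds to negative supercoiling.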
=== Alternative DNA structures ===
(Figure: the structures of A-, B- and Z-DNA.)
DNA exists in many possible
conformations that include
A-DNA,
B-DNA, and
Z-DNA forms, although only B-DNA and Z-DNA have been directly observed in functional organisms. The first published reports of A-DNA
X-ray diffraction patterns—and also B-DNA—used analyses based on
Patterson functions that provided only a limited amount of structural information for oriented fibers of DNA. An alternative analysis was proposed by Wilkins
et al. in 1953 for the
in vivo B-DNA X-ray diffraction-scattering patterns of highly hydrated DNA fibers in terms of squares of
Bessel functions. In the same journal,
James Watson and
Francis Crick presented their
molecular modeling analysis of the DNA X-ray diffraction patterns to suggest that the structure was a double helix. The B-DNA form is not a well-defined conformation but a family of related DNA conformations that occur at the high hydration levels present in cells. Their corresponding X-ray diffraction and scattering patterns are characteristic of molecular
paracrystals with a significant degree of disorder. Compared to B-DNA, the A-DNA form is a wider
right-handed spiral, with a shallow, wide minor groove and a narrower, deeper major groove. The A form occurs under non-physiological conditions in partly dehydrated samples of DNA, while in the cell it may be produced in hybrid pairings of DNA and RNA strands, and in enzyme-DNA complexes. Segments of DNA where the bases have been chemically modified by
methylation may undergo a larger change in conformation and adopt the
Z form. Here, the strands turn about the helical axis in a left-handed spiral, the opposite of the more common B form. These unusual structures can be recognized by specific Z-DNA binding proteins and may be involved in the regulation of transcription.
=== Alternative DNA chemistry ===
For many years,
exobiologists have proposed the existence of a
shadow biosphere, a postulated microbial
biosphere of Earth that uses radically different biochemical and molecular processes than currently known life. One of the proposals was the existence of lifeforms that use
arsenic instead of phosphorus in DNA. A 2010 report announced this possibility in the
bacterium GFAJ-1, though the research was disputed, and evidence suggests the bacterium actively prevents the incorporation of arsenic into the DNA backbone and other biomolecules.
=== Quadruplex structures ===
(Figure: DNA quadruplex formed by telomere repeats. The looped conformation of the DNA backbone is very different from the typical DNA helix. The green spheres in the center represent potassium ions.)
At the ends of the linear chromosomes are specialized regions of DNA called
telomeres. The main function of these regions is to allow the cell to replicate chromosome ends using the enzyme
telomerase, as the enzymes that normally replicate DNA cannot copy the extreme 3′ ends of chromosomes. These specialized chromosome caps also help protect the DNA ends, and stop the
DNA repair systems in the cell from treating them as damage to be corrected. In
human cells, telomeres are usually lengths of single-stranded DNA containing several thousand repeats of a simple TTAGGG sequence. These guanine-rich sequences may stabilize chromosome ends by forming structures of stacked sets of four-base units, rather than the usual base pairs found in other DNA molecules. Here, four guanine bases, known as a
guanine tetrad, form a flat plate. These flat four-base units then stack on top of each other to form a stable
G-quadruplex structure. These structures are stabilized by hydrogen bonding between the edges of the bases and
chelation of a metal ion in the centre of each four-base unit. Other structures can also be formed, with the central set of four bases coming from either a single strand folded around the bases, or several different parallel strands, each contributing one base to the central structure. In addition to these stacked structures, telomeres also form large loop structures called telomere loops, or T-loops. Here, the single-stranded DNA curls around in a long circle stabilized by telomere-binding proteins. At the very end of the T-loop, the single-stranded telomere DNA is held onto a region of double-stranded DNA by the telomere strand disrupting the double-helical DNA and base pairing to one of the two strands. This
triple-stranded structure is called a displacement loop or
D-loop.
Branched DNA can be used in
nanotechnology to construct geometric shapes; see the section on
uses in technology below.
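The telomere repeats and guanine tetrads described earlier in this section can be searched for in sequence data with a short Python sketch. The TTAGGG unit comes from the text. The quadruplex pattern (four runs of at least three guanines separated by loops of one to seven bases) is a widely used sequence heuristic that is assumed here and is not taken from this article. A match only flags a candidate region and does not show that a G-quadruplex actually forms. The function names are hypothetical.

```python
import re

TELOMERE_UNIT = "TTAGGG"  # human telomere repeat quoted in the text

# Heuristic (assumption): four G-runs of length >= 3 with 1-7 base loops.
PUTATIVE_G4 = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def longest_tandem_repeat(seq: str, unit: str = TELOMERE_UNIT) -> int:
    """Largest number of back-to-back copies of unit found in seq."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def has_putative_g4(seq: str) -> bool:
    """True if seq contains a stretch matching the G4 heuristic."""
    return PUTATIVE_G4.search(seq.upper()) is not None


if __name__ == "__main__":
    sample = "ACGA" + TELOMERE_UNIT * 5 + "CCC"  # made-up sequence
    print(longest_tandem_repeat(sample))
    print(has_putative_g4(sample))
```

Four consecutive TTAGGG units supply the four guanine runs that the heuristic requires, whereas three units do not.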
=== Artificial bases ===
Several artificial nucleobases have been synthesized and successfully incorporated in the eight-base DNA analogue named
Hachimoji DNA. Dubbed S, B, P, and Z, these artificial bases can bond with each other in a predictable way (S–B and P–Z), maintain the double helix structure of DNA, and be transcribed to RNA. Their existence could be seen as an indication that there is nothing special about the four natural nucleobases that evolved on Earth. On the other hand, DNA is closely related to
RNA, which not only acts as a transcript of DNA but also performs many tasks in cells as molecular machines. For this purpose it has to fold into a structure. It has been shown that at least four bases are required for the corresponding
RNA to be able to form all possible structures; a higher number is also possible, but this would go against the natural
principle of least effort.
=== Acidity ===
The phosphate groups of DNA give it similar
acidic properties to
phosphoric acid and it can be considered as a
strong acid. It will be fully ionized at a normal cellular pH, releasing
protons which leave behind negative charges on the phosphate groups. These negative charges protect DNA from breakdown by
hydrolysis by repelling
nucleophiles which could hydrolyze it.
=== Macroscopic appearance ===
Pure DNA extracted from cells forms white, stringy clumps.
== Chemical modifications and altered DNA packaging ==