==Genomics, transcriptomics and proteomics==
The AAV genome is built of single-stranded deoxyribonucleic acid (ssDNA), either positive- or negative-sensed, which is about 4.7 kilobases long. The genome comprises inverted terminal repeats (ITRs) at both ends of the DNA strand, and two
open reading frames (ORFs):
rep and
cap. The former is composed of four
overlapping genes encoding Rep proteins required for the AAV life cycle, and the latter contains overlapping nucleotide sequences of
capsid proteins: VP1, VP2 and VP3, which interact to form a capsid with icosahedral symmetry.
===ITR sequences===
The inverted terminal repeat (ITR) sequences comprise 145 bases each. They were so named because of their symmetry, which was shown to be required for efficient multiplication of the AAV genome. This property stems from their ability to form a
hairpin, which contributes to so-called self-priming that allows
primase-independent synthesis of the second DNA strand. The ITRs were also shown to be required for both integration of the AAV DNA into the host cell genome (chromosome 19 in humans) and rescue from it, as well as for efficient
encapsidation of the AAV DNA combined with generation of fully assembled,
deoxyribonuclease-resistant AAV particles. With regard to gene therapy, ITRs seem to be the only sequences required
in cis next to the therapeutic gene: structural (
cap) and packaging (
rep) proteins can be delivered
in trans. With this assumption many methods were established for efficient production of recombinant AAV (rAAV) vectors containing a
reporter or therapeutic gene. However, it has also been reported that the ITRs are not the only elements required
in cis for effective replication and encapsidation. A few research groups have identified a sequence designated
cis-acting Rep-dependent element (CARE) inside the coding sequence of the
rep gene. CARE was shown to augment replication and encapsidation when present
in cis.
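The hairpin-forming property described above can be illustrated with a minimal sketch. It assumes a simple stem-loop model in which the 5' end of a strand pairs with the reverse complement at its 3' end. The sequence used is a made-up toy example, not a real AAV ITR, and the function names are hypothetical.

```python
# Toy illustration only: the sequence below is invented and is NOT a real AAV ITR.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def hairpin_stem_length(strand: str) -> int:
    """Count how many bases at the 5' end can pair, in antiparallel
    orientation, with bases at the 3' end of the same strand."""
    strand = strand.upper()
    paired = 0
    max_pairs = len(strand) // 2
    while paired < max_pairs:
        five_prime_base = strand[paired]
        three_prime_base = strand[-(paired + 1)]
        if five_prime_base != three_prime_base.translate(COMPLEMENT):
            break
        paired += 1
    return paired


# Build a toy self-complementary strand: stem + loop + reverse complement of stem.
toy_stem = "GGCCTCAG"
toy_loop = "TTT"
toy_strand = toy_stem + toy_loop + reverse_complement(toy_stem)

print(toy_strand)
print(hairpin_stem_length(toy_strand))
```

A strand whose ends are not mutually complementary, such as a run of a single base, gives a stem length of zero under this model and so could not fold back to prime its own second-strand synthesis.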
===rep gene and Rep proteins===
On the "left side" of the genome there are two
promoters called p5 and p19, from which two overlapping messenger ribonucleic acids (
mRNAs) of different length can be produced. Each of these contains an
intron which can be either
spliced out or not. Given these possibilities, four different mRNAs, and consequently four different Rep proteins with overlapping sequences, can be synthesized. Their names reflect their sizes in
kilodaltons (kDa): Rep78, Rep68, Rep52 and Rep40. Rep78 and Rep68 can specifically bind the
hairpin formed by the ITR during self-priming and cleave at a specific region, designated the terminal resolution site, within the hairpin. They were also shown to be necessary for the AAVS1-specific integration of the AAV genome. All four Rep proteins were shown to bind
ATP and to possess
helicase activity. It was also shown that they upregulate the transcription from the p40 promoter (mentioned below), but downregulate both p5 and p19 promoters.
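The combinatorics behind the four Rep proteins (two promoters, each transcript either spliced or unspliced) can be sketched as a small enumeration. The assignment of each protein to a promoter and splice state is not spelled out in the text above and is stated here as an assumption based on the commonly described arrangement; the names `REP_PROTEINS` and `rep_products` are hypothetical.

```python
from itertools import product

# Assumed assignment (not given explicitly in the text): p5 transcripts yield
# the two larger proteins, p19 transcripts the two smaller ones, and splicing
# of the intron yields the shorter protein of each pair.
REP_PROTEINS = {
    ("p5", False): "Rep78",
    ("p5", True): "Rep68",
    ("p19", False): "Rep52",
    ("p19", True): "Rep40",
}


def rep_products():
    """Enumerate every promoter / splice-state combination and its protein."""
    combinations = product(("p5", "p19"), (False, True))
    return [
        (promoter, spliced, REP_PROTEINS[(promoter, spliced)])
        for promoter, spliced in combinations
    ]


for promoter, spliced, protein in rep_products():
    state = "spliced" if spliced else "unspliced"
    print(f"{promoter:>3} {state:<9} -> {protein}")
```

Two binary choices give 2 × 2 = 4 transcripts, which matches the four Rep proteins named in the text.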
===cap gene and VP proteins===
The right side of a positive-sensed AAV genome encodes overlapping sequences of three capsid proteins, VP1, VP2 and VP3, and two accessory proteins, MAAP and AAP, which are expressed from one promoter, designated p40. The molecular weights of VP1, VP2 and VP3 are 87, 72 and 62 kilodaltons, respectively. The AAV capsid is composed of a mixture of VP1, VP2 and VP3 totaling 60 monomers arranged in
icosahedral symmetry in a ratio of 1:1:10, with an empty mass of approximately 3.8
MDa. The
crystal structure of the VP3 protein was determined by Xie, Bue,
et al. The
cap gene produces an additional, non-structural protein called the Assembly-Activating Protein (AAP). This protein is produced from ORF2 and is essential for the capsid-assembly process. The exact function of this protein in the assembly process and its structure have not been determined to date. All three VPs are translated from one mRNA. After this mRNA is synthesized, it can be spliced in two different ways: either a longer or a shorter intron can be excised, resulting in two pools of mRNAs, one 2.3 kb and one 2.6 kb long. Usually, especially in the presence of adenovirus, excision of the longer intron is preferred, so the 2.3-kb-long mRNA represents the so-called "major splice". In this form the first
AUG codon, from which the synthesis of VP1 protein starts, is cut out, resulting in a reduced overall level of VP1 protein synthesis. The first AUG codon that remains in the major splice is the initiation codon for VP3 protein. However, upstream of that codon in the same open reading frame lies an ACG sequence (encoding threonine) which is surrounded by an optimal
Kozak context. This contributes to a low level of synthesis of VP2 protein, which is actually VP3 protein with additional N-terminal residues, as is VP1. Since the longer intron is preferentially spliced out, and since in the major splice the ACG codon is a much weaker
translation initiation signal, the ratio at which the AAV structural proteins are synthesized
in vivo is about 1:1:10, which is the same as in the mature virus particle. The unique fragment at the N terminus of VP1 protein was shown to possess
phospholipase A2 (PLA2) activity, which is probably required for the release of AAV particles from late
endosomes. Muralidhar
et al. reported that VP2 and VP3 are crucial for correct virion assembly.
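The capsid figures quoted in this section (60 monomers, a 1:1:10 ratio, and molecular weights of 87, 72 and 62 kDa) can be cross-checked with a short calculation. This is an idealized sketch that treats the ratio as exact for a single particle, which is an assumption; the function names are hypothetical.

```python
# Values taken from the text above; the exact-ratio model is an idealization.
VP_MASS_KDA = {"VP1": 87, "VP2": 72, "VP3": 62}
VP_RATIO = {"VP1": 1, "VP2": 1, "VP3": 10}
MONOMERS_PER_CAPSID = 60


def capsid_composition(ratio, total_monomers):
    """Split a fixed number of monomers among proteins according to a ratio."""
    parts = sum(ratio.values())
    if total_monomers % parts != 0:
        raise ValueError("ratio does not divide the monomer count evenly")
    unit = total_monomers // parts
    return {name: share * unit for name, share in ratio.items()}


def empty_capsid_mass_kda(composition, masses_kda):
    """Sum the protein masses of all monomers in one empty capsid."""
    return sum(count * masses_kda[name] for name, count in composition.items())


composition = capsid_composition(VP_RATIO, MONOMERS_PER_CAPSID)
mass_kda = empty_capsid_mass_kda(composition, VP_MASS_KDA)

print(composition)                 # copies of each protein per capsid
print(f"{mass_kda / 1000} MDa")    # protein mass of the empty capsid
```

Under these assumptions a capsid holds 5 copies of VP1, 5 of VP2 and 50 of VP3, with a summed protein mass of 3895 kDa (about 3.9 MDa), in rough agreement with the empty mass of approximately 3.8 MDa given above.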
===Post-translational modifications===
Recent discoveries made through high-throughput 'omics approaches include the fact that AAV capsids undergo post-translational modifications (PTMs) during production, such as acetylation, methylation, phosphorylation, deamidation, O-GlcNAcylation and SUMOylation, throughout capsid proteins VP1, VP2 and VP3. These PTMs differ depending on the manufacturing platform. Another such discovery is the fact that AAV genomes are epigenetically methylated during production. Besides price, these modifications might affect expression kinetics, rAAV receptor binding, trafficking, vector immunogenicity, and expression durability.
==Classification, serotypes, receptors and native tropism==