Hepadnaviruses have very small genomes of partially double-stranded, partially single-stranded circular DNA (pdsDNA). The genome consists of two strands: a longer negative-sense strand and a shorter positive-sense strand of variable length. In the virion, these strands are arranged such that the two ends of the long strand meet but are not covalently bonded together. The shorter strand overlaps this divide and is connected to the longer strand on either side of the split through a direct repeat (DR) segment that pairs the two strands together. In replication, the viral pdsDNA is converted in the host cell nucleus to covalently closed circular DNA (cccDNA) by the viral polymerase. Replication involves an
RNA intermediate, as in viruses belonging to group VII of the Baltimore classification. Four main open reading frames (ORFs) are encoded, and the virus has four known genes that encode seven proteins: the core capsid protein, the viral polymerase, the surface antigens (preS1, preS2, and S), the X protein, and HBeAg. The X protein is thought to be non-structural. Its function and significance are poorly understood, but it is suspected to be associated with modulation of host gene expression.
===Viral polymerase===
Members of the family Hepadnaviridae encode their own polymerase rather than co-opting host machinery, as some other viruses do. This enzyme is unusual among viral polymerases in that it combines three activities. Its reverse transcriptase activity converts RNA into DNA to replicate the genome (the only other human-pathogenic virus family encoding a polymerase with this capability is Retroviridae). Its RNase activity destroys the RNA template once the DNA genome has been synthesized from the pregenomic RNA (pgRNA) packaged in virions, producing the pdsDNA genome. Its DNA-dependent DNA polymerase activity creates cccDNA from pdsDNA in the first step of the replication cycle.
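The three polymerase activities described above can be pictured with a toy model. The following Python sketch is purely illustrative. The function names (`reverse_transcribe`, `rnase_degrade`, `fill_plus_strand`), the short made-up sequence, and the simplification that an incomplete plus strand is a 5' prefix of the complete one are all assumptions introduced here. The sketch does not attempt to model priming, strand transfer, or the direct repeats.

```python
# Toy model only: the sequence is a short invented string, not real viral sequence.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(pgrna):
    """Reverse transcriptase activity: RNA template -> complementary DNA.

    Returns the minus strand, written 5'->3'.
    """
    return "".join(RNA_TO_DNA[base] for base in reversed(pgrna))


def rnase_degrade(hybrid):
    """RNase activity: discard the RNA strand of an (RNA, DNA) hybrid."""
    _rna, dna = hybrid
    return dna


def fill_plus_strand(minus_dna, partial_plus):
    """DNA-dependent DNA polymerase activity: extend an incomplete plus
    strand until it is fully complementary to the minus strand."""
    full_plus = "".join(DNA_TO_DNA[base] for base in reversed(minus_dna))
    if not full_plus.startswith(partial_plus):
        raise ValueError("partial plus strand does not match the template")
    return full_plus


pgrna = "AUGGCUCAUUGA"                    # hypothetical pgRNA fragment
minus = reverse_transcribe(pgrna)         # RNA -> minus-strand DNA
minus = rnase_degrade((pgrna, minus))     # RNA template removed
plus = fill_plus_strand(minus, "ATGGC")   # gap in the plus strand filled
print(minus)
print(plus)
```

In this simplified picture, the completed plus strand carries the same sequence as the pgRNA with thymine in place of uracil, which is why the RNA can serve as the intermediate for a DNA genome.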
===Envelope proteins===
The hepatitis envelope proteins are composed of subunits made from the viral preS1, preS2, and S genes. The L (for "large") envelope protein contains all three subunits. The M (for "medium") protein contains only preS2 and S. The S (for "small") protein contains only S. The genome portions encoding these envelope protein subunits share both the same reading frame and the same stop codon, generating nested transcripts on a single open reading frame. The preS1 region is encoded first (closest to the 5' end), followed directly by preS2 and then S. When a transcript is made from the beginning of the preS1 region, all three genes are included in the transcript and the L protein is produced. When the transcript starts after preS1, at the beginning of preS2, the final protein contains only the preS2 and S subunits and is therefore an M protein. The smallest envelope protein, containing just the S subunit, is the most abundantly produced because it is encoded closest to the 3' end and comes from the shortest transcript. These envelope proteins can assemble independently of the viral capsid and genome into non-infectious virus-like particles, which give the virus a pleomorphic appearance and promote a strong immune response in hosts.

==Replication==