topologies. The
evolutionary history of ORF8 is complex. It is among the least
conserved regions of the
Sarbecovirus genome. In SARS-CoV, the ORF8 region is thought to have originated through
recombination among ancestral
bat coronaviruses. Among the most distinctive features of this region in SARS-CoV is the emergence of a 29-nucleotide deletion that split the full-length
open reading frame into two smaller ORFs, ORF8a and ORF8b.
Viral isolates from early in the
SARS epidemic have a full-length, intact ORF8, but the split structure emerged later in the epidemic. Mutations and deletions have also been seen in
SARS-CoV-2 variants. Based on observations in SARS-CoV, it has been suggested that changes in ORF8 may be related to
host adaptation, but it is possible that ORF8 does not affect
fitness in human hosts. In SARS-CoV, a high
dN/dS ratio has been observed in ORF8, consistent with
positive selection or with
relaxed selection. ORF8 encodes a protein whose
immunoglobulin domain (Ig) has distant similarity to that of
ORF7a. It has been suggested that ORF8 likely evolved from ORF7a through
gene duplication, though some
bioinformatics analyses suggest the similarity may be too low to support duplication, which is relatively uncommon in viruses. Immunoglobulin domains are uncommon in coronaviruses; other than the subset of
betacoronaviruses with ORF8 and ORF7a, only a small number of bat
alphacoronaviruses have been identified as containing likely Ig domains, while such domains are absent from
gammacoronaviruses and
deltacoronaviruses. ORF8 is notably absent in
MERS-CoV. The beta and alpha Ig domains may be independent acquisitions, with ORF8 and ORF7a possibly acquired from host proteins. It is also possible that the absence of ORF8 reflects gene loss in those lineages.

== References ==