SARS‑CoV‑2 belongs to the broad family of viruses known as
coronaviruses. It is a
positive-sense single-stranded RNA (+ssRNA) virus, with a single linear RNA segment. Coronaviruses infect humans, other mammals, including livestock and companion animals, and avian species. Human coronaviruses can cause illnesses ranging from the
common cold to more severe diseases such as
Middle East respiratory syndrome (MERS, fatality rate ~34%). SARS-CoV-2 is the seventh known coronavirus to infect people, after
229E,
NL63,
OC43,
HKU1,
MERS-CoV, and the original
SARS-CoV-1. Like the SARS-related coronavirus implicated in the 2003 SARS outbreak, SARS‑CoV‑2 is a member of the subgenus
Sarbecovirus (
beta-CoV lineage B). Recombination in unsegmented RNA viruses such as SARS-CoV-2 generally occurs by copy-choice replication, in which the polymerase switches from one RNA template molecule to another during replication. The SARS-CoV-2 RNA sequence is approximately 30,000
bases in length. Its genome consists nearly entirely of protein-coding sequences, a trait shared with other coronaviruses. A distinguishing feature of SARS-CoV-2 is a polybasic site cleaved by furin, which appears to be an important element enhancing its virulence. It has been suggested that the acquisition of the furin-cleavage site in the SARS-CoV-2 S protein was essential for zoonotic transfer to humans. The furin
protease recognizes the canonical
peptide sequence RX[R/K]R↓X, where the cleavage site is indicated by a down arrow and X is any
amino acid. In SARS-CoV-2 the recognition site is formed by the incorporated 12-nucleotide sequence (four codons) CCT CGG CGG GCA, which corresponds to the amino acid sequence PRRA. Although such sites are a common, naturally occurring feature of other viruses within the subfamily Orthocoronavirinae, they appear in few other viruses from the
Beta-CoV genus, and SARS-CoV-2 is unique among members of its subgenus in having such a site. The furin cleavage site PRRAR↓ is highly similar to that of the
feline coronavirus, an
alphacoronavirus 1 virus. Viral genetic sequence data can provide critical information about whether viruses separated by time and space are likely to be epidemiologically linked. With a sufficient number of sequenced
genomes, it is possible to reconstruct a
phylogenetic tree of the mutation history of a family of viruses. By 12 January 2020, five genomes of SARS‑CoV‑2 had been isolated from Wuhan and reported by the
Chinese Center for Disease Control and Prevention (CCDC) and other institutions; the number of genomes increased to 42 by 30 January 2020. A phylogenetic analysis of those samples showed they were "highly related with at most seven mutations relative to a
common ancestor", implying that the first human infection occurred in November or December 2019. 3,422 SARS‑CoV‑2 genomes, belonging to 19 strains, sampled on all continents except Antarctica were publicly available. On 11 February 2020, the
International Committee on Taxonomy of Viruses announced that according to existing rules that compute hierarchical relationships among coronaviruses based on five
conserved sequences of nucleic acids, the differences between what was then called 2019-nCoV and the virus from the 2003 SARS outbreak were insufficient to make them separate
viral species. Therefore, they identified 2019-nCoV as a virus of the species
severe acute respiratory syndrome–related coronavirus. In July 2020, scientists reported that a more infectious SARS‑CoV‑2 variant with
spike protein variant G614 had replaced D614 as the dominant form in the pandemic. Coronavirus genomes and subgenomes encode six
open reading frames (ORFs). In October 2020, researchers discovered a possible
overlapping gene named
ORF3d, in the SARS‑CoV‑2
genome. It is unknown if the protein produced by
ORF3d has any function, but it provokes a strong immune response.
ORF3d has been identified before, in a variant of coronavirus that infects
pangolins.
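The codon-to-peptide correspondence and the furin motif described above can be illustrated with a short Python sketch. It translates the sequence CCT CGG CGG GCA using the standard genetic code and scans a peptide for a literal reading of the motif RX[R/K]R↓X. The function names `translate` and `cleavage_positions`, and the test peptide `AARSKRSV`, are hypothetical and chosen only for illustration. The peptide is not a spike protein sequence, and the regular expression is a simplification rather than a model of how furin selects substrates.

```python
import re
from itertools import product

# Standard genetic code built in TCAG order (DNA alphabet, '*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Literal reading of RX[R/K]R followed by any residue X.
# A zero-width lookahead is used so that overlapping sites are all reported.
CANONICAL_FURIN = re.compile(r"(?=R.[RK]R.)")


def translate(nucleotides: str) -> str:
    """Translate a DNA-alphabet coding sequence codon by codon."""
    seq = nucleotides.replace(" ", "").upper()
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def cleavage_positions(peptide: str) -> list[int]:
    """Return 0-based indices of the residue just after each motif match."""
    # The motif spans four residues, so the cut falls at match start + 4.
    return [m.start() + 4 for m in CANONICAL_FURIN.finditer(peptide)]


print(translate("CCT CGG CGG GCA"))    # PRRA
print(cleavage_positions("AARSKRSV"))  # [6], i.e. AARSKR | SV (hypothetical peptide)
```

The pattern is strict about the third position, which must be arginine or lysine. Whether a real site satisfies it therefore depends on the residues flanking the insert as well as on the four inserted codons.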
=== Phylogenetic tree ===

=== Variants ===

[Figure: a B.1.1.7 variant coronavirus. The variant's increased transmissibility is believed to be due to changes in the structure of the spike proteins, shown in green.]

There are many thousands of variants of SARS-CoV-2, which can be grouped into much larger clades. Notable variants include:

• Alpha: Lineage B.1.1.7 emerged in the United Kingdom in September 2020, with evidence of increased transmissibility and virulence. Notable mutations include N501Y and P681H.
• An E484K mutation in some lineage B.1.1.7 virions has been noted and is also tracked by various public health agencies.
• Beta: Lineage B.1.351 emerged in South Africa in May 2020, with evidence of increased transmissibility and changes to antigenicity, with some public health officials raising alarms about its impact on the efficacy of some vaccines. Notable mutations include K417N, E484K and N501Y.
• Gamma: Lineage P.1 emerged in Brazil in November 2020, also with evidence of increased transmissibility and virulence, alongside changes to antigenicity. Similar concerns about vaccine efficacy have been raised. Notable mutations include K417T, E484K and N501Y.
• Delta: Lineage B.1.617.2 emerged in India in October 2020. There is also evidence of increased transmissibility and virulence, and changes to antigenicity.
• Omicron: Lineage B.1.1.529 emerged around Botswana in November 2021. This lineage demonstrated significantly increased transmissibility and changes to antigenicity, and it has dominated all circulating versions of the virus since its emergence.
• Omicron variant BA.3.2 was reported to be widespread across Europe and the US in April 2026. The variant has many changes to the spike protein, but had not been found to be more virulent than other Omicron variants, and existing vaccines protected against it.

Other notable variants include 6 other WHO-designated variants under investigation and Cluster 5, which emerged among mink in Denmark and resulted in a mink euthanasia campaign that rendered it virtually extinct.

== Virology ==