=== Mutation frequencies ===
Whole genome sequencing has established the mutation frequency for whole human genomes. Between generations (parent to child), about 70 new mutations arise across the whole human genome. An even lower level of variation was found when whole genome sequencing of blood cells was compared for a pair of 100-year-old monozygotic (identical) twins: only 8 somatic differences were found, though somatic variation occurring in fewer than 20% of blood cells would have gone undetected. In the protein-coding regions of the human genome specifically, an estimated 0.35 mutations per parent-to-child generation would change a protein sequence (fewer than one mutated protein per generation). In cancer, mutation frequencies are much higher, due to
genome instability. This frequency can further depend on patient age, exposure to DNA-damaging agents (such as UV irradiation or components of tobacco smoke) and the activity or inactivity of DNA repair mechanisms. Furthermore, mutation frequency can vary between cancer types: in germline cells, the mutation rate is approximately 0.023 mutations per megabase (Mb), but this number is much higher in breast cancer (1.18-1.66 somatic mutations per Mb), in lung cancer (17.7) and in melanomas (≈33). Since the haploid human genome consists of approximately 3,200 megabases, this translates into about 74 mutations (mostly in
noncoding regions) in germline DNA per generation, but 3,776-5,312 somatic mutations per haploid genome in breast cancer, 56,640 in lung cancer and 105,600 in melanomas. The distribution of somatic mutations across the human genome is very uneven, such that the gene-rich, early-replicating regions receive fewer mutations than gene-poor, late-replicating heterochromatin, likely due to differential DNA repair activity. In particular, the
histone modification H3K9me3 is associated with high, and
H3K36me3 with low mutation frequencies.
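
The per-genome counts quoted above follow from multiplying each per-megabase rate by the approximate size of the haploid genome. The short Python sketch below reproduces that arithmetic; the rates and the 3,200 Mb genome size come from the text, while the function and variable names are illustrative only.

```python
# Per-megabase mutation rates quoted in the text, as (low, high) ranges.
RATES_PER_MB = {
    "germline (per generation)": (0.023, 0.023),
    "breast cancer": (1.18, 1.66),
    "lung cancer": (17.7, 17.7),
    "melanoma": (33.0, 33.0),
}

# Approximate size of the haploid human genome in megabases (from the text).
HAPLOID_GENOME_MB = 3200


def mutations_per_genome(rate_per_mb, genome_mb=HAPLOID_GENOME_MB):
    """Scale a per-megabase rate to a rounded count per haploid genome."""
    return round(rate_per_mb * genome_mb)


for label, (low, high) in RATES_PER_MB.items():
    low_count = mutations_per_genome(low)
    high_count = mutations_per_genome(high)
    if low_count == high_count:
        print(f"{label}: about {low_count:,} mutations")
    else:
        print(f"{label}: about {low_count:,} to {high_count:,} mutations")
```

The germline product is 73.6, which the text rounds to 74; the cancer figures are exact products of the quoted rates.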
=== Genome-wide association studies ===
In research, whole-genome sequencing can be used in a genome-wide association study (GWAS) – a project aiming to determine the genetic variant or variants associated with a disease or some other phenotype.
=== Diagnostic use ===
In 2009, Illumina released its first whole genome sequencers that were approved for clinical as opposed to research-only use, and doctors at academic medical centers began quietly using them to try to diagnose what was wrong with people whom standard approaches had failed to help. The first clinical application of this technology was in July 2009, in an attempt to diagnose a child being treated at Milwaukee Children's Hospital who had early-onset refractory inflammatory bowel disease (IBD) necessitating around 100 surgeries by the time he was three years old. His doctor, Allan Mayer, turned to geneticist Dr. Howard Jacob at the Medical College of Wisconsin (MCW) to see if whole genome sequencing could be applied to determine the problem. Lead computational biologist Dr. Worthey, making use of software rapidly developed by a small team of software developers, was able to identify a rare mutation in the XIAP gene that was causing the problems, and this diagnosis altered the patient's treatment, saving his life. By early 2010 Drs. Jacob and Worthey, along with Dr. David Dimmock, had stood up the first whole-genome-based genetics clinic and molecular diagnostic lab, routinely sequencing and analysing the genomes of patients with undiagnosed diseases. This diagnostic lab has diagnosed thousands of rare disease patients from around the world and continues to do so to this day. In 2010, a team from Stanford led by Euan Ashley performed analysis of a full human genome, that of bioengineer Stephen Quake. In 2011, Ashley's team reported a whole genome molecular autopsy and, in the same year, extended the interpretation framework to a fully sequenced family, the West family, who were the first family to be sequenced on the Illumina platform. The price to sequence a genome at that time was $19,500 USD, which was billed to the patient but usually paid for out of a research grant; one person at that time had applied for reimbursement from their insurance company.
For whole genome sequencing to be used in diagnostics, quality assessment schemes, health technology assessment and guidelines have to be in place. The 3Gb-TEST consortium has identified the analysis and interpretation of sequence data as the most complicated step in the diagnostic process. At its meeting in Athens in September 2014, the consortium coined the word genotranslation for this crucial step, which leads to a so-called genoreport. Guidelines are needed to determine the required content of these reports. Genomes2People (G2P), an initiative of Brigham and Women's Hospital and Harvard Medical School, was created in 2011 to examine the integration of genomic sequencing into the clinical care of adults and children. G2P's director, Robert C. Green, had previously led the REVEAL study (Risk EValuation and Education for Alzheimer's Disease), a series of clinical trials exploring patient reactions to the knowledge of their genetic risk for Alzheimer's. In 2018, researchers at
Rady Children's Hospital Institute for Genomic Medicine in San Diego determined that rapid whole-genome sequencing (rWGS) could diagnose genetic disorders in time to change acute medical or surgical management (clinical utility) and improve outcomes in acutely ill infants. In a retrospective cohort study of acutely ill inpatient infants at a regional children's hospital from July 2016 to March 2017, forty-two families received rWGS for etiologic diagnosis of genetic disorders. The diagnostic sensitivity of rWGS was 43% (18 of 42 infants), compared with 10% (4 of 42 infants) for standard genetic tests (P = .0005). The rate of clinical utility of rWGS (31%, 13 of 42 infants) was significantly greater than that of standard genetic tests (2%, 1 of 42; P = .0015). Eleven infants (26%) with diagnostic rWGS avoided morbidity, one had a 43% reduction in likelihood of mortality, and one started palliative care. In six of the eleven infants, the changes in management reduced inpatient cost by $800,000 to $2,000,000. The findings replicated a prior study of the clinical utility of rWGS in acutely ill inpatient infants, demonstrated improved outcomes and net healthcare savings, and supported consideration of rWGS as a first-tier test in this setting.
A 2018 review of 36 publications found that the cost of whole genome sequencing ranged from $1,906 USD to $24,810 USD, with diagnostic yield varying widely, from 17% to 73%, depending on the patient group.
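
The percentages reported for the rWGS cohort above can be checked directly from the counts out of 42 infants. The following Python sketch does only that; the counts come from the text, the dictionary labels are illustrative, and the P values are not recomputed because the text does not say which statistical test produced them.

```python
COHORT_SIZE = 42  # infants in the retrospective cohort (from the text)

# Counts reported in the text; the labels are illustrative names only.
COUNTS = {
    "diagnosed by rWGS": 18,
    "diagnosed by standard genetic tests": 4,
    "management changed by rWGS": 13,
    "management changed by standard tests": 1,
    "avoided morbidity after rWGS diagnosis": 11,
}


def percent(count, total=COHORT_SIZE):
    """Return count/total as a percentage rounded to a whole number."""
    return round(100 * count / total)


for label, count in COUNTS.items():
    print(f"{label}: {count}/{COHORT_SIZE} = {percent(count)}%")
```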
=== Rare variant association study ===
Whole genome sequencing studies enable the assessment of associations between complex traits and both coding and noncoding rare variants (minor allele frequency (MAF) < 1%) across the genome. Single-variant analyses typically have low power to identify associations with rare variants, so variant set tests have been proposed to jointly test the effects of given sets of multiple rare variants. SNP annotations help to prioritize rare functional variants, and incorporating these annotations can effectively boost the power of rare variant association analysis in whole genome sequencing studies. Some tools have been specifically developed to provide all-in-one rare variant association analysis for whole-genome sequencing data, including integration of genotype data and their functional annotations, association analysis, result summary and visualization. Meta-analysis of whole genome sequencing studies provides an attractive solution to the problem of collecting large sample sizes for discovering rare variants associated with complex phenotypes. Some methods have been developed to enable functionally informed rare variant association analysis in biobank-scale cohorts using efficient approaches for summary statistic storage.
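
As a rough illustration of why rare variants are grouped into sets, the Python sketch below filters variants by the MAF < 1% cutoff mentioned above and then collapses them into a per-individual burden score. A burden score is only one simple kind of variant set test and is chosen here for brevity; it is not necessarily the method used by the tools referred to in the text. The genotype data, function names and variant IDs are hypothetical, and the sketch assumes that the alternate allele is the minor allele.

```python
MAF_THRESHOLD = 0.01  # "rare" cutoff from the text (MAF < 1%)


def alt_allele_frequency(genotypes):
    """Alternate-allele frequency; genotypes are allele counts (0, 1 or 2).

    Assumption: the alternate allele is the minor allele, so this equals MAF.
    """
    return sum(genotypes) / (2 * len(genotypes))


def rare_variant_ids(genotype_matrix, threshold=MAF_THRESHOLD):
    """IDs of variants observed at least once with frequency below threshold."""
    return [
        variant_id
        for variant_id, genotypes in genotype_matrix.items()
        if 0 < alt_allele_frequency(genotypes) < threshold
    ]


def burden_scores(genotype_matrix, variant_ids):
    """Per-individual count of alternate alleles over the chosen variants."""
    n_individuals = len(next(iter(genotype_matrix.values())))
    return [
        sum(genotype_matrix[v][i] for v in variant_ids)
        for i in range(n_individuals)
    ]


def toy_variant(carriers, n_individuals=100):
    """Hypothetical variant: listed individuals carry one alternate allele."""
    return [1 if i in carriers else 0 for i in range(n_individuals)]


# Toy data for 100 individuals: three singletons and one common variant.
toy = {
    "v1": toy_variant({0}),
    "v2": toy_variant({3}),
    "v3": toy_variant(set(range(40))),  # frequency 0.2, not rare
    "v4": toy_variant({0}),
}

rare = rare_variant_ids(toy)
scores = burden_scores(toy, rare)
print("rare variants:", rare)
print("non-zero burden scores:", {i: s for i, s in enumerate(scores) if s})
```

In a real analysis the burden score would then be tested for association with the trait, for example by regression, and functional annotations could be used to weight or select the variants in the set; neither step is shown here.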
=== Newborn screening ===
In 2013, Green and a team of researchers launched the BabySeq Project to study the ethical and medical consequences of sequencing a newborn's DNA. In 2015, whole genome and exome sequencing were being deliberated as a newborn screening tool, and in 2021 they were discussed further. In 2021, the NIH funded BabySeq2, an implementation study that expanded the BabySeq Project to enroll 500 infants from diverse families and track the effects of their genomic sequencing on their pediatric care. In 2023, The Lancet opined that in the UK "focusing on improving screening by upgrading targeted gene panels might be more sensible in the short term. Whole genome sequencing in the long term deserves thorough examination and universal caution."

== Ethical concerns ==