Single nucleotide polymorphisms serve as powerful molecular markers in contemporary genetic research and clinical practice. Association studies, particularly genome-wide association studies (GWAS), represent the primary application of SNP technology for identifying genetic variants linked to human diseases and traits. These comprehensive analyses examine hundreds of thousands of genetic markers simultaneously to detect statistical associations between specific SNPs and phenotypic characteristics, enabling researchers to uncover genetic contributions to complex disorders including cardiovascular disease, diabetes, and neurological conditions.
The development of tag SNP methodology has significantly enhanced the efficiency of genomic studies by exploiting patterns of linkage disequilibrium across the human genome. Tag SNPs function as representative markers that capture genetic variation within specific chromosomal regions, allowing researchers to survey large genomic areas without genotyping every individual variant. This approach reduces both the financial cost and computational burden of large-scale genetic studies while maintaining sufficient power to detect disease-associated loci. The selection of optimal tag SNPs relies on sophisticated algorithms that identify markers capable of capturing the maximum amount of genetic information within defined genomic intervals.
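The idea of choosing tag SNPs from linkage disequilibrium can be illustrated with a small sketch. It is not the algorithm used by any particular study or tool: it assumes phased haplotypes coded as 0/1, measures pairwise linkage disequilibrium with the r² statistic, and uses a simple greedy rule. The function names, the toy data, and the r² threshold of 0.8 are all illustrative assumptions.

```python
from itertools import combinations


def r_squared(hap_a, hap_b):
    """r^2 between two biallelic SNPs, given 0/1 alleles on the same haplotypes."""
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele 1 at SNP a
    p_b = sum(hap_b) / n                      # frequency of allele 1 at SNP b
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n  # both alleles 1 together
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return 0.0 if denom == 0 else d * d / denom


def greedy_tag_snps(haplotypes, threshold=0.8):
    """Pick tags greedily: each round takes the SNP capturing the most untagged SNPs.

    haplotypes: dict mapping SNP id -> list of 0/1 alleles (hypothetical input format).
    threshold: minimum r^2 for one SNP to "capture" another (assumed value).
    """
    snps = list(haplotypes)
    captured_by = {s: {s} for s in snps}      # every SNP captures itself
    for s, t in combinations(snps, 2):
        if r_squared(haplotypes[s], haplotypes[t]) >= threshold:
            captured_by[s].add(t)
            captured_by[t].add(s)

    untagged = set(snps)
    tags = []
    while untagged:
        best = max(snps, key=lambda s: len(captured_by[s] & untagged))
        tags.append(best)
        untagged -= captured_by[best]
    return tags


# Toy data: snp1, snp2 and snp3 are in complete LD; snp4 is only weakly correlated.
haplotypes = {
    "snp1": [0, 0, 1, 1, 0, 1, 0, 1],
    "snp2": [0, 0, 1, 1, 0, 1, 0, 1],
    "snp3": [1, 1, 0, 0, 1, 0, 1, 0],
    "snp4": [0, 1, 0, 1, 0, 1, 0, 1],
}
tags = greedy_tag_snps(haplotypes)
```

In this toy example two tags (`snp1` and `snp4`) are enough to represent all four markers, which is the cost saving the tag SNP approach aims for.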
Haplotype reconstruction represents another fundamental application where SNPs enable the characterization of inherited genetic blocks. Researchers utilize dense SNP maps to identify and analyze haplotype structures, which consist of sets of closely linked alleles that tend to be transmitted together through generations. These haplotype patterns provide insights into population history, demographic events, and evolutionary processes that have shaped contemporary genetic diversity. The International HapMap Project exemplified this application by creating comprehensive maps of common haplotype patterns across diverse human populations.
Linkage disequilibrium analysis forms the theoretical foundation for many SNP-based applications in population genetics and disease mapping. This phenomenon describes the non-random association of alleles at different genomic positions, which occurs when variants are inherited together more frequently than would be expected by chance alone. The extent of linkage disequilibrium between SNPs depends primarily on physical distance along chromosomes and local recombination rates, with closer variants generally showing stronger associations. Understanding these patterns enables researchers to predict which SNPs will provide redundant information and guides the selection of informative markers for association studies.
In genetic epidemiology, SNPs have emerged as essential tools for investigating disease transmission patterns and population structure. Whole-genome sequencing approaches utilize SNP variation to define transmission clusters in infectious disease outbreaks, where cases showing similar genetic profiles may represent linked transmission events. This application has proven particularly valuable for tuberculosis surveillance and contact tracing, where traditional epidemiological methods may fail to identify all transmission links. Additionally, SNP-based analyses contribute to understanding population stratification and ancestry, which are crucial factors in designing appropriate study controls and interpreting association results across diverse ethnic groups.
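As a rough illustration of how SNP variation can define transmission clusters, the sketch below counts pairwise SNP differences between aligned pathogen sequences and links isolates that fall within a distance cutoff. The single-linkage rule, the function names, the toy sequences, and the cutoff of 1 SNP are assumptions made for the example; real surveillance pipelines choose thresholds per pathogen and combine them with epidemiological data.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Number of aligned positions at which two sequences differ."""
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def transmission_clusters(isolates, max_snps):
    """Group isolates by single linkage: a pair within max_snps joins one cluster.

    isolates: dict mapping isolate name -> aligned sequence (hypothetical format).
    """
    parent = {name: name for name in isolates}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(isolates, 2):
        if snp_distance(isolates[a], isolates[b]) <= max_snps:
            parent[find(a)] = find(b)

    clusters = {}
    for name in isolates:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy alignment: A, B and C form a chain of single-SNP steps; D is distant.
isolates = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "ACGAACGTAT",
    "D": "TTGAAGGTCC",
}
clusters = transmission_clusters(isolates, max_snps=1)
```

Here A and C differ by two SNPs but still end up in the same cluster because each is within one SNP of B, which is the behaviour single linkage is meant to capture for transmission chains.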
== Importance ==
Variations in the DNA sequences of humans can affect how humans develop diseases and respond to pathogens, chemicals, drugs, vaccines, and other agents. SNPs are also critical for personalized medicine. Examples include biomedical research, forensics, pharmacogenetics, and disease causation, as outlined below.
=== Clinical research ===
==== Genome-wide association study (GWAS) ====
One of the main contributions of SNPs to clinical research is the genome-wide association study (GWAS). Genome-wide genetic data can be generated by multiple technologies, including SNP arrays and whole-genome sequencing. GWAS is commonly used to identify SNPs associated with diseases, clinical phenotypes, or traits. Because GWAS is a genome-wide assessment, a large sample size is required to obtain sufficient statistical power to detect all possible associations, particularly since some SNPs have relatively small effects on diseases, clinical phenotypes, or traits. To estimate study power, the genetic model for the disease (dominant, recessive, or additive) needs to be considered. Because of genetic heterogeneity, GWAS analyses must be adjusted for ancestry (population stratification).
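A single-SNP association test and the three genetic models mentioned above can be sketched as follows. This is a minimal example under stated assumptions, not a full GWAS pipeline: genotypes are coded as the number of copies of the tested allele (0, 1, or 2), the test is a plain 1-degree-of-freedom chi-square on allele counts in cases versus controls, and no adjustment for ancestry or multiple testing is included. The function names and the toy genotypes are hypothetical.

```python
import math


def encode_genotype(allele_count, model):
    """Code a genotype (0, 1 or 2 copies of the tested allele) under a genetic model."""
    if model == "additive":
        return allele_count                   # effect scales with allele copies
    if model == "dominant":
        return 1 if allele_count >= 1 else 0  # one copy is enough
    if model == "recessive":
        return 1 if allele_count == 2 else 0  # two copies required
    raise ValueError(f"unknown model: {model}")


def allelic_chi_square(case_genotypes, control_genotypes):
    """Chi-square test (1 df) on the 2x2 table of allele counts.

    Returns (chi2, p_value). Rows are cases/controls, columns are tested/other allele.
    """
    a = sum(case_genotypes)                   # tested alleles in cases
    b = 2 * len(case_genotypes) - a           # other alleles in cases
    c = sum(control_genotypes)                # tested alleles in controls
    d = 2 * len(control_genotypes) - c        # other alleles in controls
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0                       # table is degenerate, no evidence
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy data for one SNP: the tested allele is more common in cases than in controls.
cases = [2, 2, 1, 1, 1, 2, 1, 2, 1, 1]
controls = [0, 1, 0, 0, 1, 0, 1, 0, 0, 1]
chi2, p_value = allelic_chi_square(cases, controls)
```

In a real genome-wide study this test would be repeated for every SNP, which is why the resulting p-values are judged against a much stricter significance threshold than in a single-marker analysis.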
==== Candidate gene association study ====
Candidate gene association studies were commonly used in genetic research before the invention of high-throughput genotyping and sequencing technologies. A candidate gene association study investigates a limited number of pre-specified SNPs for association with diseases, clinical phenotypes, or traits, so it is a hypothesis-driven approach. Since only a limited number of SNPs are tested, a relatively small sample size is sufficient to detect an association. The candidate gene approach is also commonly used to confirm GWAS findings in independent samples.
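One way to see why testing fewer SNPs lowers the sample size requirement is to compare per-test significance thresholds. The sketch below assumes a Bonferroni correction, which the text above does not name, and uses illustrative test counts (one million SNPs for a genome-wide scan, ten for a candidate gene study).

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# Illustrative numbers only: both test counts are assumptions for the example.
gwas_threshold = bonferroni_threshold(0.05, 1_000_000)   # about 5e-8
candidate_threshold = bonferroni_threshold(0.05, 10)     # about 5e-3
```

Under these assumptions each candidate SNP has to clear a threshold roughly 100,000 times less strict, so an effect of the same size can reach significance with fewer samples.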
==== Homozygosity mapping in disease ====
Genome-wide SNP data can be used for homozygosity mapping, a method for identifying homozygous autosomal recessive loci. It can be a powerful tool for mapping genomic regions or genes that are involved in disease pathogenesis.
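The core step in homozygosity mapping, finding long stretches of consecutive homozygous SNPs, can be sketched as below. This is a simplified illustration: genotypes are assumed to be ordered by chromosomal position and coded as allele counts (0 or 2 for homozygous, 1 for heterozygous), and the minimum run length of 4 SNPs is a toy value. Practical tools typically also tolerate occasional genotyping errors and missing calls and work with physical or genetic distance rather than SNP counts.

```python
def runs_of_homozygosity(genotypes, min_snps):
    """Return (start, end) index pairs, inclusive, of homozygous runs of at least min_snps.

    genotypes: allele counts (0, 1 or 2) for one individual, ordered by position.
    """
    runs = []
    start = None                              # index where the current run began
    for i, g in enumerate(genotypes):
        if g in (0, 2):                       # homozygous call extends or starts a run
            if start is None:
                start = i
        else:                                 # heterozygous call ends the current run
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    # Close a run that reaches the end of the chromosome segment.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes) - 1))
    return runs


# Toy genotypes for one individual along a chromosome segment.
genotypes = [1, 0, 0, 2, 2, 0, 1, 2, 0, 1, 0, 0, 0, 0]
runs = runs_of_homozygosity(genotypes, min_snps=4)
```

Regions that appear as homozygous runs in several affected individuals, but not in unaffected relatives, would then be candidates for harboring a recessive disease locus.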
==== Methylation patterns ====
Preliminary results have identified SNPs as important components of the epigenetic program in organisms. Moreover, studies across European and South Asian populations have revealed the influence of SNPs on the methylation of specific CpG sites. In addition, methylation quantitative trait locus (meQTL) enrichment analysis using GWAS databases has demonstrated that these associations are important for the prediction of biological traits.
=== Forensic sciences ===
SNPs have historically been used to match a forensic DNA sample to a suspect, but this use has been made obsolete by advances in STR-based DNA fingerprinting techniques. However, the development of next-generation sequencing (NGS) technology may create more opportunities for SNPs to provide phenotypic clues such as ethnicity, hair color, and eye color with a good probability of a match. This information can also increase the accuracy of facial reconstructions by supplying details that would otherwise be unknown, and it can help identify suspects even without an STR DNA profile match. One drawback of SNPs relative to STRs is that each SNP yields less information than an STR, so more SNPs must be analyzed before a suspect's profile can be created. SNPs also rely heavily on the availability of a database for comparative analysis of samples. However, for degraded or low-volume samples, SNP techniques are an excellent alternative to STR methods. Compared with STRs, SNPs offer an abundance of potential markers, can be fully automated, and may reduce the required fragment length to less than 100 bp.
=== Pharmacogenetics ===
Many drug-metabolizing enzymes, drug targets, and target pathways can be influenced by SNPs. SNPs that affect drug-metabolizing enzyme activity can change drug pharmacokinetics, while SNPs that affect a drug target or its pathway can change drug pharmacodynamics. SNPs are therefore potential genetic markers for predicting drug exposure or treatment effectiveness. A genome-wide pharmacogenetic study is called pharmacogenomics. Pharmacogenetics and pharmacogenomics are important in the development of precision medicine, especially for life-threatening diseases such as cancers.
=== Disease ===
Only a small number of SNPs in the human genome may have an impact on human diseases. Large-scale GWAS have been conducted for the most important human diseases, including heart diseases, metabolic diseases, autoimmune diseases, and neurodegenerative and psychiatric disorders.
== Examples ==