Human genetic clustering

Human genetic clustering refers to patterns of relative genetic similarity among human individuals and populations, as well as the wide range of scientific and statistical methods used to study this aspect of human genetic variation.

Methods
A wide range of methods have been developed to assess the structure of human populations using genetic data. Early studies of within-group and between-group genetic variation used physical phenotypes and blood groups, whereas modern genetic studies use genetic markers such as Alu sequences, short tandem repeat polymorphisms, and single nucleotide polymorphisms (SNPs), among others. Models for genetic clustering also vary by the algorithms and programs used to process the data. Most sophisticated methods for determining clusters can be categorized as model-based clustering methods (such as the algorithm STRUCTURE) or multidimensional summaries (typically through principal component analysis). Both approaches process a large number of SNPs (or other genetic marker data), though in different ways, and tend to converge on similar patterns: each identifies similarities among SNPs and/or haplotype tracts that reveal ancestral genetic similarities.

Where model-based clustering characterizes populations using proportions of presupposed ancestral clusters, multidimensional summary statistics characterize populations on a continuous spectrum. The most common multidimensional statistical method used for genetic clustering is principal component analysis (PCA), which plots individuals along two or more axes (their "principal components"), each an aggregation of genetic markers, chosen to account for the highest variance. Clusters can then be identified by visually assessing the distribution of the data; with larger samples of human genotypes, data tend to cluster in distinct groups as well as in admixed positions between groups.

The creators of STRUCTURE originally described the algorithm as an "exploratory" method to be interpreted with caution, not as a statistical test with significant power.
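As a rough illustration of the PCA approach described above, the following sketch projects a genotype matrix onto its first two principal components. It is a minimal example under stated assumptions, not the pipeline of any particular study: the data are simulated, and the helper names `simulate_genotypes` and `pca_scores`, the two-population drift model, and all parameter values are hypothetical choices made for this example.

```python
import numpy as np


def simulate_genotypes(rng, n_per_pop=50, n_snps=500, drift=0.1):
    """Simulate toy genotypes (0, 1, or 2 allele copies) for two populations.

    Assumption for illustration only: each population's allele frequencies
    are the shared ancestral frequencies plus independent Gaussian noise.
    """
    ancestral = rng.uniform(0.1, 0.9, size=n_snps)
    blocks, labels = [], []
    for pop in range(2):
        freqs = np.clip(ancestral + rng.normal(0.0, drift, size=n_snps), 0.05, 0.95)
        # Each individual carries two alleles per SNP, drawn at the population frequency.
        blocks.append(rng.binomial(2, freqs, size=(n_per_pop, n_snps)))
        labels += [pop] * n_per_pop
    return np.vstack(blocks).astype(float), np.array(labels)


def pca_scores(genotypes, n_components=2):
    """Return per-individual PC coordinates and the variance fraction per PC."""
    centered = genotypes - genotypes.mean(axis=0)  # center each SNP column
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)
    return scores, explained[:n_components]


rng = np.random.default_rng(0)
G, labels = simulate_genotypes(rng)
scores, explained = pca_scores(G)

# Individuals can now be plotted by (scores[:, 0], scores[:, 1]) and
# inspected visually for clusters, as the text describes.
for pop in (0, 1):
    print(f"population {pop}: mean PC1 = {scores[labels == pop, 0].mean():.2f}")
```

In this toy setting the two simulated populations separate along the first component because their allele frequencies differ at many SNPs at once. Real analyses typically add steps this sketch omits, such as per-SNP normalization and filtering of linked markers.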
Notable applications to human genetic data
Modern applications of genetic clustering methods to global-scale genetic data were first marked by studies associated with the Human Genome Diversity Project (HGDP) data. These studies contributed to theories of the serial founder effect and early human migration out of Africa, and clustering methods have been notably applied to describe admixed continental populations. Genetic clustering and HGDP studies have also contributed to methods for, and criticisms of, the genetic ancestry consumer testing industry. A number of landmark genetic cluster studies have been conducted on global human populations since 2002.
Genetic clustering and race
Clusters of individuals are often geographically structured. For example, when clustering a population of East Asians and Europeans, each group will likely form its own respective cluster based on similar allele frequencies. In this way, clusters can correlate with traditional concepts of race and with self-identified ancestry; in some cases, such as medical questionnaires, the latter variables can be used as a proxy for genetic ancestry where genetic data are unavailable. Researchers are careful to emphasize that ancestry, revealed in part through cluster analyses, plays an important role in understanding risk of disease. But racial or ethnic identity does not perfectly align with genetic ancestry, and so race and ethnicity do not reveal enough information to make a medical diagnosis. Race as a variable in medicine is more likely to reflect social factors, whereas ancestry information is more likely to be meaningful when genetic ancestry itself is what is being considered.