The power of twin designs arises from the fact that twins may be either identical (monozygotic (MZ), i.e. developing from a single fertilized egg and therefore sharing all of their polymorphic alleles) or fraternal (dizygotic (DZ), i.e. developing from two fertilized eggs and therefore sharing on average 50% of their alleles, the same level of genetic similarity found in non-twin siblings). These known differences in genetic similarity, together with a testable assumption of equal environments for identical and fraternal twins, create the basis for the design of twin studies aimed at estimating the overall effects of genes and environment on a phenotype.

In the accompanying figure, Model A on the left shows the raw variance in height. This is useful as it preserves the absolute effects of genes and environments and expresses them in natural units, such as mm of height change. Sometimes it is helpful to standardize the parameters, so that each is expressed as a percentage of total variance. Because we have decomposed variance into A, C, and E, the total variance is simply A + C + E. We can then scale each parameter as a proportion of this total, i.e., Standardised–A = A/(A + C + E). Heritability is the standardised genetic effect.
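The standardization described above can be sketched in a few lines of Python. The function name `standardize_ace` and the raw variance values are hypothetical and chosen only for illustration; they are not estimates from any study.

```python
def standardize_ace(a, c, e):
    """Express raw A, C and E variance components as proportions of their sum."""
    total = a + c + e
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {"A": a / total, "C": c / total, "E": e / total}


# Hypothetical raw variance components for height (illustrative values only).
raw = {"A": 40.0, "C": 5.0, "E": 5.0}
std = standardize_ace(raw["A"], raw["C"], raw["E"])

# Heritability is the standardized additive genetic component.
heritability = std["A"]
print(std, heritability)
```

The raw components keep their natural units, while the standardized components sum to one and can be compared across traits measured on different scales.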
=== Model comparison ===
A principal benefit of modeling is the ability to explicitly compare models: rather than simply returning a value for each component, the modeler can compute confidence intervals on parameters and, crucially, can drop and add paths and test the effect via statistics such as the AIC. Thus, for instance, to test for predicted effects of family or shared environment on behavior, an AE model can be objectively compared to a full ACE model. For example, we can ask of the figure above for height: can C (shared environment) be dropped without significant loss of fit? Alternatively, confidence intervals can be calculated for each path.
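As a rough illustration of such a comparison, the following Python sketch computes the AIC (−2 log-likelihood plus twice the number of free parameters) for an ACE and an AE model, together with a naive likelihood-ratio test on one degree of freedom. The fit values and parameter counts are hypothetical, and the p-value ignores the boundary corrections that are sometimes applied when a variance component is tested against zero.

```python
import math


def aic(minus2ll, n_params):
    """Akaike information criterion from -2 log-likelihood and parameter count."""
    return minus2ll + 2 * n_params


def lrt_p_value_1df(delta_minus2ll):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(delta_minus2ll / 2.0))


# Hypothetical fit results (illustrative values, not from real data).
ace = {"minus2ll": 4000.0, "n_params": 4}  # mean plus A, C and E paths
ae = {"minus2ll": 4000.6, "n_params": 3}   # same model with the C path dropped

aic_ace = aic(**ace)
aic_ae = aic(**ae)

# Dropping C costs one parameter; the change in fit is tested on 1 df.
delta = ae["minus2ll"] - ace["minus2ll"]
p = lrt_p_value_1df(delta)

preferred = "AE" if aic_ae < aic_ace else "ACE"
print(aic_ace, aic_ae, p, preferred)
```

With these made-up numbers the loss of fit from dropping C is small relative to the parameter saved, so the lower AIC favours the AE model.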
=== Multi-group and multivariate modeling ===
Multivariate modeling can give answers to questions about the genetic relationship between variables that appear independent. For instance: do IQ and long-term memory share genes? Do they share environmental causes? Additional benefits include the ability to deal with interval, threshold, and continuous data; to retain full information from data with missing values; and to integrate the latent modeling with measured variables, be they measured environments or, now, measured molecular genetic markers such as SNPs. In addition, models avoid the constraint problems of the crude correlation method: all standardized parameters will lie, as they should, between 0 and 1. Multivariate and multiple-time-wave studies, with measured environments and repeated measures of potentially causal behaviours, are now the norm. Examples of these models include extended twin designs, simplex models, and growth-curve models.

SEM programs such as OpenMx and other applications suited to constraints and multiple groups have made the new techniques accessible to reasonably skilled users.
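The constraint problem of the crude correlation method can be seen with Falconer's formulas, which estimate the standardized components directly from the MZ and DZ twin correlations. The sketch below is illustrative only; the correlations are hypothetical and the function name `falconer` is an assumption of this example.

```python
def falconer(r_mz, r_dz):
    """Falconer-style estimates of standardized A, C and E from twin correlations."""
    a2 = 2.0 * (r_mz - r_dz)   # additive genetic proportion
    c2 = 2.0 * r_dz - r_mz     # shared-environment proportion
    e2 = 1.0 - r_mz            # unique-environment proportion
    return a2, c2, e2


# Hypothetical correlations where the DZ correlation is less than half the MZ one.
a2, c2, e2 = falconer(r_mz=0.8, r_dz=0.3)
print(a2, c2, e2)

# The three estimates still sum to one, but c2 is negative, which is not a
# meaningful proportion of variance.
out_of_bounds = [name for name, v in (("A", a2), ("C", c2), ("E", e2))
                 if v < 0.0 or v > 1.0]
print(out_of_bounds)
```

A fitted model, by contrast, can constrain each variance component to be non-negative, so that the standardized estimates remain interpretable as proportions.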
=== Modeling the environment: MZ discordant designs ===
As MZ twins share both their genes and their family-level environmental factors, any differences between MZ twins reflect E: the unique environment. Researchers can use this information to understand the environment in powerful ways, allowing epidemiological tests of causality that are otherwise typically confounded by factors such as gene–environment covariance, reverse causation and confounding.

An example of a positive MZ discordant effect is shown below on the left. The twin who scores higher on trait 1 also scores higher on trait 2. This is compatible with a "dose" of trait 1 causing an increase in trait 2. Of course, trait 2 might also be affecting trait 1. Disentangling these two possibilities requires a different design (see below for an example). A null result is incompatible with a causal hypothesis.

Take for instance the case of an observed link between depression and exercise (see the figure above, on the right). People who are depressed also report doing little physical activity. One might hypothesise that this is a causal link: that "dosing" patients with exercise would raise their mood and protect against depression. The next figure shows what empirical tests of this hypothesis have found: a null result.
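The logic of the MZ discordant design can be sketched as a regression of within-pair differences in one trait on within-pair differences in the other. Because both twins in a pair share genes and family environment, these cancel in the differences. The data and the function name `within_pair_slope` below are hypothetical and serve only to illustrate the computation.

```python
def within_pair_slope(pairs):
    """Regress within-pair differences in trait 2 on differences in trait 1.

    Each pair is ((trait1_twin1, trait2_twin1), (trait1_twin2, trait2_twin2)).
    The regression is through the origin, since twin order is arbitrary.
    """
    dx = [t2[0] - t1[0] for t1, t2 in pairs]
    dy = [t2[1] - t1[1] for t1, t2 in pairs]
    sxx = sum(x * x for x in dx)
    if sxx == 0:
        raise ValueError("no pair is discordant on trait 1")
    return sum(x * y for x, y in zip(dx, dy)) / sxx


# Hypothetical MZ pairs: (hours of exercise per week, depression score).
pairs = [
    ((3.0, 10.0), (5.0, 11.0)),
    ((2.0, 14.0), (2.5, 13.0)),
    ((6.0, 8.0), (4.0, 9.0)),
    ((1.0, 12.0), (4.0, 12.5)),
]
slope = within_pair_slope(pairs)
print(slope)
```

A slope near zero means that the twin who exercises more is not systematically less depressed than the co-twin, which is the pattern expected under a null result; a clearly negative slope would be compatible with a protective effect.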
=== Longitudinal discordance designs ===
As may be seen in the next figure, this design can be extended to multiple measurements, with a consequent increase in the kinds of information that one can learn. This is called a cross-lagged model (multiple traits measured at more than one time). In the longitudinal discordance model, differences between identical twins can be used to take account of relationships among differences across traits at time one (path A), and then to examine the distinct hypotheses that increments in trait 1 drive subsequent change in that trait (paths B and E) or, importantly, in other traits (paths C and D). In the example, the hypothesis that the observed correlation, in which depressed persons often also exercise less than average, is causal can be tested. If exercise is protective against depression, then path D should be significant, with a twin who exercises more showing less depression as a consequence.
=== Assumptions ===
As can be seen from the modeling above, the main assumption of the twin study is that of equal family environments, also known as the equal environments assumption. A special opportunity to test this assumption arises where parents believe their twins to be non-identical when in fact they are genetically identical. Studies of a range of psychological traits indicate that these children remain as concordant as MZ twins raised by parents who treated them as identical. Molecular genetic methods of heritability estimation have tended to produce lower estimates than classical twin studies because modern SNP arrays do not capture the influence of certain types of variants (e.g., rare variants or repeat polymorphisms), though some have suggested it is because twin studies overestimate heritability. A 2016 study determined that the assumption that the prenatal environment of twins was equal was largely tenable. Researchers continue to debate whether or not the equal environments assumption is valid.
=== Measured similarity: A direct test of assumptions in twin designs ===
A particularly powerful technique for testing the twin method was reported by Visscher et al. Instead of using twins, this group took advantage of the fact that while siblings on average share 50% of their genes, the actual gene-sharing for individual sibling pairs varies around this value, essentially creating a continuum of genetic similarity or "twinness" within families.
=== Sex differences ===
Genetic factors, including both gene expression and the range of gene × environment interactions, may differ between the sexes. Fraternal opposite-sex twin pairs are invaluable in explicating these effects.

In an extreme case, a gene may only be expressed in one sex (qualitative sex limitation). More commonly, the effects of particular alleles may depend on the sex of the individual. A gene might cause a change of 100 g in weight in males, but perhaps 150 g in females, a quantitative gene effect. Environments may affect the ability of genes to express themselves and may do this via sex differences. For instance, genes affecting voting behavior would have no effect in females if females are excluded from the vote. More generally, the logic of sex-difference testing can extend to any defined sub-group of individuals. In cases such as these, the correlations for same-sex and opposite-sex DZ twins will differ, betraying the effect of the sex difference.

For this reason, it is normal to distinguish three types of fraternal twins. A standard analytic workflow would involve testing for sex limitation by fitting models to five groups: identical male, identical female, fraternal male, fraternal female, and fraternal opposite-sex. Twin modeling thus goes beyond correlation to test causal models involving potential causal variables, such as sex.
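A simple way to see whether same-sex and opposite-sex DZ correlations differ is a Fisher z-test on two independent correlations. This is a minimal sketch rather than the multi-group model fitting described above, and the correlations and sample sizes are hypothetical.

```python
import math


def compare_correlations(r1, n1, r2, n2):
    """Fisher z-test for the difference between two independent correlations.

    Returns the z statistic and a two-sided p-value.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical correlations: same-sex DZ pairs versus opposite-sex DZ pairs.
z, p = compare_correlations(r1=0.50, n1=400, r2=0.30, n2=400)
print(z, p)
```

A lower correlation in opposite-sex pairs than in same-sex pairs would be the signature expected if partly different genetic or environmental factors operate in males and females.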
=== Gene × environment interactions ===
Gene effects may often be dependent on the environment. Such interactions are known as G×E interactions, in which the effects of a gene allele differ across different environments. Simple examples would include situations where a gene multiplies the effect of an environment: perhaps adding 1 inch to height in high-nutrient environments, but only half an inch to height in low-nutrient environments. This is seen in different slopes of response to an environment for different genotypes.

Often researchers are interested in changes in heritability under different conditions: in environments where alleles can drive large phenotypic effects (as above), the relative role of genes will increase, corresponding to higher heritability in these environments.

A second effect is gene–environment (GE) correlation, in which certain alleles tend to accompany certain environments. If a gene causes a parent to enjoy reading, then children inheriting this allele are likely to be raised in households with books due to GE correlation: one or both of their parents has the allele and therefore will both accumulate a book collection and pass on the book-reading allele. Such effects can be tested by measuring the purported environmental correlate (in this case books in the home) directly.

Often the role of environment seems maximal very early in life, and decreases rapidly after compulsory education begins. This is observed, for instance, in reading as well as intelligence. This is an example of a G × Age effect and allows an examination both of GE correlations due to parental environments (these are broken up with time) and of GE correlations caused by individuals actively seeking certain environments.
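The link between a G×E interaction and heritability can be illustrated with the height example above. The sketch assumes a single additive locus with allele frequency p, for which the additive genetic variance is 2p(1 − p)a², where a is the effect per allele, and a fixed environmental variance. The allele frequency and environmental variance are assumptions made for the example.

```python
def additive_variance(p, a):
    """Additive genetic variance of one biallelic locus with per-allele effect a."""
    return 2.0 * p * (1.0 - p) * a ** 2


def heritability(v_a, v_e):
    """Proportion of total variance that is additive genetic."""
    return v_a / (v_a + v_e)


P = 0.5     # assumed allele frequency
V_E = 1.0   # assumed environmental variance, in squared inches

# Per-allele effect on height: 1 inch with high nutrients, 0.5 inch with low.
h2_high = heritability(additive_variance(P, 1.0), V_E)
h2_low = heritability(additive_variance(P, 0.5), V_E)
print(h2_high, h2_low)
```

Under these assumptions the same allele explains a larger share of variance in the high-nutrient environment, which is what a higher heritability in that environment means.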
=== Norms of reaction ===
Studies in plants or in animal breeding allow the effects of experimentally randomized genotypes and environment combinations to be measured. By contrast, human studies are typically observational. This may suggest that norms of reaction cannot be evaluated.

However, as in other fields such as economics and epidemiology, several designs have been developed to capitalise on the ability to use differential gene-sharing, repeated exposures, and measured exposure to environments (such as children's social status, chaos in the family, availability and quality of education, nutrition, toxins, etc.) to combat this confounding of causes. An inherent appeal of the classic twin design is that it begins to untangle these confounds. For example, in identical and fraternal twins, shared environment and genetic effects are not confounded, as they are in non-twin familial studies. Twin studies are thus in part motivated by an attempt to take advantage of the random assortment of genes between members of a family to help understand these correlations.

While the twin study tells us only how genes and families affect behavior within the observed range of environments, and with the caveat that often genes and environments will covary, this is a considerable advance over the alternative, which is no knowledge of the different roles of genes and environment whatsoever. Twin studies are therefore often used as a method of controlling at least one part of this observed variance: partitioning, for instance, what might previously have been assumed to be family environment into shared environment and additive genetics, using the experiment of fully and partly shared genomes in twins.

Association studies, for example, allow direct study of allelic effects. Mendelian randomization of alleles also provides opportunities to study the effects of alleles at random with respect to their associated environments and other genes.
=== Extended twin designs and more complex genetic models ===
The basic or classical twin design contains only identical and fraternal twins raised in their biological family. This represents only a sub-set of the possible genetic and environmental relationships. It is fair to say, therefore, that the heritability estimates from twin designs represent a first step in understanding the genetics of behavior.

The variance partitioning of the twin study into additive genetic, shared, and unshared environment is a first approximation to a complete analysis taking into account gene–environment covariance and interaction, as well as other non-additive effects on behavior. The revolution in molecular genetics has provided more effective tools for describing the genome, and many researchers are pursuing molecular genetics in order to directly assess the influence of alleles and environments on traits.

An initial limitation of the twin design is that it does not afford an opportunity to consider both shared environment and non-additive genetic effects simultaneously. This limit can be addressed by including additional siblings in the design.

A second limitation is that gene–environment correlation is not detectable as a distinct effect unless it is added to the model. Addressing this limit requires incorporating adoption models, or children-of-twins designs, to assess family influences uncorrelated with shared genetic effects.
=== Continuous variables and ordinal variables ===
While concordance studies compare traits either present or absent in each twin, correlational studies compare the agreement in continuously varying traits across twins.

== Criticism ==