Twin study designs – Twins Research Australia

The appropriate study design and analytic method always depend on the specific research question and aim. The statistical model depends on whether the outcome is continuous, binary or of another type (e.g., censored survival time, ordinal) and, to a lesser extent, on whether the exposure is continuous or binary. Although the exact study design and analytic approach may be unique to each study, some general classes of study designs involving twins, and some statistical guidelines, are outlined here.

Things to keep in mind

More complex statistical methods are not always better (the simple paired t-test can be very useful in twin studies).
General statistical principles still apply when analysing data from twins and families:
- Explore your data thoroughly first
- Be aware of model assumptions and test these whenever possible (e.g., normality, linearity and equal environments)
- Provide estimates, 95% CIs and p-values
- Start with simple analyses and models, and build on these
- Adjust for measured variables before considering unmeasured effects
- Analyses of continuous outcomes are usually more powerful than those of binary outcomes
Detailed advice should always be sought from a statistician if you are unsure.
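
As an illustration of the paired t-test mentioned above, here is a minimal sketch in Python using only the standard library. The function name and the data are hypothetical, chosen for illustration; a statistical package would normally also report the p-value and 95% CI.

```python
import math
import statistics


def paired_t(twin1, twin2):
    """Return (t statistic, degrees of freedom) for within-pair differences."""
    if len(twin1) != len(twin2):
        raise ValueError("each pair needs one value per twin")
    # The analysis is carried out on the difference within each pair.
    diffs = [a - b for a, b in zip(twin1, twin2)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean within-pair difference.
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / se, n - 1


# Hypothetical data: one outcome value per twin, five pairs.
exposed = [5.1, 4.8, 6.0, 5.5, 5.9]
unexposed = [4.9, 4.9, 5.4, 5.0, 5.6]

t_stat, df = paired_t(exposed, unexposed)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

Because each twin is compared only with his or her co-twin, factors shared by the pair cancel out of the differences.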


Broad classes of twin study designs & statistical packages:

Classic twin study

The classic twin design aims to quantify the roles of genetic and environmental causes of variation in traits and in disease susceptibility.

Estimate correlations rMZ and rDZ
Compare MZ correlation with DZ correlation
Divide total residual variance into components due to:
A = (additive) effects of genes
C = environmental (i.e., non-genetic) factors that are shared by twins in the same pair
E = environmental effects specific to a person
σ² = A + C + E

In 1918, in his mid-20s, a twin called R. A. Fisher famously showed how the correlation between relatives (r) relates to A, C and E. With A, C and E expressed as proportions of the total variance (so that A + C + E = 1):
rMZ = A + C
rDZ = 0.5 A + C

Heritability = % of variation explained by genes
H = A / (A + C + E)
H = 2(rMZ – rDZ), provided H < rMZ
This equation assumes that MZ and DZ pairs share – to exactly the same extent – the non-genetic (environmental) factors specific to the characteristic of interest (C).
If rMZ > rDZ, then genetics might play a role.
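
The two correlation equations above can be solved for A, C and E when these are expressed as proportions of the total variance. The following Python sketch does this arithmetic; the function name and the example correlations are hypothetical, and the result rests entirely on the equal environments assumption described above.

```python
def falconer_estimates(r_mz, r_dz):
    """Solve rMZ = A + C and rDZ = 0.5*A + C for standardised A, C and E."""
    a = 2 * (r_mz - r_dz)  # additive genetic proportion (heritability, H)
    c = 2 * r_dz - r_mz    # proportion due to environment shared within pairs
    e = 1 - r_mz           # proportion due to person-specific environment
    return {"A": a, "C": c, "E": e}


# Hypothetical correlations, for illustration only.
est = falconer_estimates(r_mz=0.6, r_dz=0.4)
for component, value in est.items():
    print(f"{component} = {value:.2f}")
```

A negative estimate of C (which occurs when rMZ is more than twice rDZ) signals that this simple model does not fit the data; this is the situation the condition H < rMZ guards against.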

Analytic approaches

Pearson correlation (a good start but not ideal – the Intraclass Correlation Coefficient is better)
Extensions of linear regression models:
 Variance components models
 Structural equation models
 Biometric models
 Mixed effects models
 Multivariate analyses
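
One reason the intraclass correlation is preferred to the Pearson correlation is that it does not depend on which twin in a pair is labelled "twin 1". Below is a minimal sketch of the one-way analysis-of-variance estimator for pairs, written in plain Python; the function name and data are hypothetical, and in practice a variance components or mixed effects model would usually be fitted instead.

```python
def twin_icc(pairs):
    """One-way ANOVA intraclass correlation for a list of (twin, co-twin) values."""
    n = len(pairs)
    if n < 2:
        raise ValueError("at least two pairs are needed")
    grand_mean = sum(a + b for a, b in pairs) / (2 * n)
    # Between-pair mean square: two twins per pair, n - 1 degrees of freedom.
    msb = 2 * sum(((a + b) / 2 - grand_mean) ** 2 for a, b in pairs) / (n - 1)
    # Within-pair mean square: each pair contributes (a - b)^2 / 2, n degrees of freedom.
    msw = sum((a - b) ** 2 / 2 for a, b in pairs) / n
    return (msb - msw) / (msb + msw)


# Hypothetical measurements on five twin pairs.
pairs = [(170.0, 172.0), (160.5, 159.0), (181.0, 178.5), (165.0, 166.0), (175.5, 174.0)]
swapped = [(b, a) for a, b in pairs]  # relabel the twins within every pair

print(round(twin_icc(pairs), 3))
print(round(twin_icc(swapped), 3))  # same value: twin order does not matter
```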

Advantages (not just heritability!)

Very flexible models
Adjust for exposures and confounders within families
Variation perhaps more important than correlation
Assess age and sex effects on variance and covariance

Limitations of classic twin approach

Equal environments:
- Crucial model assumption
- Can be difficult to test
Low power to detect C effects
ANY excess MZ correlation attributed to genetic effects
Focus on heritability (h²) – other potentially interesting results are ignored
For non-normal outcomes, especially binary traits:
– Lower power
– More difficult to interpret results

References

Fisher, R. (1918). The Correlation Between Relatives On The Supposition Of Mendelian Inheritance. Transactions of the Royal Society of Edinburgh, 52, 399-433.
Hopper, J. L. (2005). Genetic Correlations and Covariances. Encyclopedia of Biostatistics.
Boyd, N. F., Dite, G. S., Stone, J., Gunasekara, A., English, D. R., McCredie, M. R. E., Giles, G., Tritchler, D., Chiarelli, A., Yaffe, M. J. and Hopper, J. L. (2002). Heritability of mammographic density, a risk factor for breast cancer. The New England Journal of Medicine, 347(12), 886-94.

Case-control study

When working with twins discordant for disease, a matched case–control study can be applied.

Twins are matched for:

Age
Genetic factors (perfectly for MZ pairs; on average 50% for DZ pairs)
Non-genetic familial factors (not necessarily to the same degree for MZ and DZ pairs)
Mother, father, uterus and, perhaps, placenta
Sex, if same-sex pairs
Calendar year of birth
Measured factors for which they are the same or similar

Analytic methods

Binary outcome
Conditional logistic regression models
These methods adjust for measured factors
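
For the simplest case, in which there is a single binary exposure and no other covariates, the conditional logistic regression estimate of the odds ratio reduces to a ratio of exposure-discordant pair counts. The Python sketch below shows that special case only; the function name and the counts are hypothetical, and adjusting for measured factors requires fitting the full model in a statistical package.

```python
import math


def matched_pair_odds_ratio(pairs):
    """pairs: (case_exposed, control_exposed) booleans, one tuple per disease-discordant pair."""
    # Only pairs that are also discordant for exposure carry information.
    case_only = sum(1 for case, control in pairs if case and not control)
    control_only = sum(1 for case, control in pairs if control and not case)
    if case_only == 0 or control_only == 0:
        raise ValueError("need exposure-discordant pairs in both directions")
    odds_ratio = case_only / control_only
    # Approximate 95% CI calculated on the log odds ratio scale.
    se_log_or = math.sqrt(1 / case_only + 1 / control_only)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, (lower, upper)


# Hypothetical study: 40 twin pairs discordant for disease.
pairs = (
    [(True, False)] * 12    # only the affected twin was exposed
    + [(False, True)] * 4   # only the unaffected co-twin was exposed
    + [(True, True)] * 10   # both exposed (uninformative)
    + [(False, False)] * 14 # neither exposed (uninformative)
)
odds_ratio, (lower, upper) = matched_pair_odds_ratio(pairs)
print(f"OR = {odds_ratio:.1f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Pairs in which both twins, or neither twin, were exposed do not contribute to the estimate, which is why this design is inefficient for rare exposures.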

Strengths

Matched for both measured and unmeasured factors
Otherwise very similar to standard case–control studies
Less costly and time-consuming than cohort studies

Limitations

Potential recall bias
Inefficient for rare exposures

References

Cockburn, M., Black, W., McKelvey, W. and Mack, T. (2001). Determinants Of Melanoma In A Case-Control Study Of Twins (United States). Cancer Causes & Control, 12(7), 615-25.
Hamilton, A. S., & Mack, T.M. (2003). Puberty and genetic susceptibility to breast cancer in a case-control study in twins. The New England Journal of Medicine, 348(23), 2313-22.
Oliveira, V. C., Ferreira, M. L., Refshauge, K.M., Maher, C.G., Griffin, A.R., Hopper, J. L. and Ferreira, P. H. (2015). Risk factors for low back pain: insights from a novel case-control twin study. The Spine Journal, 15(1), 50-7.

Co-twin control study

This design involves twins discordant for specific environmental factors or exposures, and twins discordant for disease outcomes or measures of morbidity.
Select twin pairs who differ (the most) in exposure, including:
- measured genes (if DZ)
- epigenetic changes (especially if MZ)

Analytic approach

Analyse differences in outcome against differences in exposure
Within- and between-pair models (Carlin et al., 2005; Gurrin et al., 2006)
Conditional logistic regression
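
For a continuous outcome, analysing differences in outcome against differences in exposure amounts to a regression of within-pair outcome differences on within-pair exposure differences. The sketch below fits that line through the origin, so the estimate does not depend on the arbitrary ordering of twins within a pair. The function name and data are hypothetical, and the within- and between-pair models of Carlin et al. (2005) are more general than this.

```python
def within_pair_slope(pairs):
    """pairs: ((exposure, outcome), (exposure, outcome)) for the two twins of each pair."""
    # Within-pair differences remove everything the twins share.
    dx = [t1[0] - t2[0] for t1, t2 in pairs]
    dy = [t1[1] - t2[1] for t1, t2 in pairs]
    sxx = sum(d * d for d in dx)
    if sxx == 0:
        raise ValueError("no pair is discordant for the exposure")
    # Least-squares slope for a line through the origin.
    return sum(x * y for x, y in zip(dx, dy)) / sxx


# Hypothetical data: outcome = pair-specific level + 2 * exposure.
pairs = [
    ((1, 12), (3, 16)),  # pair-specific level 10
    ((0, 5), (2, 9)),    # pair-specific level 5
    ((4, 28), (1, 22)),  # pair-specific level 20
]
slope = within_pair_slope(pairs)
print(slope)
```

The pair-specific levels differ widely in this example, yet they have no influence on the estimate because they cancel within each pair.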

Strengths

Matched for both measured and unmeasured factors
Potential for causal inference
Similar to matched cohort studies:
Advantageous for rare exposures
Might need to cast a wide net
Of 1,300 female twin pairs in our Health and Lifestyle Questionnaire:
- The average difference in pack-years of smoking was 0.5 pack-years
- The average difference in mental health score was 0.08

Limitations

Representativeness: are pairs selected for discordance representative of twins in general?

References

Carlin, J. B., Gurrin, L. C., Sterne, J. A. C., Morley, R. & Dwyer, T. (2005). Regression Models For Twin Studies: A Critical Review. International Journal of Epidemiology, 34, 1089-1099.
Hopper, J. L., & Seeman, E. (1994). The Bone Density Of Female Twins Discordant For Tobacco Use. The New England Journal of Medicine, 330(6), 387-92.
Goldberg, J., & Fischer, M. (2005). Co-twin Control Methods. Encyclopedia of Statistics in Behavioral Science.
Gurrin, L. C., Carlin, J. B., Sterne, J. A. C., Dite, G. S. and Hopper, J. L. (2006). Using Bivariate Models to Understand between- and within-Cluster Regression Coefficients, with Application to Twin Data. Biometrics, 62, 745–751.
Scurrah, K. J., Kavanagh, A. M., Bentley, R. J., Thornton, L. E. and Harrap, S.B. (2015). Socioeconomic position in young adulthood is associated with BMI in Australian families. Journal of Public Health, 38(2), e39-e46.

Intervention study

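This design randomises twins to study arms within a pair (co-twins to different arms), as a pair (co-twins to the same arm) or independently of their co-twin. Randomising within pairs matches the arms for age, sex and genetic susceptibility.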

Cross-over design for balance
Under-used design to date
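
Randomisation within pairs, where one twin of each pair goes to each arm, can be sketched in a few lines of Python. The function name, arm labels, pair identifiers and seed below are hypothetical; a real trial would use its own concealed allocation procedure.

```python
import random


def randomise_within_pairs(pair_ids, seed=None):
    """Allocate one twin of each pair to each arm, in random order."""
    rng = random.Random(seed)  # a seed makes the allocation list reproducible
    allocation = {}
    for pair_id in pair_ids:
        arms = ["treatment", "control"]
        rng.shuffle(arms)  # random order decides which twin receives which arm
        allocation[pair_id] = {"twin_1": arms[0], "twin_2": arms[1]}
    return allocation


# Hypothetical pair identifiers.
allocation = randomise_within_pairs(["P01", "P02", "P03"], seed=2024)
for pair_id, arms in allocation.items():
    print(pair_id, arms)
```

Every pair contributes one twin to each arm, so the arms are balanced for whatever the co-twins share.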

Examples

Response to exercise study
(Green, Marsh et al., The University of Western Australia)
a. Is response to exercise heritable?
b. Does response to exercise depend on type of exercise?
Back pain and insomnia study
(Ferreira et al., The University of Sydney)
a. Does a specific web-based sleep intervention also improve back pain?
b. Twins in each pair randomised to opposite arms (one to placebo, one to treatment)
c. Primary outcome is activity limitation and functional outcome (measured by Patient-specific Functional Scale)
d. The Actiwatch will be used to assess participants’ sleep disturbance

Analytic approach

Extensions of previous methods
Variance components modelling
Linear and logistic regression with adjustment for correlation
Systematic review of current approaches:
(Yelland et al. 2015)
Work in progress (Sumathipala et al. 2016)
Power, sample size etc: Work in progress (Yelland et al. 2016)

Strengths

Matching for genes could be critical
Participation enhanced by pairing
Motivated group

Limitations

Sharing – twins' potential failure to adhere to protocol (including swapping devices).
– “Twins will be asked not to discuss with their co-twins about the intervention they are receiving.”

References

Yelland, L. N., Sullivan, T. R. and Makrides, M. (2015). Accounting For Multiple Births In Randomised Trials: A Systematic Review. Archives Of Disease In Childhood. Fetal And Neonatal Edition, 100, F116–F120.

Longitudinal study

TRA keeps contact with twins, enabling prospective longitudinal studies
Twins were studied when they were adolescents and followed into adulthood
Response was lower when they were in their 20s, but increased when they moved into their 30s
Qualitative study of smoking uptake and committed smoking using ~20 discordant pairs
Twins’ natural life histories are gold for research

Analytic approach

Multilevel models
- Observations on individuals within twin pairs
- Extension of variance components models
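
As a rough sketch of what such a multilevel model can look like, consider a random-intercept model for repeated observations on individuals within twin pairs. The notation is an assumption made for illustration (i indexes pairs, j twins within a pair, k measurement occasions, and t is time or age); it is not taken from any particular study.

```latex
y_{ijk} = \beta_0 + \beta_1 t_{ijk} + u_i + v_{ij} + \varepsilon_{ijk},
\qquad
u_i \sim N(0, \sigma_u^2), \quad
v_{ij} \sim N(0, \sigma_v^2), \quad
\varepsilon_{ijk} \sim N(0, \sigma_\varepsilon^2)
```

Here u_i is shared by both twins in pair i, v_ij is specific to twin j, and the residual varies between occasions. Allowing the variance of u_i to differ between MZ and DZ pairs is one way of linking this model to the variance components models described for the classic twin design.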

Trajectory model
- Estimate growth curve and assess association of classes of curve with outcome

References

Hopper, J. L., Foley, D. L., White, P. A. and Pollaers, V. (2013). Australian Twin Registry: 30 Years Of Progress, Twin Research and Human Genetics, 16, 34-42.
Gatz, M., Harris, J. R., Kaprio, J., McGue, M., Smith, N. L., Snieder, H., Spiro, A. and Butler, D. A. (2015). Cohort Profile: The National Academy of Sciences-National Research Council Twin Registry (NAS-NRC Twin Registry), International Journal of Epidemiology, 44(3), 819-25.
Grantz, K. L., Grewal, J., Albert, P. S., Wapner, R., D’Alton, M. E., Sciscione, A., Grobman, W. A., Wing, D. A., Owen, J., Newman, R. B., Chien, E. K., Gore-Langton, R. E., Kim, S., Zhang, C., Buck Louis, G. M. and Hediger, M. L. (2016). Dichorionic Twin Trajectories: The NICHD Fetal Growth Studies, American Journal of Obstetrics and Gynecology, 215, 221.e1-.e16.
Loke, Y. J., Novakovic, B., Ollikainen, M., Wallace, E. M., Umstad, M. P., Permezel, M., Morley, R., Ponsonby, A. L., Gordon, L., Galati, J. C., Saffery, R. and Craig, J. M. (2013). The Peri/postnatal Epigenetic Twins Study (PETS), Twin Research and Human Genetics, 16(1), 13-20.
Moayyeri, A., Hart, D. J., Snieder, H., Hammond, C. J., Spector, T. D. and Steves, C. J. (2016). Ageing Trajectories in Different Body Systems Share Common Environmental Etiology: The Healthy Ageing Twin Study (HATS), Twin Research and Human Genetics, 19, 27-34.

Statistical packages

Many twin models can be fitted using standard statistical software such as R, Stata and SAS; however, you may also wish to consider the following for your analysis:

Add-ons specifically for family and twin analyses
mestreg (Stata)
Multivariate Event Times (mets) (R)
kinship2 (R)

Software specifically for family and twin analyses
Fisher¹ ²
Mendel³

General multilevel modelling software and add-ons
gllamm
MLwiN
PROC MIXED
Linear Mixed-Effects Models (lme)
Linear and Nonlinear Mixed Effects Models (nlme)
Stata mixed

Structural equation modelling
OpenMx
Mplus

1. Hopper, J. L. & Mathews, J. A. (1994). Multivariate Normal Model For Pedigree And Longitudinal Data And The Software ‘Fisher’. Australian & New Zealand Journal of Statistics, 36(2): 153-76.
2. Lange, K., Weeks, D., & Boehnke, M. (1988). Programs For Pedigree Analysis: MENDEL, FISHER and dGene. Genetic Epidemiology, 5: 471-2.
3. Lange, K., Papp, J.C., Sinsheimer, J.S., Sripracha, R., Zhou, H., & Sobel, E.M. (2013). Mendel: The Swiss Army Knife Of Genetic Analysis Programs. Bioinformatics, 29 (12):1568-70.
