January 29, 1998
This report summarizes the findings of a panel of experts who met at the National Institutes of Health on December 10-11, 1997 at a workshop entitled "The Genetic Architecture of Complex Traits." The report and recommendations were prepared for consideration by the National Advisory General Medical Sciences Council.
Most genetic traits of interest in populations of humans and other organisms are determined by many factors, including genetic and environmental components, which interact in often unpredictable ways. For such complex traits, the whole is not only greater than the sum of its parts; it may be different from the sum of its parts. Thus, complex traits have a genetic architecture that consists of all of the genetic and environmental factors that contribute to the trait, as well as their magnitude and their interactions.
The following recommendations are intended to increase the rate of progress and improve the quality of research on the analysis of complex traits.
- NIH and the scientific community should focus on better acquisition of data, including larger samples and more refined phenotype definitions.
- There is no single simple sampling scheme that should be used to obtain the data for these studies.
- Given the reality that genetic architecture can and will vary as a function of population parameters, collection of data on population structure, including histories for different human populations and for model organisms, is essential and should be supported.
- The richness of current methods should be exploited to its fullest through creative applications to appropriate data sets.
- Every effort should be made to encourage the development of fundamentally new models and more sensitive methods of analysis.
- Because so little is known about genetic architecture, exploratory and observational studies should be encouraged.
- NIGMS should encourage collaborative studies among investigators in diverse disciplines.
- With respect to review and funding of research grant applications, it is important to emphasize that no single method or model for studying genetic architecture can be adopted universally.
- The choice of organism and research design should be dictated by the complex trait of interest and the questions being asked.
- Studies of population structure and variability of organisms in natural populations are needed.
- Support is needed for new database structures for population data.
- The development of publicly available genetic data sets of genetic maps and haplotypes should be encouraged.
- The expertise of computational scientists, including physicists, mathematicians, and engineers, is needed; however, most will need to be retrained in statistical genetics.
- NIH should support training in statistical genetics for scientists who intend to apply the tools of genetic analysis.
- Multidisciplinary training is essential.
- The analysis of complex traits does not lend itself to quick and easy solutions.
- At the same time, it is vitally important to communicate the results of studies to the public and the scientific community in accurate terminology.
Most traits that vary in populations of humans and other organisms are determined by multiple factors. Most common diseases with a genetic component are such complex traits. The complexity arises from the fact that each factor contributes, at most, a modest amount to the total variation in the trait observed in the entire population. Complex traits may be continuous in distribution, like height or blood pressure, or they may be dichotomous, like "well" and "affected." Multiple genetic and environmental factors may interact with each other in unpredictable ways. Such unpredictable, nonlinear interactions mean that the expression of the trait cannot necessarily be anticipated from knowledge of the individual effects of each component factor considered alone, no matter how well understood the separate components may be. Thus, the whole is not only greater than the sum of its parts; the whole may be different from the sum of its parts.
The genetic architecture of complex traits consists of a description of all of the genetic and environmental factors that affect the trait, along with the magnitude of their individual effects and the magnitudes of interactions among the factors. It is, in principle, possible to define the genetic components in terms of Mendelian segregation and location along a genetic map. Environmental factors are much less easily partitioned into separate factors whose individual effects and interactions can be sorted out.
It is critical to recognize that the genetic architecture is less a fundamental biological property of the trait than a characteristic of a trait in a particular population. The genetic architecture is a moving target that changes according to gene and genotype frequencies, the distributions of environmental factors, and such biological properties as age and gender.
The dependence on gene frequencies creates some seeming paradoxes of genetic architecture. For example, suppose a trait is completely determined by the interaction of recessive alleles at two different genes, one allele being rare and the other common. At the population level, the trait appears to be determined by the rare allele, because its presence is what limits the occurrence of the trait among individuals. If the allele frequencies were reversed, the other gene would appear to be the determining genetic cause. But in either instance, both recessive alleles contribute equally to the biological causation of the trait.
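The two-gene example can be made concrete with a small calculation. The sketch below is illustrative only and is not part of the workshop report: it assumes Hardy-Weinberg genotype proportions, independent (unlinked, non-associated) genes, allele frequencies strictly between 0 and 1, and hypothetical gene names A and B with recessive alleles a and b. The trait occurs only in individuals who are homozygous recessive at both genes.

```python
from math import sqrt


def two_locus_recessive(q_a: float, q_b: float) -> dict:
    """Summarize a trait expressed only in aa bb individuals.

    q_a, q_b are the frequencies of the recessive alleles at two
    hypothetical genes A and B (illustrative names, not from the report).
    Assumes Hardy-Weinberg proportions and independence between the genes.
    """
    p_aa = q_a ** 2  # frequency of recessive homozygotes at gene A
    p_bb = q_b ** 2  # frequency of recessive homozygotes at gene B
    prevalence = p_aa * p_bb  # both genotypes are required for the trait

    def corr(p_x: float, p_y: float) -> float:
        # Correlation between a genotype indicator X and the trait X*Y,
        # where X and Y are independent 0/1 variables with means p_x, p_y.
        return sqrt(p_y * (1 - p_x) / (1 - p_x * p_y))

    return {
        "prevalence": prevalence,
        "p_trait_given_aa": p_bb,  # an aa carrier still needs bb
        "p_trait_given_bb": p_aa,  # a bb carrier still needs aa
        "corr_trait_aa": corr(p_aa, p_bb),
        "corr_trait_bb": corr(p_bb, p_aa),
    }


if __name__ == "__main__":
    # Same biology, two populations with the allele frequencies reversed.
    for q_a, q_b in [(0.1, 0.9), (0.9, 0.1)]:
        summary = two_locus_recessive(q_a, q_b)
        print(f"q_a={q_a}, q_b={q_b}")
        for name, value in summary.items():
            print(f"  {name}: {value:.4f}")
```

With a rare a allele and a common b allele, the trait tracks the aa genotype closely (a correlation of about 0.9) and the bb genotype hardly at all (about 0.04). Reversing the frequencies reverses the roles, even though the biological rule, that both genotypes are required, is unchanged.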
The implication of the population dependence is that the predominant genetic factors contributing to a complex trait may seem to differ from population to population. This is probably one reason for the apparent heterogeneity sometimes found in the results of genetic linkage studies in different populations. Insufficient statistical power in the linkage tests is also a possible explanation, and there is always the possibility that superficially identical complex traits in different populations may actually have different biological causes.
The existence of unpredictable, nonlinear interactions between the multiple factors affecting complex traits, as well as possible frequency-dependent differences in genetic architecture from one population to the next, emphasizes one of the principal conclusions of the December 10-11, 1997 workshop, "The Genetic Architecture of Complex Traits." The participants unanimously agreed that understanding the genetic and environmental basis of complex traits is not going to be easy and will not be achieved in a foreseeable time frame. Too little is known about the true nature of the complexity of such traits in any organism.
In an ideal case, when the factors are not numerous, when their main effects are quite large and their interaction effects quite small, and when interpopulation heterogeneity is minimal, very rapid progress can be made. It is by no means clear how widely actual complex traits in humans and other organisms depart from these ideal conditions.
Furthermore, while improved technology can be of tremendous importance, the challenges are not only technological. They are also conceptual (for example, how to identify nonlinear interactions, how to optimize computational algorithms); clinical (improved diagnostic criteria); and epidemiological (how to sample in such a way as to minimize spurious associations due to population structure and population history while maximizing the power to detect biologically significant associations).
Because the genetic architecture is a characteristic of a trait in a population, it is affected by population structure and population history--a fact that undermines the concept of a "disease gene." In a complex trait, there is no "disease gene" in the sense of a Mendelian factor that, by itself, causes a disease. Rather, the genetic and environmental factors underlying a complex trait must each be considered as contributing or predisposing rather than as determinative. Where diseases are concerned, genetic components may be regarded as risk factors.
In spite of these difficulties, the analysis of complex traits is fundamentally important to identifying the contributing genetic and environmental factors of traits and to understanding their underlying biology. The discussion and recommendations from "The Genetic Architecture of Complex Traits" workshop focused on opportunities for progress in four areas--research, resources, training, and communications--some of which can potentially be addressed by NIGMS. Other points will require discussion and action by other groups.
The number of individuals that can be studied will ultimately determine the limit of resolution of analyses of complex traits. The sample sizes that researchers can collect and the quality of individual phenotype assignments (the ability to recognize and correctly assign trait values to individuals) are serious barriers to progress. Given the ongoing efforts to produce a dense, high-quality map of single nucleotide polymorphisms (SNPs) across the human genome, sufficient numbers of markers to analyze complex traits will be available soon. In human studies, therefore, the limiting factor will be the number of correctly phenotyped individuals--that is, our ability to correctly diagnose disorders or completely describe traits. The situation is not quite the same for various model organisms, where the development of new markers and new maps has lagged behind the effort for humans.
- NIH and the scientific community should focus on better acquisition of data, including larger samples and more refined phenotype definitions. This is especially true for studies of specific common diseases for which large samples of individuals who are assessed for disease according to the same criteria are very difficult to acquire.
- There is no single simple sampling scheme that should be used to obtain the data for these studies. Data-rich does not necessarily mean information-rich. The questions being asked should dictate the analyses of choice, the types of data necessary, sampling design, and the final sample size.
- Given the reality that genetic architecture can and will vary as a function of population parameters, collection of data on population structure, including histories for different human populations and for model organisms, is essential and should be supported. These data must include the definition of normal variation within these populations as well as the recording of disease phenotypes. Accumulation of these population data sets will also lead to studies of evolution within the human population and answers to questions about how these traits came to be and how some diseases achieve sufficient population frequency to be common within our population. Because the genetic architecture differs according to how closely the trait is related to survival and reproductive fitness, evolutionary forces can profoundly affect the genetic architecture of complex traits in humans and other organisms.
The consensus of the participants at the workshop was that both current methods and fundamentally new approaches should be pursued aggressively.
- The richness of the current methods should be exploited to its fullest through creative applications to appropriate data sets. The current armamentarium of methods provides numerous ways to model traits, analyze data, and evaluate results for internal consistency and biological reality. Although these techniques and methods will allow us to learn a great deal, we should not hesitate to expand and develop them, to refine the models, and to improve them, especially with respect to computational limitations.
- Every effort should be made to encourage the development of fundamentally new models and more sensitive methods of analysis. Human geneticists have been slow to explore and adopt methods developed in other areas of science. For example, refinement of techniques for more rapid computation should be actively pursued. NIH, in its interaction with theoretical investigators and its review process, should encourage creative efforts even if they are not guaranteed to provide improvement.
- Because so little is known about genetic architecture, exploratory and observational studies should be encouraged. Although the majority of NIH-funded grants are hypothesis-driven, science is, in fact, built on observations. We are sufficiently naïve about complex traits that exploratory studies must be supported.
- NIGMS should encourage collaborative studies among investigators in diverse disciplines. Complex traits are not the province of any single discipline. The expertise of molecular biologists, biochemists, clinicians, evolutionary biologists, developmental biologists, mathematical and statistical geneticists, and many others is needed.
- With respect to review and funding of research grant applications, it is important to emphasize that no single method or model for studying genetic architecture can be adopted universally. No single method can answer all, or even some, of the questions without being used in concert with additional approaches. It is a serious error to insist that all studies apply one method, such as association studies or linkage analysis, when the full suite of methods is available to the investigators. In addition, it is inappropriate to abandon the candidate gene approach in favor of general genome searches for those traits where reasonable biological or physiological candidates can be identified. It is also inappropriate to insist on a candidate gene approach when a whole-genome search is justified by sample size and cost-effectiveness. Significant information has accrued through both the exploration of candidate gene regions and the rejection of candidate gene hypotheses.
The overriding advantage of model organisms is the ability to do both genetic and environmental manipulation that cannot be done with human beings. Studies using animal models to explore the genetic architecture of complex traits should be supported in order to identify general principles and pathways and to gain a broad understanding of the biology of complex traits.
- The choice of organism and research design should be dictated by the complex trait of interest and the questions being asked. There is no single, limited set of organisms that is sufficient for these studies. Studies of non-traditional organisms may have much to contribute.
- Studies of population structure and variability of organisms in natural populations are needed. Because the genetic architecture of complex traits varies with context, measuring traits and identifying their variability in the wild is an important piece of the puzzle, and studies to do so have almost ceased to exist.
- Support is needed for new database structures for population data. A great deal of population data already exists, but most public databases, such as GenBank, are inadequate for handling population data, which must include the population source of the allele sequences and the frequency with which they occur. Many data currently exist on individual investigators' computers, but are not accessible to other scientists because there are no good mechanisms for sharing data. The study of population structure would be greatly enhanced by establishing one or more databases to make these data easily and reliably accessible. Pilot efforts might begin by augmenting the Genome Database (GDB) for human data or FlyBase for Drosophila data.
- The development of publicly available genetic data sets of genetic maps and haplotypes should be encouraged. There are numerous NIH-supported studies that individually have low power to detect and map factors that contribute to complex traits. NIGMS should support the establishment of databases that enable data from such studies to be combined. Further, many successful gene mapping studies generate marker data that could be useful to other investigators and that could be made publicly available once the gene of interest is mapped.
- The expertise of computational scientists, including physicists, mathematicians, and engineers, is needed; however, most will need to be retrained in statistical genetics. One of the impediments to such recruitment is the relatively low salary scale of entry-level postdoctoral positions in biology.
- NIH should support training in statistical genetics for scientists who intend to apply the tools of genetic analysis. Statistical methodologies must be applied knowledgeably, especially where human data are concerned. Many scientists, including clinicians, molecular geneticists, and others, are eager for basic training in analytical methods so that they can collaborate effectively with their statistical colleagues.
- Multidisciplinary training is essential. Studies of complex traits are inherently multidisciplinary, requiring expertise in genetics, statistics, computational biology, and other areas. In human studies, a strong clinical component is often essential because of the need for careful and correct diagnoses. All too often there are structural or institutional barriers to multidisciplinary collaboration and training. Students must be prepared to cross these disciplinary boundaries if they are to succeed and contribute to future research studies.
The genes involved in complex traits are contributing factors rather than "disease genes." Any one of the genetic factors that contribute to a complex trait may actually account for a relatively small proportion of the total variation in the trait. Furthermore, by itself, the gene may not cause the disease, but rather may be one of many contributing genetic and environmental factors. Oversimplification risks misleading the public into thinking that a disease has been conquered and that effective new treatments and therapies are just around the corner. There is a great danger in raising false hopes among the public.
- The analysis of complex traits does not lend itself to quick and easy solutions. We do not yet know the true degree of complexity of complex traits. We hope some will approximate the simplest ideal case and be analyzed rather quickly. We expect others will be much more difficult. It is prudent to make no promises until we understand the nature and extent of complexity in genetic systems.
- At the same time, it is vitally important to communicate the results of studies in accurate terminology. There is no gene for hypertension, depressive disorder, obesity, or any other complex trait. All genes that affect the trait are contributing factors that are more or less important only in relation to other contributing genetic and environmental factors in defined populations. Likewise, terms such as "heritability" are misleading because the common usage differs from the technical definition. In its common usage, "heritability" implies transmissibility, whereas its technical meaning is a ratio of variances. The public, as well as our scientific and medical colleagues, needs current and accurate information in order to understand the issues surrounding the study of complex traits. Scientists should take great care to communicate clearly in order to promote understanding of these difficult and very important issues.
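The "ratio of variances" can be written out explicitly. The expressions below are the standard textbook definitions from quantitative genetics, not formulas given in the workshop report, and the symbols are introduced here for illustration: V_P is the total phenotypic variance in a population, V_G the genetic variance, V_A the additive part of the genetic variance, and V_E the environmental variance. The simple decomposition of V_P assumes no covariance or interaction between genetic and environmental factors, which is exactly the kind of assumption that complex traits may violate.

```latex
\[
V_P = V_G + V_E, \qquad
H^2 = \frac{V_G}{V_P} \quad \text{(broad-sense heritability)}, \qquad
h^2 = \frac{V_A}{V_P} \quad \text{(narrow-sense heritability)}
\]
```

Because each variance is measured in a particular population and set of environments, the ratio describes that population rather than the trait itself, and it says nothing directly about whether the trait will be transmitted from a given parent to a given child.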
The findings of the participants at "The Genetic Architecture of Complex Traits" workshop do not lend themselves to simplistic answers or quick fixes. The recommendations need to be considered thoughtfully and thoroughly, sometimes in collaboration with a broad spectrum of the scientific community. Success will depend on coordination among institutes and agencies and on increased understanding of the complexities of the scientific questions being asked. The participants note that NIGMS can address many issues; however, some recommendations are trans-NIH. The National Advisory General Medical Sciences Council may wish to consider broader dissemination of the report and recommendations.