February 6, 1998
NIGMS convened a workshop on November 24-25, 1997, to provide the Institute with a perspective on the emergence of new conceptual and experimental approaches to the study of complex processes such as genetic circuitry, metabolic regulation, macromolecular assembly, and other such problems at all levels of biological organization. The workshop participants (roster below) provided examples from their own work and that of others and, in discussion, sought to define the common themes, challenges, and requirements of what was provisionally termed "Biological Systems Analysis (BSA)." The participants emphasized that, although there were differing views on approaches to the analysis of complexity in the context of diverse levels of organization in biological systems, a unifying goal could be identified. They suggested several means to achieve this goal.
The goal is to promote the analysis of the design principles and dynamic behaviors of complex biological systems, with the expectation that such an understanding ultimately will impact the treatment of human disorders and disease. If the effort succeeds, these design principles and dynamic behaviors will be presented in quantitative formats that readily allow testing by both computer modeling and in vivo approaches.
Currently, there are a number of projects, some of which were presented by participants, that merit inclusion as BSA. However, quantitative, integrative treatments of classical molecular biological, genetic, cell biological, and biochemical data are relatively novel, with few experienced investigators. The participants suggested that adoption of these quantitative approaches would require a variety of supporting initiatives. These fall into the broad categories of cross-disciplinary and collaborative research projects, educational efforts, and infrastructure support.
In September 1997, members of the National Advisory General Medical Sciences (NAGMS) Council were informed that staff intended to convene an informal workshop on analyzing complex biological systems. This initiative reflects a growing sense among some researchers that investigators are encountering significant new challenges that may go beyond traditional--and even very recently developed--biomedical research approaches. Thus, to continue making progress, investigators may well need fundamentally new strategies, approaches, and tools to identify and understand the design principles and dynamics of complex biological processes.
The workshop convened on November 24-25, 1997, at the National Institutes of Health in Bethesda, MD. Participants included researchers with specialties such as genetics, biochemistry, physiology, the neurosciences, and medicine. Some participants were trained in non-biological disciplines such as mathematics, physics, and engineering, with experience in the analysis of complex systems. One workshop participant, Dr. Susan Henry, who is an NAGMS Council member, was asked to deliver the workshop's recommendations to other members of the Council during its January 1998 meeting.
Defining the Challenge
During the past decade, biomedical researchers have been amassing an enormous volume of valuable data across a wide spectrum of the biological world. This information ranges from detailed molecular descriptions of multicomponent protein systems, including important enzyme-substrate and receptor-ligand complexes, to genomic DNA sequence information for more than a dozen microorganisms as well as extensive DNA sequence information for other microorganisms, plants, and animals. On another level, biologists are also learning a great deal about essential subcellular structures, such as the mitotic apparatus for separating chromosomes and organelles that are responsible for cellular locomotion, and about the way genetically specified programs operate during differentiation and development of specialized cells, tissues, and organs.
Nearly all these efforts reflect a reductive, analytic approach to investigating important biological questions. Typically this approach entails careful, often intensive experiment-based scrutiny of a very limited number of components in a biological system, model building and hypothesis development based on those empirical observations, and further experimentation to test elements of those hypotheses.
Reflecting the value in following this approach, biomedical researchers from a range of disciplines typically have deliberately restricted their analyses to well-defined systems with relatively few components. More recently, however, that expressly narrow approach has been complemented by broader, more comprehensive efforts. In particular, expanded programs to use genomic DNA sequences to identify genes and their regulatory sequences, and to predict the structure and even the function of their encoded proteins, will provide a phenomenal volume of valuable new data about an organism's entire genetic complement.
Equally important, however, these efforts to analyze genomic DNA sequences as well as other large-scale analytical efforts pose an immense challenge to those trying to understand fully what this information means for biology. Thus, useful though it may be, a complete listing of an organism's genetic and structural components is not adequate to describe or explain, much less predict, the behavior of that organism's many complex and varied functions. Many of these functions may result from stochastic rather than fully programmed interactions of genetically specified products. The behavior of the whole may not be inferable from the collective description of individual parts. Much remains to be learned about how seemingly unrelated molecular events can influence the development of a complex phenotype.
Such realizations prompt a series of challenging questions for biologists to address. Those questions revolve around fundamental issues of how biologists conduct their scientific investigations and analyze information. For instance, within such comprehensive data sets, which details are essential and which may safely be disregarded? More important, are there principles to be discovered that will help investigators describe emergent biological properties as they analyze such data sets? If so, how can investigators begin to identify and then effectively deploy those principles?
Other broad questions surfaced in discussion. For example, can investigative teams, with members drawn from different disciplines, begin to develop "hybrid" approaches to studying complex problems--perhaps by combining traditional bottom-up analysis with reverse engineering strategies? Is a new "integrated" and "reiterative"--rather than strictly reductive--approach now needed for studying biological systems?
Such questions suggest some special pragmatic needs, and a new initiative. In addition to supporting the development of interdisciplinary research projects, perhaps the most important need will be to develop investigators who can deal with the inherently multidisciplinary nature of such research. Meeting these training needs may well be as challenging culturally as it will be intellectually. In addition, the need for specialized instruments is anticipated, as is the need for computer software systems that are capable of integrating large volumes of seemingly unrelated data.
Despite the risk that descriptions of specific biological examples might limit the scope of the anticipated NIGMS initiative, workshop participants found these descriptions helpful for defining the initiative's framework. These examples, which are drawn from research on both prokaryotes and eukaryotes, provide a concrete sense of what investigators mean when they refer to complexity in biological systems, but they are not meant to constrain the boundaries of the anticipated initiative.
Consider enteric bacteria such as Escherichia coli and Salmonella typhimurium, both of which have been intensively studied for several decades. Indeed, a great deal of detailed information is available to describe their respective biochemistries, genetics, and physiologies. For example, in 1997, the E. coli genomic sequence was completed and published. Nonetheless, great gaps remain in understanding the behavior of these bacteria, with some of those gaps reflecting phenomena that seem to reach beyond a common-sense understanding of their genetic or physiologic functions.
For instance, fully 1 percent of the S. typhimurium genome is dedicated to genes specifying proteins needed to synthesize vitamin B12. Yet, if those genes are deleted, the mutant cells exhibit no obvious phenotype when grown in culture. Do these genes specify some other function that is needed when S. typhimurium is growing in a more natural setting? And why does this bacterium carry these genes when its close relative E. coli does not? These questions lead to a more fundamental question: What evolutionary strategy underlies the features that distinguish one closely related bacterial species from another?
Microorganisms offer many other examples of biological complexity. Despite decades of intensive study, investigators are far from understanding the transition in Bacillus subtilis from vegetative growth to spore formation. Because this transition seems to involve cellular responses to environmental signals, and not all cells within a seemingly uniform population go through it, knowledge of the bacterium's genomic sequence is unlikely to provide an explanation for how this transition process is initiated. Some broader overview of the regulatory circuitry at work in such cells seems a necessary prerequisite for understanding this and other similarly complex biological processes.
Investigators studying bacterial chemotaxis also face the challenge of understanding how living cells transduce and respond to environmental signals. Although a great deal is known about the genetically determined biochemical apparatus that enables bacterial cells to move up or down a chemical gradient, much remains to be discovered about how the regulatory process functions to produce an appropriate response to information in the environment.
Although the genomic sequence of the yeast Saccharomyces cerevisiae is available, that comprehensive DNA sequence information seems not to explain several genetic and metabolic phenomena peculiar to living yeast cells. For instance, yeast lipid metabolism follows distinctive patterns during different phases of cell growth. However, although the general category of lipid end product seems to be under genetic control, the overall process is also affected by other, more subtle factors, including catabolite repression, conformational changes of proteins, and the physical state of the plasma membrane. How is this information integrated and processed to determine the outcomes of lipid metabolism?
Another element of biological complexity is what some researchers are calling "not-strong" genetic effects--a term that seems to apply to phenomena associated with mating cell signal transduction in yeast. This process involves a complex cascade of biochemical changes among small signal molecules and kinase proteins, whose overall control may reflect subtle interactions between cell types. Because mutant selection methods are typically biased toward components that have strong biological effects, these other more subtle effects usually are overlooked and remain difficult to analyze. However, particularly in the context of interacting network systems within cells, these putative not-strong effects may be essential for fine-tuning those systems.
How can investigators identify and study such phenomena? One promising experimental approach entails producing large arrays of microbial cell colonies, each containing a different mutation as well as a fluorescent marker, and then subjecting those arrays systematically to different physical and chemical perturbations. However, such experiments generate voluminous data sets that are proving challenging to analyze in themselves.
Nonetheless, several workshop participants independently recommended this general approach--namely, of subjecting some biological phenomenon to exhaustive testing in many different environments and under many different conditions. This approach provides a way of examining the "robustness" of the regulation of physiological processes, according to some investigators whose focus is on microorganisms. Moreover, according to others who are working with complex mammalian systems, such as the genes expressed in the embryonic spinal cord, a similar exhaustive approach may furnish insights into multigenic processes during embryonic and early postnatal development.
Differentiation and development certainly are among the biological processes that investigators deem complex and, for now, elusive. Here again, although genetic studies provide essential insights, they apparently do not tell the complete story. For example, a genetically specified structure that is part of the sexual apparatus in the roundworm, Caenorhabditis elegans, gives rise to part of the visual system in the fruit fly, Drosophila melanogaster. In another instance, gene dosage and slight differences among proteins in a multicomponent complex apparently determine whether an individual C. elegans will be male or hermaphroditic. What accounts for these different outcomes?
Many investigators are now producing "knockout" mutants in organisms ranging from bacteria to mice as a way of studying complex, genetically based behaviors. Yet, despite detailed knowledge about the functions of the targeted genes in such knockout mutants, often the resulting phenotype deviates from the one anticipated. Typically, seemingly redundant genes with overlapping functions help explain what happens, raising another fundamental question. Why is there so much apparent "redundancy" among genes?
Moving to a clinical setting, multi-organ failure provides an important example of a complex, poorly understood biological phenomenon that often proves deadly and, even when it can be successfully countered, is very expensive to treat. In this clinical situation, several vital organs begin to move away from healthy homeostasis at or near the same time, and toward a state of severe dysfunction that brings death. Typically, although each organ system is treated separately to try to reverse its dysfunction, negative synergy often occurs among several organ systems, meaning that even heroic efforts to treat one deteriorating system may not prevent the others from entering a downward spiral. This life-threatening clinical phenomenon poses a difficult challenge for investigators seeking to better understand and treat patients who develop this syndrome.
Obstacles and Opportunities
During the past several decades, biomedical investigators have used ever more sophisticated research tools to identify and analyze the functions of the components that make up living cells. Despite many successes, they often have met with frustration when they have tried to describe phenomena that embody biological complexity--that is, functions that map across several organizational dimensions. For example, although understanding a monogenic disorder may prove to be relatively straightforward, understanding a multigenic disorder that affects several potentially interacting biochemical pathways and physiological processes usually is not.
A major source of this frustration is the absence of a common language and of compatible (and accessible) data systems for much of the analysis that is needed. To be sure, there is a standardized naming convention for enzymes, and DNA sequence data sets are relatively easily manipulated. However, gene and gene product nomenclature tends to be idiosyncratic at best, making it difficult to compare potentially common structures between any but the most closely related organisms. Indeed, analysis of homologous structures within different organs of a single species, such as a particular molecular apparatus used during development and morphogenesis in the fruit fly, has been hampered by non-standard nomenclature.
An NIGMS initiative will likely focus on fundamental issues of biological complexity, including questions of complex multigene and gene product interactions, of membrane signal transduction and responses to subtle environmental factors, and of differentiation and development in model systems. There are, however, also opportunities at higher levels of organization, in the clinical setting. In the case of multi-organ dysfunction, and other areas of NIGMS concern such as anesthesiology, pharmacology, and burn research, quantitative insights into the function of complex systems in humans could help to improve health, save lives, and reduce the costs of medical care.
Support of Cross-Disciplinary Research and Collaborative Research Projects
Opportunities exist for the integration of traditional biological approaches with those of physics, mathematics, chemistry, engineering, and computer sciences. These opportunities have resulted, in part, from technological advances that are amplifying our ability to acquire massive and comprehensive datasets of unprecedented resolution in time and space. Furthermore, these opportunities can be found across the scope of science that NIGMS supports, from basic studies with model organisms to clinical studies that impact directly upon the management of human disease. Comparative studies across diverse systems may yield unifying patterns of biological organization and dynamics. The participants therefore encourage NIGMS to announce a program of support for cross-disciplinary and collaborative research projects that will enable us to understand, represent, and predict the behavior of complex biological phenomena.
Support of Educational Efforts
The participants recognized that there are significant educational barriers to the realization of these new research opportunities. One classic barrier is the lack of communication between theorists and their more empirical colleagues within the biomedical community; another is the traditional structure of academic departments, which may not be supportive of cross-disciplinary appointments and formal partnerships. In order for the value of BSA to be appreciated within the traditional biomedical research community, the participants recommend that NIGMS support further workshops and scientific meetings that will publicize the promise and accomplishments of BSA.
A major barrier is the shortage of biomedical scientists who also have the quantitative and computational expertise that must be brought to bear on these research areas. The workshop participants recommended several remedies to address this shortage.
First, physicists, mathematicians, engineers, computer scientists, and other experts with quantitative skills relevant to the analysis of complex processes and complex genetic traits should be encouraged to collaborate with biomedical scientists. One example is the NIGMS initiative (PA-98-024) to provide supplements to the Institute's grants that will allow scientists with these backgrounds to conduct collaborative research projects intended both to develop new approaches to the study of complex systems and to provide experience for non-biologists in working with biological systems.
Second, NIGMS is encouraged to solicit applications for institutional training grants and individual fellowships that will provide relevant cross-disciplinary instruction and research experiences at the pre- and postdoctoral levels. The current NIGMS research training program in Systems and Integrative Biology would be appropriate for institutional training grants at the predoctoral level. Postdoctoral training that emphasizes cross-disciplinary experiences should likewise be encouraged. While the existing supplements program, referenced above, can provide some support, the NIH Individual National Research Service Awards (fellowships) are also an appropriate mechanism. All avenues should be encouraged, perhaps through a program announcement.
Support of Resources and Infrastructure
Technological advances and access to data have been key to opening up opportunities for BSA. However, there are significant barriers to acquiring access to expensive instrumentation and to the development of new software and critical databases. Also, access to, and maintenance of, existing databases currently can be a problem. NIGMS should anticipate that these needs will continue to grow as more high-throughput, systems-wide approaches are invented and are accepted as routine research tools. In order for the biomedical community as a whole to share in these developments, sources of funding and creative approaches to access will be required.
Workshop Participants
Simon, Melvin I., Ph.D.
Division of Biology
California Institute of Technology
Pasadena, CA 91125
Bar-Yam, Yaneer, Ph.D.
New England Complex Systems Institute
17 Cedar Street
Newton, MA 02159
Buchman, Timothy G., M.D., Ph.D.
Professor of Surgery and Anesthesiology
Washington School of Medicine
St. Louis, MO 63110
Chua, Nam-Hai, Ph.D.
Professor and Head
Laboratory of Plant Molecular Biology
The Rockefeller University
New York, NY 10021-6399
Gelbart, William W., Ph.D.
Department of Molecular and Cellular Biology
Cambridge, MA 02138
Hartwell, Leland H., Ph.D.
President and Director
Fred Hutchinson Cancer Research Center
Seattle, WA 98109-1024
Henry, Susan A., Ph.D.
Professor of Biological Sciences
Dean of the Mellon College of Science
Pittsburgh, PA 15217
Leibler, Stanislas, Ph.D.
Department of Physics
Princeton, NJ 08544
McAdams, Harley, Ph.D.
724 Esplanada Way
Stanford, CA 94305
Meyer, Barbara J., Ph.D.
Investigator, Howard Hughes Medical Institute
Professor and Head, Division of Genetics
Department of Molecular and Cell Biology
University of California, Berkeley
Berkeley, CA 94720-3204
Rine, Jasper, Ph.D.
Professor of Genetics
University of California, Berkeley
Berkeley, CA 94720
Roth, John, Ph.D.
Professor of Biology
Department of Biology
University of Utah
Salt Lake City, UT 84124
Savageau, Michael A., Ph.D.
Professor and Chair
Department of Microbiology/Immunology
University of Michigan Medical School
Ann Arbor, MI 48104
Somogyi, Roland, Ph.D.
National Institute of Neurological Disorders and Stroke
National Institutes of Health
Bethesda, MD 20892
Taylor, Lansing D., Ph.D.
Center for Light Microscope Imaging and Biotechnology
Pittsburgh, PA 15213