PSI Pilot Phase Fact Sheet

Crystal structure of a protein with unknown function from Pseudomonas aeruginosa, a disease-causing bacterium. 

Piloting high-throughput structure determination

The Protein Structure Initiative (PSI), which ended in July 2015, was a national effort to assemble a large collection of protein structures in a high-throughput operation. Knowledge gained could help researchers better understand the function of proteins, learn how altered structures can contribute to disease and identify new targets for drug development.

Facts at a Glance

Goal: To develop new approaches and tools needed to streamline and automate the steps of protein structure determination, and to incorporate those methods into high-throughput pipelines that use DNA sequence information to generate three-dimensional protein structure models

Project period: September 2000 to June 2005

Funding: $270 million (funded largely by the National Institute of General Medical Sciences, with additional support from the National Institute of Allergy and Infectious Diseases)

Number of Centers: 9

Solved protein structures: More than 1,100

Unique structures solved (structures sharing less than 30 percent of their sequence with other known proteins): More than 700

Selected Technical Advances

  • The “Sesame” Laboratory Information Management System allows users to enter, process, view and extract relevant data from any location using a series of Web-based applications.
  • Auto-induction protocols allow automatic induction of bacterial protein production. These protocols produce 10 times more protein per volume of culture than traditional methods.
  • Systems based on fusions of a target protein with a fluorescent tag have been developed to evaluate whether the target has folded properly when expressed in cells or in vitro and to determine whether the target is present in soluble form. These methods can be used to engineer proteins to improve their folding and solubility.
  • A fully integrated robotic crystallization system can set up a 96-well plate every 2 minutes, giving it a maximum throughput of 2,880 different crystallization experiments per hour.
  • Storage and crystal imaging units can quickly image 96 wells of a crystallization plate.
  • Small-volume crystallization chips, now widely adopted by crystallographers, screen conditions efficiently and speed crystal growth.
  • Incorporation of a wheat germ cell-free expression system holds the promise of increasing the production of proteins from higher organisms.
  • Software for X-ray crystallographic structure determination can carry out fully automated determination of three-dimensional protein structures from X-ray diffraction data.
  • Automatic crystal mounting and crystal screening robots use computational processes to screen crystals for quality or to collect multiple data sets contiguously.
  • Linking various pieces of lab equipment, a bar code writer and a personal digital assistant through a wireless computer network allows for inexpensive, small-scale automation of a lab environment and can replace the old-fashioned laboratory notebook.
  • An automated, fully integrated NMR data analysis platform pulls together the complete process of protein NMR structure determination and analysis, and archives raw NMR data and intermediate results.
  • Automated post-structure functional analysis software is used to search a three-dimensional structure against databases of three-dimensional structural templates and identify functionally important motifs. A Web interface for the software has been designed, and it can be used to obtain a summary of a protein’s most likely function.
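
The throughput quoted above for the robotic crystallization system follows from simple arithmetic. The short sketch below reproduces it, assuming one crystallization experiment per well and uninterrupted operation; the function and constant names are illustrative only and do not come from any PSI software.

```python
# Figures taken from the fact sheet: one 96-well plate set up every 2 minutes.
WELLS_PER_PLATE = 96
MINUTES_PER_PLATE = 2


def experiments_per_hour(wells_per_plate: int, minutes_per_plate: float) -> float:
    """Maximum experiments per hour, assuming one experiment per well."""
    plates_per_hour = 60 / minutes_per_plate
    return wells_per_plate * plates_per_hour


if __name__ == "__main__":
    # 30 plates per hour x 96 wells per plate
    print(experiments_per_hour(WELLS_PER_PLATE, MINUTES_PER_PLATE))  # 2880.0
```

This is an upper bound: any pause for plate loading or maintenance would lower the real figure.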

Pilot Centers

  • Berkeley Structural Genomics Center focused on two bacterial species with extremely small genomes to study proteins essential for independent life.
    Principal investigator: Sung-Hou Kim, Lawrence Berkeley National Laboratory
  • Center for Eukaryotic Structural Genomics, based in Wisconsin, focused on protein production, characterization and structure determination from Arabidopsis thaliana, a plant that is frequently used in laboratory research and that has many genes in common with humans and animals.
    Principal investigator: John Markley, University of Wisconsin, Madison
  • Joint Center for Structural Genomics, based in California, focused on novel structures from thermophilic microorganisms and on human proteins thought to be involved in cell signaling.
    Principal investigator: Ian Wilson, The Scripps Research Institute
  • Midwest Center for Structural Genomics, based in Illinois, selected bacterial targets related to disease and proteins from all three kingdoms of life. The emphasis was on previously unknown folds and on proteins from disease-causing organisms.
    Principal investigator: Andrzej Joachimiak, Argonne National Laboratory
  • New York Structural Genomics Research Consortium solved protein structures for disease-related proteins from eukaryotes and bacteria.
    Principal investigator: Stephen K. Burley, Structural GenomiX, Inc.
  • Northeast Structural Genomics Consortium, based in New Jersey, focused on target proteins from various model organisms, including the fruit fly, yeast and roundworm. It used both X-ray crystallography and NMR spectroscopy.
    Principal investigator: Gaetano Montelione, Rutgers University
  • The Southeast Collaboratory for Structural Genomics, based in Georgia, determined structures from the prokaryotic model organism Pyrococcus furiosus and the eukaryotic model organism C. elegans, as well as some human proteins.
    Principal investigator: Bi-Cheng Wang, University of Georgia
  • Structural Genomics of Pathogenic Protozoa Consortium, based in Washington, solved protein structures from organisms known as protozoans, many species of which cause deadly diseases such as sleeping sickness, malaria and Chagas' disease.
    Principal investigator: Wim G. J. Hol, University of Washington
  • TB Structural Genomics Consortium, based in New Mexico, analyzed protein structures from Mycobacterium tuberculosis.
    Principal investigator: Thomas Terwilliger, Los Alamos National Laboratory


NIGMS supports basic biomedical research that is the foundation for advances in the diagnosis, treatment and prevention of disease. NIGMS is part of the National Institutes of Health, U.S. Department of Health and Human Services.

Content revised July 2015