Joint Meeting of the Research Center Directors and the Protein Structure Initiative Advisory Committee (PSIAC)

PSI Annual Meeting

National Institute of General Medical Sciences
National Institutes of Health

December 7-8, 2006

The Protein Structure Initiative is now in the second year of its second phase. Four large-scale centers have been established to focus on the acquisition of a large collection of protein structures and the continued improvement of structure-determination pipelines. Six specialized centers have also been selected to tackle problem areas that currently limit high-throughput structure determination of "difficult" proteins. The directors (Principal Investigators) of all the centers assembled at the annual meeting on December 7, 2006, to review accomplishments to date and outline plans for the future. The PSI Advisory Committee met the following day in closed session to evaluate the reports and to discuss plans for the future.

Large-scale Research Centers

Progress in all four of the large-scale centers, reviewed by Principal Investigators Stephen Burley, Andrzej Joachimiak, Guy Montelione and Ian Wilson, continues to be highly impressive. This bodes well for the overall success of PSI-2. Each of the large-scale centers is currently depositing approximately 130-170 new structures per year in the Protein Data Bank (PDB). These structures are being selected, insofar as possible, from larger protein families for which no structure is currently known. This should increase the benefit of the solved structures to the broader community and also increase the likelihood that representative structures will be available from protein families of medical interest. The structures being determined are representative in size and complexity of protein structures in general. For example, one of the centers reported that the average size of the structures it has determined is 49 kDa and that the number of amino acids per polypeptide chain ranges from 97 to 867. Of these structures, 79% were "unique" in the sense that no close relative (more than 30% sequence identity) was in the PDB at the time that the structure was deposited. A number of examples were given of proteins whose structures have been determined through the PSI and which have intrinsic biological and health relevance.

Specialized Centers

The directors of the specialized centers, Lance Stewart, John Markley, George DeTitta, Robert Stroud, Thomas Terwilliger and Wayne Hendrickson, gave brief reports on the progress made in their respective centers. At this stage it is still too early to assess the likely impact of these initiatives. A better impression of the progress being made should be obtained next spring (March-May, 2007), when members of the PSIAC are expected to make site visits to both the large-scale and specialized centers.

"Ancillary" Benefits of the Protein Structure Initiative

The stated objective of the Protein Structure Initiative is "to make the three-dimensional atomic level structures of most proteins easily available from knowledge of their corresponding DNA sequences". To achieve this, the PSI is in the process of determining a library of broadly representative three-dimensional structures of proteins. It is becoming increasingly apparent, however, that the initiative will also provide technological, material and informational benefits to the structural and biological community at large. One such benefit, discussed by a number of the speakers, is the value of "salvage approaches" in studying proteins of interest. If one simply chooses a given protein of interest and tries to clone, express, purify, crystallize and determine its structure, PSI experience shows that the likelihood of success is perhaps 3% for eukaryotic proteins and up to 15% for prokaryotic ones. This low success rate is a problem not only for the Protein Structure Initiative but for investigators in general.

There are, in principle, a number of approaches that can be tried to improve the overall likelihood of success. These include, among many others, (1) the use of different expression vectors and strains, (2) the use of proteolytic digestion to identify stable folded regions of the protein, (3) the use of engineered truncations to achieve the same end, (4) the use of reductive methylation to improve solubility and crystallizability, and (5) the use of engineered surface charge-change substitutions to achieve the same result. By systematic "data mining" of experience from the PSI it is becoming possible to anticipate the likely effectiveness and cost-benefit of these different approaches. Use of surface mutations, for example, appears to be quite promising in crystallizing recalcitrant proteins. Reductive methylation, on the other hand, appears to have a lower success rate (about 7%) but can be tested at minimal cost and can also be fully automated. The use of pressure-induced folding and refolding has been suggested as a way to increase the tractability of insoluble proteins but has been evaluated and rejected by at least one of the PSI groups. The use of small-molecule ligands to systematically improve crystallization is currently being tested. As the results of these large-scale, controlled studies become available, they will be of enormous benefit not only to the PSI but to the overall community of molecular biologists and bench scientists.

Another benefit to the community at large will come from the PSI Materials Repository, which is being set up at Harvard under the directorship of Joshua LaBaer. As described by Dr. LaBaer, the repository is highly automated and will make clones and other materials from the PSI immediately available to NIH-supported and other investigators. As this repository expands in scope, its benefit will grow in proportion.

The modeling center directors, Adam Godzik and Roland Dunbrack, reported on their progress on benchmark experiments and server construction, as well as on plans for further development of high-resolution refinement methods.

PSI-2 Network Steering Committee: Target Selection

Wayne Hendrickson (Chair) and other members of the PSI-2 Network Steering Committee discussed progress on target selection. Most of the largest Pfam sequence families with no known structures have been apportioned among the four large-scale centers, and good progress is being made in determining representative structures from each family. It is becoming clear, however, that there are some families (MEGA-families) which have hundreds of thousands of members. An active subject of discussion is the strategy for target selection within such very large families. Should the structures of several representatives within the family be determined (i.e., not just a single family member)? It is understood that such representatives will have amino acid sequences that are not more than 30% identical. Other than this requirement, how should multiple family members be chosen? No definite plan of action was proposed, pending further discussion.
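
The 30% identity requirement on its own can be pictured with a small sketch. The Python below is an illustration only and is not a procedure described at the meeting: it assumes the family members have already been aligned to a common length (gaps written as "-"), and it uses a simple greedy pass in which a sequence is kept only if it is no more than 30% identical to every representative already chosen. The function names, the toy sequences and the greedy strategy are all hypothetical.

```python
from typing import Dict, List


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: both strings come from the same alignment and so have the
    same length. Columns with a gap in either sequence are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def pick_representatives(family: Dict[str, str],
                         max_identity: float = 30.0) -> List[str]:
    """Greedy sketch: keep a member only if it is at most `max_identity`
    percent identical to every representative chosen so far.

    The result depends on the order of `family`; this is one of the open
    choices the 30% rule alone does not settle.
    """
    chosen: List[str] = []
    for name, seq in family.items():
        if all(percent_identity(seq, family[c]) <= max_identity for c in chosen):
            chosen.append(name)
    return chosen


# Hypothetical toy family of pre-aligned 10-residue sequences.
toy_family = {
    "A": "MKTAYIAKQR",
    "B": "MKTAYIAKQL",  # 90% identical to A, so it is skipped
    "C": "MKWGGGGGGG",  # 20% identical to A, so it is kept
    "D": "MKWGGGGGGP",  # 90% identical to C, so it is skipped
    "E": "WWWWWWWWWW",  # at most 10% identical to A and C, so it is kept
}

print(pick_representatives(toy_family))  # ['A', 'C', 'E']
```

The sketch shows why the requirement leaves the question open: many different sets of representatives satisfy the same 30% threshold, and which set results depends on choices (here, simply the input order) that the threshold does not determine.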

Plans for the PSI Knowledgebase

The original RFA for the PSI-2 included plans for a knowledgebase to serve as a centralized information analysis and dissemination center. A proposal has been circulated by S.K. Burley, W.A. Hendrickson, A. Joachimiak, G. Montelione and I. Wilson for implementation of the PSI Knowledgebase (copy attached). This proposal is strongly supported by each of the large-scale centers and is intended to establish the Structural Genomics Knowledgebase without further delay. The proposal was discussed at length both in the open meeting on December 7 and the closed meeting of the Advisory Committee on December 8, 2006.

The Protein Structure Initiative Advisory Committee unanimously supports this proposal. As has been noted in previous reports, the Advisory Committee remains concerned about the ongoing need to make the results of the PSI more transparent and more available to the biological and biomedical community at large. It is critical that the PSI Knowledgebase help address this need, and do so as soon as possible. In this context the Committee suggested that one of the modules of the proposed PSI Knowledgebase be renamed, or a new module added, to focus specifically on outreach.

There was extensive discussion regarding the role, the responsibilities and the authority of the proposed Chief Operations Officer (COO). Assuming that the initiative is approved by NIGMS Council, the COO needs to be appointed and funded in such a way that he/she is independent of, and does not feel obligated to, any one of the large-scale center directors.

Report submitted by Protein Structure Initiative Advisory Committee Chair Brian Matthews.