Clearinghouse for Training Modules to Enhance Data Reproducibility

In January 2014, NIH launched a series of initiatives to enhance rigor and reproducibility in research. As part of these initiatives, NIGMS, along with nine other NIH institutes and centers, issued a funding opportunity announcement (FOA), RFA-GM-15-006, to develop, pilot and disseminate training modules to enhance data reproducibility. This FOA was reissued in 2018 (RFA-GM-18-002).

For the benefit of the scientific community, we will be posting the products of grants funded by these FOAs on this website as they become available. In addition, we are sharing here other relevant training modules, including courses developed with administrative supplements to NIGMS predoctoral T32 grants.

NIH Rigor and Reproducibility Training Modules

These modules, developed by NIH, focus on integral aspects of rigor and reproducibility in the research endeavor, such as bias, blinding and exclusion criteria. The modules are not meant to be comprehensive, but rather are intended as a foundation to build on and a way to stimulate conversations, which may be facilitated by the accompanying discussion materials. Currently, the modules are being integrated into NIH training activities.

NIH Office of Disease Prevention (ODP) Course on Pragmatic and Group-Randomized Trials in Public Health and Medicine

 

This 7-part online course aims to help researchers design and analyze group-randomized trials (GRTs). It includes video presentations, slide sets, suggested reading materials, and guided activities. The course is presented by ODP's Director, Dr. David M. Murray.

Access the course and related materials

Statistical Topics for Reproducible Animal Research

Andrew W Brown and David B Allison, Indiana University School of Public Health-Bloomington; Tapan Mehta and Stephen Watts, University of Alabama at Birmingham, R25 GM116167

Preclinical research involving animal models can be improved when appropriate experimental, analytical, and reporting practices are used. We produced a series of animated vignettes with quantitative experts and laboratory scientists discussing aspects of study design, interpretation, and reporting. Each vignette introduces viewers to key concepts that can stimulate the important conversations needed between quantitative experts and laboratory scientists to enhance rigor, reproducibility, and transparency in preclinical research.

Access the vignettes

The BD2K Guide to the Fundamentals of Data Science Series

Arthrobacter arilaitensis Re117 genome atlas. Credit: Wikimedia Commons. 

The Big Data to Knowledge (BD2K) Initiative presents this virtual lecture series on the data science underlying modern biomedical research. Since it began in September 2016, the webinar series has featured presentations from experts across the country covering the basics of data management, representation, computation, statistical inference, data modeling, and other topics relevant to "big data" in biomedicine. The series provides essential training at an introductory, overview level. All presentations are streamed live, recorded, and posted online for future viewing and reference. The videos are also indexed in the Educational Resource Discovery Index (ERuDIte) of the BD2K Training Coordinating Center (TCC). The series is a collaboration between the TCC, the NIH Office of the Associate Director for Data Science, and the BD2K Centers Coordination Center (BD2KCCC).

Access the seminar series
View archived seminars

Cell Line Authentication Training

Leonard Freedman, Global Biological Standards Institute® (GBSI), R25 GM116155
Multiphoton fluorescence image of cultured HeLa cells with a fluorescent protein targeted to the Golgi apparatus (orange), microtubules (green) and counterstained for DNA (cyan). Credit: National Center for Microscopy and Imaging Research.

GBSI and its partners have developed an exportable "active learning" training module to reduce cell line misidentification, mislabeling, and contamination. This module contains highly interactive training units, including "back to the lab" exercises that turn learning into practice by sending trainees back into the laboratory to practice their skills. By emphasizing the importance of authenticating cultured cell lines, the module aims to improve the credibility, reproducibility, and translation of preclinical research.

Access the course and related materials

Improving reproducibility of computational microbiome analyses

Patrick Schloss, University of Michigan School of Medicine, R25 GM116149
 

This is a series of 14 tutorials on improving the reproducibility of data analysis for researchers in microbial ecology. Although the materials focus on issues in microbiome research, the principles are broadly applicable to other areas of microbiology and science. The lessons focus on the importance of command line practices (e.g., bash), scripting languages (e.g., mothur, R), version control (e.g., git), automation (e.g., make), and literate programming (e.g., R Markdown). These are the tools that a growing number of microbiome researchers use to help improve the reproducibility of their research. If you complete the activities in the tutorials, you will be listed on the Reproducible Research Tutorial Honor Roll, which provides a certification of your training.

Access the Tutorials

Improving Reproducibility in Research

Aaron Carroll, Indiana University School of Medicine, R25 GM116146

To promote better training and ensure the reliability and reproducibility of research, we developed a series of webisodes (thematically related online videos) targeted at graduate students, postdoctoral fellows, and beginning investigators that address critical features of experimental design and analysis/reporting.

 

Module 1: Experimental Design Learning Module
The Experimental Design Learning Module focuses on the intricacies of designing research that is robust, with an eye towards making it reproducible. It comprises four distinct learning units: 1) Replication, 2) Randomization, 3) Pitfalls with Experimental Design, and 4) Measurement. Each of these learning units has one or more sub-topics, each of which is the subject of an individual webisode.

Module 2: Analysis/Reporting Learning Module
The Analysis/Reporting Learning Module covers the various factors that are critical to writing about research with enough clarity to ensure its reproducibility. Very few researchers are given formal education on how to properly report findings to support reproducibility. The module comprises three distinct learning units: 1) Power and P-values, 2) Scientific Writing, and 3) The Review Process. Each of these learning units has one or more sub-topics, each of which is the subject of an individual webisode.

Society for Neuroscience Rigor and Reproducibility Training Webinars

Manny DiCicco-Bloom, Rutgers University; Cheryl Sisk, Michigan State University, R25 DA041326
 
Webinar 1: Improving Experimental Rigor and Enhancing Data Reproducibility in Neuroscience
Post-Webinar Discussion Questions

The topics of scientific rigor and data reproducibility have been increasingly covered in the scientific and mainstream media, and they are being addressed by publishers, professional organizations and funding agencies. This webinar addresses topics of scientific rigor as they pertain to preclinical neuroscience research.

 
Webinar 2: Minimizing Bias in Experimental Design and Execution
Post-Webinar Discussion Questions

Investigations into the lack of reproducibility in preclinical research often identify unintended biases in experimental planning and execution. This webinar covers random sampling, blinding and balancing experiments to avoid sources of bias.

 
Webinar 3: Best Practices in Post-Experimental Data Analysis
Post-Webinar Discussion Questions

Proper data handling standards, including appropriate use of statistical tests, are integral to rigorous and reproducible neuroscience research. Training in quantitative neuroscience is a specific area of emphasis for the BRAIN Initiative, and rigorous statistical analysis methods are included in the recent Proposed Principles and Guidelines for Reporting Preclinical Research. This webinar covers best practices in post-experimental data analysis.

 
Webinar 4: Best Practices in Data Management and Reporting
Post-Webinar Discussion Questions

Efforts to enhance scientific rigor, reproducibility and robustness critically depend on archiving and retrieving experimental records, protocols, primary data and subsequent analyses. In this webinar, presenters discuss best practices and challenges for data management and reporting, particularly when dealing with information security and sensitive material; archiving and disclosure of pre- and post-hoc data analytics; and data management on multidisciplinary teams that include collaborators around the globe.

 
Webinar 5: Statistical Applications in Neuroscience
Post-Webinar Discussion Questions

How can neuroscientists improve their "statistical thinking" and make full and effective use of their data? This webinar covers common applications of statistics in neuroscience, including the types of research questions statistics are best positioned to address, modeling paradigms and exploratory data analysis. The presenters also share examples and case studies from their research.

 
Webinar 6: Experimental Design to Minimize Systemic Biases: Lessons from Rodent Behavioral Assays and Electrophysiology Studies
Post-Webinar Discussion Questions (no longer available)

Common sources of bias in animal behavior and electrophysiology experiments can be minimized or avoided by following best practices of unbiased experimental design and data analysis and interpretation. In this webinar, presenters discuss experimental design and hypothesis testing for mouse behavioral assays, as well as sampling, interpretational bias and referencing in in vitro and in vivo electrophysiology recording studies.

 
Workshop 7: Tackling Challenges in Scientific Rigor: The (Sometimes) Messy Reality of Science

This webinar explores practical examples of the challenges and solutions in conducting rigorous science from neuroscientists at various career stages. It focuses on development of the interpersonal, scientific and technical skills needed to address various issues in scientific rigor, such as what to do when you can't replicate a published result, how to get support from a mentor and how to cope with various career pressures that might affect the quality of your science.

edX Course: Principles, Statistical and Computational Tools for Reproducible Science

Xihong Lin, Harvard School of Public Health, T32GM074897-12S1
 

Learn skills and tools that support data science and reproducible research to ensure you can trust your research results, reproduce them yourself, and communicate them to others.

This free course covers fundamentals of reproducible science, case studies, data provenance, statistical methods for reproducible science, computational tools for reproducible science, and reproducible reporting science. These concepts are intended to translate to fields throughout the data sciences: physical and life sciences, applied mathematics and statistics, and computing.

Consider this course a survey of best practices that will help you create an environment in which you can easily carry out reproducible research and integrate it with the similar environments of your collaborators and colleagues.

Access the course