Enhancing the Diversity of the NIH-funded Workforce Program

Diversity Program Consortium Data Sharing Policy
Approved: March 16, 2016

The NIH has long recognized the need for a biomedical workforce that reflects the most talented scientific minds across all populations. With that in mind, in 2012, it convened the NIH Advisory Committee to the Director (ACD) Working Group on Diversity in the Biomedical Research Workforce to explore strategies to attract, prepare and sustain the interest of individuals in the scientific workforce, including those from underrepresented groups (NOT-OD-15-053). In response to the Working Group’s recommendations, which were endorsed by the ACD, the Enhancing the Diversity of the NIH-funded Workforce Program was established. The program comprises three integrated initiatives: (1) Building Infrastructure Leading to Diversity (BUILD), (2) the National Research Mentoring Network (NRMN) and (3) the Coordination & Evaluation Center (CEC). Through these initiatives, it envisaged a national collaborative in which program awardees would work collectively as a Diversity Program Consortium, and collaboratively with the NIH, to develop novel and effective diversity-driven approaches to infrastructure and faculty development, student engagement, research training and mentorship for each career stage, and to disseminate lessons learned to the broader research training/mentorship communities.

The National Institutes of Health and its Enhancing the Diversity of the NIH-funded Workforce Program awardees, hereafter referred to as the “Diversity Program Consortium,” will develop, implement and evaluate approaches to strengthen institutional capacity to engage and prepare individuals, including those from underrepresented populations, for successful careers in biomedical research. This Data Sharing Policy (“Policy”), which has been developed in conjunction with the Diversity Program Consortium’s awardees and the Executive Steering Committee (ESC), describes how data within the Diversity Program Consortium is collected, shared, stored, and utilized for purposes of the consortium-wide evaluation. The Policy will be incorporated by reference into the terms and conditions of each U54 award.

To facilitate a national cross-site evaluation as well as provide a mechanism for reporting programmatic progress on intended goals, each site within the Diversity Program Consortium will capture, at intervals defined by the ESC, a consistent or core set of individual/student-, institutional/site- and faculty/mentor-level metrics. The scope of the core, consortium-wide data elements, known as the Hallmarks of Success, has been developed by the Diversity Program Consortium and approved by the ESC. The consortium-wide data elements may be altered over time as determined by the ESC. Each site will also capture site-level data elements, in support of a local site evaluation, as determined by each individual site. The program’s cross-site evaluation will include collection of both qualitative and quantitative data elements.


The purpose of this Data Sharing Policy is to establish data collection, tracking and storage coordination requirements, to delineate specific administrative, technical and physical safeguards to assure data security and confidentiality, to describe access to and transfer of data to the Coordination and Evaluation Center (CEC) for use in the Diversity Program Consortium’s evaluation, and to provide a framework for use of Diversity Program Consortium data. This policy also delineates data ownership, rights, and grantee responsibilities. This policy is incorporated by reference into the terms and conditions of each U54 award, and compliance with the terms and conditions is required of all grantees.


This Data Sharing Policy, which governs Diversity Program Consortium data collection, sharing, dissemination, and analysis, will remain in effect for five years after the end of the NIH Diversity Program Consortium awarded funding period (or period of no-cost extension) for each awardee.

2A. Policy Modification

This Policy is subject to change. The Diversity Program Consortium, through the ESC, which includes representation from each awardee and NIH, may propose modifications or extension of this policy for NIH’s review and approval. The original policy shall remain in effect until a revised Policy is approved by NIH.


The Coordination and Evaluation Center (CEC) will be responsible for overseeing implementation of the activities described herein, in collaboration with the ESC and NIH. NIH will be responsible for overseeing the Diversity Program Consortium’s adherence to this Data Sharing Policy.

The NIH is committed to protecting the rights and privacy of those whose information is collected during the conduct of its funded research, and awardees will be responsible for compliance with this Policy, as outlined in the terms and conditions of each U54 award. This Policy is made under the NIH’s authority to conduct and fund research; to provide training/training assistance; to collect information as to the practical application of such research and training activities; to assemble accurate data to evaluate research priorities and scientific opportunities; and to maintain records in connection with these or other agency functions (42 U.S.C. §§ 241 and 282, and 44 U.S.C. § 3101). This Policy incorporates by reference the NIH’s data sharing policies for research and the Family Educational Rights and Privacy Act (FERPA) guidelines on use or disclosure of student educational records in the conduct of research, as amended [See: NIH Sharing Policies and Related Guidance on NIH-Funded Research Resources; Goals of Data Sharing; and Family Educational Rights and Privacy Act (FERPA)].

As delineated in the CEC Funding Opportunity Announcement (RFA-RM-13-015), each “awardee will retain custody of and have primary rights to the data and software developed under these awards, subject to Government rights of access consistent with DHHS, PHS, and NIH policies. The CEC and the consortium will develop plans for data sharing among awardees. All de-identified evaluation-related data will be shared with the NIH at the conclusion of the award.” During the funding period, CEC will have responsibility for management and oversight of the aggregated dataset of consortium-wide data, and the individual sites will retain ownership over the use of site-level data.

All awardees and non-consortium parties (Section 7B) granted access to Diversity Program Consortium data will adhere to responsible data use, security and disclosure provisions as outlined in Section 6, Data Security & Use. Failure to abide by the terms and conditions of this Policy, including data security/disclosure provisions, may result in (i) denial of further access to the Diversity Program Consortium data, (ii) denial of access to NIH-funded resources, and (iii) federal or state penalties. These terms apply to each awardee (CEC, BUILD and NRMN participants). Liability will be aligned with data ownership and rights (see also Section 8 for details regarding how disagreements/disputes will be resolved).


The Diversity Program Consortium is composed of awardees funded under one of the Enhancing the Diversity of the NIH-funded Workforce Program’s initiatives [BUILD, NRMN, CEC].

4A. Data Categories:

  • Consortium-wide Data: Data elements collected from each awardee to provide information required to complete the Consortium Wide Evaluation Plan (CWEP), and reflecting the goals articulated in the cooperative agreement funding opportunity announcement [1] to enable evaluation of intervention effects on outcomes defined by the Hallmarks of Success. Consortium-wide data will also include secondary data, including but not limited to institution records, demographic data, or other existing resources that are collected from all awardees as outlined in the ESC-approved CWEP. Consortium-wide data elements will be submitted by all Member Institutions to the CEC, which will conduct quality review and risk assessment, de-identify data, and provide data for consortium use. Consortium-wide data, when submitted to the CEC, aggregated, and de-identified, is under management and oversight of the CEC on behalf of the ESC (hereafter referred to as DPC Data) [2]. DPC data is accessible to all members of the consortium and is subject to the terms of this Data Sharing Policy. The Publications and Presentations (P&P) Policy, developed by the P&P subcommittee and approved by the ESC, outlines the procedures for consortium-wide data use [3]. See Data Elements for a detailed listing of Consortium-wide data elements.
  • Site-Level Data: Data elements collected by individual sites to evaluate the impact of site-level variables on outcomes of interest to the site. Site-level data includes both consortium-wide data elements (defined by the Hallmarks of Success and the consortium-wide evaluation plan as the data elements collected across all consortium sites) and non-consortium-wide data elements (defined as data collected only at individual sites). Member Institutions retain ownership of the use of site-level data and the publication of site-level analyses. Analyses and publications of site-level data will follow the process for tracking and review outlined in the Publications and Presentations Policy. Once the site-level data is aggregated with data from all sites and de-identified, it becomes classified as consortium-wide data for consortium use (see above) and is subject to the terms of this Data Sharing Policy.
  • Third Party Data: Data collected from BUILD site partner institutions or NRMN subawardees, which can include both consortium-wide data elements and non-consortium-wide data elements. Third party institutions retain ownership of the use of their data unless and until the data is de-identified and aggregated as consortium-wide (see above). Third party data are subject to the terms of this Data Sharing Policy for all consortium-wide data elements, unless otherwise agreed upon in writing between a Member Institution and the Third Party that predates this Data Sharing Policy. In the event that such Third Party agreement does not allow for the sharing of data as described in this Data Sharing Policy, the Member Institution shall attempt to secure permission for the sharing of Third Party Data consistent with the objectives of the Diversity Program Consortium.

4B. Consortium-Wide Data Description

Consortium-wide data, under the following broad categories, will be collected during the funding period by the Diversity Program Consortium: (1) student/mentee, (2) institutional/site, and (3) faculty/mentor [see Data Elements for details regarding data elements to be collected].

  1. Student/mentee: data elements collected by sites/awardees to enable evaluation of intervention effects on student/mentee-level hallmarks and outcomes.
  2. Institutional/site: data elements collected by sites/awardees to enable evaluation of impact of interventions on institutional-level hallmarks and outcomes.
  3. Faculty/mentor: data elements collected by sites/awardees to enable evaluation of impact of activities on faculty/mentor-level hallmarks and outcomes.

Consortium-wide data may include, for example, student-participant characteristics (e.g. information from education records), faculty/mentor characteristics (e.g. time elapsed since degree completion, authorship/publication record, history of NIH vs. other sources of grant funding) and institution characteristics (e.g., geographic location, diversity of faculty/student population, number of grants submitted vs. funded, summary data on trainees enrolled in STEM majors vs. completed degrees in STEM fields) as well as interview and survey-derived data (e.g. demographics). It may also include tracking of student/participant and faculty/mentor participation in online and face-to-face services/resources (e.g., faculty e-mentorship training modules, student e-mentoring sessions).

4C. OMB and IRB Approvals/Clearance

Consortium-wide data collection instruments (e.g. questionnaires, surveys, scripts, interviewer instructions, etc.), processes, and schedules for data collection developed on behalf of the Diversity Program Consortium will adhere to those approved by the Office of Management and Budget (OMB) as well as those of the UCLA-CEC Institutional Review Board (IRB), or other relevant IRB governing the consortium-wide data collection. All site-specific data collection instruments and procedures must adhere to the governance of the local site IRB.


The CEC will provide Diversity Program Consortium awardees with a process for submission of the consortium-wide data, which will include a description of data to be submitted, submission timeline, and access to the secure consortium data repository for the transfer of data. This process will include quality assurance activities to be completed by awardees and quality review to be completed by the CEC. Once consortium-wide data is submitted to the CEC, the CEC will perform quality review, risk assessment and de-identification. Quality review will include confirmation of identifiers for linking with other consortium-wide data, assessment of valid values, explanation for missing data, and completion of logic or skip patterns. Initial quality review will be completed within 10 days unless otherwise agreed upon. Disclosure risk assessment will include review for sensitive and infrequent (rare) data points that could be used to identify individuals. For both quality reviews and disclosure risk assessments, the CEC will work with each institution to resolve any outstanding issues with data quality. Diversity Program Consortium awardees will work collaboratively with CEC to meet Consortium-defined standards of data completeness and quality.
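
The quality review and disclosure risk checks described above can be sketched in Python. This is only an illustration under stated assumptions, not the CEC's actual procedure: the field names, the valid-value lists, the missing-data codes, the skip rule, and the small-cell threshold of 5 are all hypothetical, since the Policy does not specify them.

```python
from collections import Counter

# Hypothetical rules for illustration only; the real valid values and skip
# patterns come from the ESC-approved data elements, not from this sketch.
VALID_VALUES = {"enrolled": {"yes", "no"}, "class_year": {"1", "2", "3", "4"}}
MISSING_CODES = {"refused", "not_collected", "skipped"}  # explained missing data
MIN_CELL_SIZE = 5  # assumed threshold below which a value counts as infrequent


def quality_issues(record, known_ids):
    """Return the quality problems found in one submitted record."""
    issues = []
    # Identifier check: the record must link to a known participant.
    if record.get("cross_site_id") not in known_ids:
        issues.append("identifier cannot be linked")
    # Valid-value and missing-data checks for each coded field.
    for field, allowed in VALID_VALUES.items():
        value = record.get(field)
        if value is None or value == "":
            issues.append(f"{field}: missing without explanation")
        elif value not in allowed and value not in MISSING_CODES:
            issues.append(f"{field}: invalid value {value!r}")
    # Skip pattern (assumed): class_year applies only to enrolled students.
    if record.get("enrolled") == "no" and record.get("class_year") in VALID_VALUES["class_year"]:
        issues.append("class_year: answered but should have been skipped")
    return issues


def rare_values(records, field):
    """Return values of a field that occur fewer than MIN_CELL_SIZE times."""
    counts = Counter(r[field] for r in records if field in r)
    return sorted(value for value, n in counts.items() if n < MIN_CELL_SIZE)
```

A record that returns an empty list from `quality_issues` passes this sketch's checks; values returned by `rare_values` are candidates for the kind of follow-up with the submitting institution that the Policy describes.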

Diversity Program Consortium awardees may choose to share additional site-specific data with other awardees and/or the consortium for various analyses and collaborative research opportunities. The CEC will provide a portal for secure storage of site-level data that will not be shared with Diversity Program Consortium and also a procedure for sharing site-specific data elements, should a site choose to do so.


6A. Data Security

To protect the rights and privacy of individuals whose information is collected within this multisite and collaborative research project, and to ensure the confidentiality of the data to be shared, all parties under this Policy agree to adhere to the following:

  • The CEC Tracker [4] will assign each individual identified as a participant (both students and faculty) a unique nine-digit numeric identification number, hereafter called the cross-site ID, at the time the Diversity Program Consortium awardee submits the roster of participants to the CEC. This will allow the CEC to maintain longitudinal data as each individual progresses through their career. The CEC Tracker will also allow authorized BUILD site administrators to add identifying elements to the Tracker to assist with longitudinal tracking (e.g., site-level identification numbers). Restricted information that identifies participants (such as name, address, student/faculty institutional ID number) will be maintained by the CEC Tracker and managed by the UCLA Computer Technology Research Lab (CTRL). In order to maintain confidentiality of individuals, identifiable information on BUILD participants will only be provided to authorized educational officials at individual awardee institutions; identifiable information will not be shared across BUILD and NRMN sites.
  • The physical security of the CEC’s Tracker will be maintained at the CTRL with multiple levels of security compliant with HIPAA’s data security standards. The Tracker will operate behind a firewall on a private IP space that is inaccessible to regular Internet traffic. Access to the CEC Tracker will require authentication with a virtual private network (VPN) appliance in addition to CEC Tracker web application account verification.
  • All consortium-wide data will be de-identified prior to use in analyses or publications. Individual sites may maintain identifiable information on their own participants and site-specific data, which is subject to the governance of their local IRB.
  • All Diversity Program Consortium awardees will implement, maintain, and use appropriate administrative, technical, and physical security measures to preserve the confidentiality and integrity of physical data to include storage in a secure and locked location; all electronic data collected by the awardee will be stored and maintained in a password-protected directory maintained behind the institutional firewall of the awardee with password access granted only to approved staff or officials or, for shared data, in the secured consortium data online repository.
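
As a rough illustration of the cross-site ID and de-identification provisions above, the following Python sketch assigns each roster entry a unique nine-digit ID and separates restricted identifiers from the analysis record. How the CEC Tracker actually generates IDs is not described in this Policy; random assignment, the function names, and the field names below are assumptions made for the example.

```python
import secrets

# Hypothetical field names; the Policy gives name, address, and institutional
# ID number as examples of restricted identifying information.
RESTRICTED_FIELDS = {"name", "address", "institutional_id"}


def assign_cross_site_id(used_ids):
    """Draw a random nine-digit numeric ID that has not been issued yet."""
    while True:
        # Assumption: IDs run from 100000000 to 999999999 (no leading zero).
        candidate = str(100_000_000 + secrets.randbelow(900_000_000))
        if candidate not in used_ids:
            used_ids.add(candidate)
            return candidate


def register(roster, used_ids):
    """Split a roster into a restricted lookup table and de-identified rows."""
    lookup, deidentified = {}, []
    for person in roster:
        cross_site_id = assign_cross_site_id(used_ids)
        # Restricted identifiers stay in the lookup, keyed by cross-site ID.
        lookup[cross_site_id] = {k: v for k, v in person.items() if k in RESTRICTED_FIELDS}
        # The analysis row keeps only non-restricted fields plus the ID.
        row = {k: v for k, v in person.items() if k not in RESTRICTED_FIELDS}
        row["cross_site_id"] = cross_site_id
        deidentified.append(row)
    return lookup, deidentified
```

Keeping the lookup table separate from the analysis rows mirrors the Policy's split between restricted information held in the Tracker and de-identified data used for analyses. Removing direct identifiers alone does not guarantee de-identification, which is why the Policy also calls for a disclosure risk assessment of rare data points.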

6B. Data Use

Use of the data collected under this Policy is to support evaluation of DPC-funded research, its activities and services. Accordingly, all data collected, reviewed, and aggregated at the consortium level will be de-identified and free of individually identifiable information that would allow linkage to individual participants prior to distribution to consortium members, non-consortium parties and the general public. As an added measure to ensure the integrity and security of the consortium data, the CEC will oversee data requests, release/transfer, and use as approved by the Publications and Presentations Policy from the ESC.

Unauthorized use or disclosure of consortium data should be reported to the CEC and the NIH within one day of discovery. The report should include both corrective actions planned to prevent future unauthorized use/disclosure and efforts to mitigate any adverse effect of the unauthorized use/disclosure. Reports to NIH should be directed to the Program Official, who will implement appropriate review procedures within NIH. Diversity Program Consortium awardees agree not to use or disclose the data except as permitted or required under this Policy and/or required by applicable law.


The NIH and the DPC recognize that data sharing, in multisite and collaborative research, requires compliance with organizational policy, IRB guidelines, and local, State and federal laws and regulations to safeguard participant privacy and ensure data protection. This Policy describes access to various categories of data covered under the Policy. To share DPC data, awardees will use a secure file transfer service over an encrypted connection only. Additionally, physical and/or electronic research record data will be maintained securely and retained for up to 5 years following the end of the program funding period.

7A. Data Access by Consortium Members

Data access for use in consortium-wide publications and/or presentations by consortium members will be managed through the Publications & Presentations (P&P) subcommittee within the Executive Steering Committee and governed by the P&P Policy. Requests for data access must be made using the process outlined by the P&P subcommittee and detailed in the P&P Policy. The P&P subcommittee will review and approve all requests for the use of consortium-wide data in research studies, publications and presentations, and approve the publication and presentation of results from research using consortium-wide data. Once the P&P subcommittee approves proposed data use, the CEC Data Coordination Core will be notified and will provide secure access to requested data. The de-identified data will be released through licensed access via password-protected sections of the CEC public website.

Each site will have a username and password allowing full access to their own site-specific data, which is critical for their internal operations and local presentations. Scholarly activity using site-specific data must be shared with the P&P subcommittee for purposes of tracking and archiving all Diversity Program Consortium publications and presentations.

7B. Data Access for non-Consortium Parties

Access to consortium-wide or site-specific data for parties outside of the Consortium will require the outside party to submit a Data Request Form describing the proposed use of the requested data and identifying a sponsor for the request who is a member of the Consortium. Access to site-specific data by an Outside Party requires sponsorship by the PI of that site. Documentation of Human Subjects Ethics training and Institutional Review Board approval or exemption should be provided with the data request. Outside Parties shall agree to use requested Data only for approved use. All such requests will be reviewed by the P&P subcommittee according to the P&P Policy. In addition, within 24 months of the termination of funding, the de-identified data will be made available for public use, consistent with the NIH data sharing agreement for NIH-funded research, through an open access portal on the CEC public website.
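
The requirements for an outside-party request can be expressed as a simple completeness check. The Python sketch below is hypothetical: the Policy does not define the layout of the Data Request Form, so the `DataRequest` fields and the `missing_requirements` function are assumptions, and the sketch treats the ethics training and IRB documentation as required even though the Policy says they "should be provided."

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataRequest:
    """Hypothetical model of an outside party's Data Request Form."""
    proposed_use: str
    sponsor: str
    site: Optional[str] = None  # set only when site-specific data is requested
    ethics_training_documented: bool = False
    irb_approved_or_exempt: bool = False


def missing_requirements(request, consortium_members, site_pis):
    """List what a request still lacks before it can go to P&P review."""
    problems = []
    if not request.proposed_use.strip():
        problems.append("proposed use not described")
    if request.sponsor not in consortium_members:
        problems.append("sponsor is not a consortium member")
    # Site-specific data must be sponsored by the PI of that site.
    if request.site is not None and site_pis.get(request.site) != request.sponsor:
        problems.append("site-specific data requires sponsorship by that site's PI")
    if not request.ethics_training_documented:
        problems.append("human subjects ethics training not documented")
    if not request.irb_approved_or_exempt:
        problems.append("IRB approval or exemption not documented")
    return problems
```

An empty result means only that the request is complete enough to forward; under the Policy, the decision to grant access still rests with the P&P subcommittee.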

7C. Data Dissemination

The processes for disseminating research findings stemming from data generated by the Diversity Program Consortium will be developed and implemented by the CEC in collaboration with the P&P subcommittee. The overall dissemination plan will include the following elements: timeline (within and external to consortium, during grant funding, post-grant-funding period, etc.); dissemination methods (e.g., presentations at national scientific meetings; publications in peer-reviewed journals or other publications; webinars sponsored by the Consortium; short articles in Consortium-sponsored newsletters/websites; annual or biannual grantee conferences; presentations at community organizations). This dissemination plan will be established by the ESC.


As per the Cooperative Agreement Terms and Conditions, any dispute that arises between awardees and the NIH on scientific/programmatic matters, within the scope of the awards, will be brought to a Dispute Resolution Panel. The panel will be composed of an Executive Steering Committee Awardee designee chosen by the Diversity Program Consortium, one ESC NIH designee, and a third designee, to be chosen by the other two panel members, with expertise in the topic area relevant to the dispute. The panel will convene a meeting and work with the parties to achieve a resolution. To the extent permitted under the terms and conditions of an awardee’s award, an awardee retains the right to appeal an adverse action, beyond this special dispute procedure, in accordance with PHS regulation 42 CFR Part 50, Subpart D and DHHS regulation 45 CFR Part 16.

1. BUILD (RFA-RM-13-016), NRMN (RFA-RM-13-017), CEC (RFA-RM-13-015)

2. DPC data refers to the comprehensive data set comprised of consortium-wide data across all awardee institutions.

3. The Publications and Presentations Policy will articulate a separate and unique process for tracking and review of each type of data (which can range from no review or simple notification to comprehensive review). Further discussion is needed among the ESC to define the process for tracking and review for each category of data (site-level and consortium-level data); this process is therefore still under development.

4. The CEC tracker is a tool developed by the CEC and utilized by the consortium to upload, collect, store, and manage consortium data.

