The Patient-Centered Outcomes Data Repository: Preparing and Depositing Your Dataset

ICPSR

Published: December 10, 2021

This video provides an in-depth exploration of the Patient-Centered Outcomes Data Repository (PCODR), a specialized archive for clinical research data funded by the Patient-Centered Outcomes Research Institute (PCORI). The presentation, delivered by representatives from PCORI and ICPSR (Inter-university Consortium for Political and Social Research), outlines PCORI's comprehensive policy on data management and data sharing. The primary goal is to maximize the utility of PCORI-funded data as a scientific asset, encouraging rigorous secondary use to ultimately improve patient and population health. The speakers detail the policy's core features, the practicalities of depositing data, and the secure mechanisms for accessing it.

The policy emphasizes the systematic creation and preservation of research data and documentation, which reside in the ICPSR-hosted PCODR. Key features include clear expectations for awardees, funding support for data preparation, specified data availability timelines, and a robust data request review process. A central tenet is the deposit of data de-identified in accordance with the HIPAA Privacy Rule, as part of a "full data package" that encompasses not just the analyzable dataset but also the full protocol, metadata, data dictionary, analytic plan, and analytic code. The speakers also highlight the importance of informed consent that permits secondary research use, and PCORI works closely with awardees to align consent processes with the policy.

The video further elaborates on the governance and access mechanisms. Data generators (awardees) enter into a Data Contributor Agreement (DCA) with ICPSR, which establishes their rights and obligations. For data requestors, an independent committee reviews applications based on scientific purpose, contribution to generalizable knowledge, feasibility, and requestor expertise, and re-identification and redistribution are strictly prohibited. Approved requestors sign a Data Use Agreement (DUA) and gain secure access to the microdata via a Virtual Data Enclave at the University of Michigan. The enclave provides analytic software but restricts printing and copying, and all output undergoes a disclosure review to prevent the release of sensitive information. The video also walks through the full process, from pre-deposit kickoff meetings to data curation, long-term preservation (in formats such as ASCII, XML, and PDF), and eventual release in statistical software formats (SAS, SPSS, Stata), and points depositors to resources such as checklists and tutorials.

Key Takeaways:

  • Core Policy Objective: PCORI's data management and sharing policy aims to maximize the scientific utility of funded clinical research data, treating it as a valuable asset for improving patient and population health through rigorous secondary analysis.
  • Comprehensive Data Package: Depositors are required to submit a "full data package" which includes de-identified data (compliant with HIPAA Privacy Rule), the full study protocol, metadata, data dictionary, analytic plan, and analytic code, ensuring comprehensive documentation for secondary users.
  • Informed Consent and De-identification: Adherence to the HIPAA Privacy Rule for data de-identification is paramount. PCORI actively collaborates with awardees to ensure informed consent processes are developed to explicitly allow for secondary research purposes, facilitating broader data utility.
  • Structured Data Governance: Data deposition is governed by a Data Contributor Agreement (DCA) between the awardee and ICPSR, while data access for secondary users is managed through a Data Use Agreement (DUA) with ICPSR, outlining specific terms, conditions, and obligations.
  • Timely Data Availability: Data is made available for request upon the earlier of two triggers: the final research report being posted on PCORI's website, or the primary research results being published in a peer-reviewed journal. This timing balances dissemination with researchers' publication needs.
  • Rigorous Data Request Review: An independent committee, including data scientists, clinical researchers, PCORI staff, and patient representatives, evaluates data requests based on criteria such as scientific purpose, contribution to generalizable knowledge, feasibility, and the requestor's expertise.
  • Secure Data Access Model: Approved data users do not receive direct copies of the microdata. Instead, they gain credentials to a secure Virtual Data Enclave hosted at the University of Michigan, which provides analytic software but restricts printing, copying, and pasting, with all analytical outputs subject to a disclosure review.
  • Exemption Process for Unique Cases: The policy includes provisions for exemptions in situations where full compliance is not feasible, such as with proprietary data or specific informed consent limitations. Awardees must provide a written explanation for a case-by-case review by an internal PCORI team.
  • Financial Support for Data Preparation: PCORI provides specific funding, typically not exceeding $75,000, and incorporates milestones into existing awards to support the personnel costs associated with preparing and depositing the full data package.
  • Professional Data Curation and Preservation: After submission, data undergoes a multi-stage process including completeness review, professional curation (organization, quality checks), and long-term preservation in standard formats like ASCII for quantitative data and XML/PDF for documentation.
  • Transparency in Data Utilization: The PCODR promotes transparency by publicly listing the names of approved data users and publishing summaries of findings derived from the accessed data, fostering accountability and showcasing research impact.
  • Comprehensive Depositor Resources: ICPSR offers a suite of resources to assist awardees, including checklists for data preparation and HIPAA Privacy Rule compliance, a comprehensive roadmap of the deposit process, and tutorial videos for using the online data submission tool.
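
The deposit requirements and the availability rule in the takeaways above can be expressed as a small pre-deposit self-check. The following is a minimal sketch in Python, not an ICPSR or PCORI tool: the component labels, the `missing_components` and `availability_date` helpers, and the example dates are all hypothetical and chosen only for illustration.

```python
from datetime import date
from typing import Iterable, Optional

# Components of the "full data package" as listed in this summary.
# The short labels are hypothetical, not official ICPSR field names.
REQUIRED_COMPONENTS = (
    "deidentified_data",
    "study_protocol",
    "metadata",
    "data_dictionary",
    "analytic_plan",
    "analytic_code",
)


def missing_components(provided: Iterable[str]) -> list[str]:
    """Return required components absent from `provided`, in checklist order."""
    have = set(provided)
    return [name for name in REQUIRED_COMPONENTS if name not in have]


def availability_date(report_posted: Optional[date],
                      results_published: Optional[date]) -> Optional[date]:
    """Earlier of the two triggers described above; None if neither has occurred."""
    triggers = [d for d in (report_posted, results_published) if d is not None]
    return min(triggers) if triggers else None


if __name__ == "__main__":
    # Hypothetical draft deposit that is still missing three components.
    draft = ["deidentified_data", "study_protocol", "data_dictionary"]
    print(missing_components(draft))
    # Hypothetical trigger dates: report posted in June, paper published in March.
    print(availability_date(date(2022, 6, 1), date(2022, 3, 15)))
```

A check like this only confirms that each named component is present; it says nothing about whether the files themselves meet the repository's completeness review.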

Tools/Resources Mentioned:

  • Patient-Centered Outcomes Data Repository (PCODR)
  • ICPSR (Inter-university Consortium for Political and Social Research)
  • MyData account (for online data deposit)
  • Checklist for preparing your data for deposit
  • HIPAA Privacy Rule Checklist (for de-identifying data)
  • Roadmap (overview of the deposit process)
  • Online tutorial video (for depositing data)
  • Virtual Data Enclave (for secure data access)

Key Concepts:

  • Patient-Centered Outcomes Research Institute (PCORI): A non-profit organization that funds comparative clinical effectiveness research and mandates data sharing.
  • Patient-Centered Outcomes Data Repository (PCODR): The specific data repository established by PCORI and hosted by ICPSR for archiving and sharing PCORI-funded clinical research data.
  • Full Data Package: A comprehensive collection of research materials required for deposit, including the de-identified dataset, study protocol, metadata, data dictionary, analytic plan, and analytic code, ensuring reproducibility and usability.
  • De-identified Data: Health information from which specific identifiers have been removed, in accordance with the HIPAA Privacy Rule, to protect individual privacy while allowing for research use.
  • Data Contributor Agreement (DCA): A legal contract between the data generator (PCORI awardee) and ICPSR that outlines the terms and conditions for depositing data into the PCODR.
  • Data Use Agreement (DUA): A legal contract between an approved data requestor and ICPSR that specifies the terms, conditions, and responsibilities for accessing and using restricted data from the PCODR.
  • Virtual Data Enclave: A secure, remote computing environment that allows approved researchers to analyze sensitive microdata without downloading or removing it from the host institution, which strengthens data security and reduces the risk of unauthorized re-identification.
  • Disclosure Review: A process applied to all outputs generated within the Virtual Data Enclave to ensure that no potentially re-identifiable or sensitive information is inadvertently released.
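
As an illustration of the kind of pre-deposit screening that de-identification involves, the sketch below flags column names that suggest direct identifiers. It rests on several assumptions and is not the HIPAA Privacy Rule Checklist that ICPSR provides: the keyword list is a small hypothetical subset loosely based on identifier categories such as names, contact details, record numbers, and dates, and the `flag_identifier_columns` helper is invented for this example. Matching on column names cannot establish that a dataset is de-identified; it can only prompt a closer manual review.

```python
import re

# Hypothetical keyword hints: a small illustrative subset, not the full
# set of HIPAA identifier categories and not ICPSR's checklist.
IDENTIFIER_HINTS = ("name", "address", "phone", "email", "ssn", "mrn", "birth", "zip")


def flag_identifier_columns(columns):
    """Return columns whose name contains any hint as a whole token.

    Names are lowercased and split on non-alphanumeric characters, so
    "patient_name" matches "name" but "PatientName" does not.
    """
    flagged = []
    for col in columns:
        tokens = re.split(r"[^a-z0-9]+", col.lower())
        if any(hint in tokens for hint in IDENTIFIER_HINTS):
            flagged.append(col)
    return flagged


if __name__ == "__main__":
    # Hypothetical column names from a study dataset.
    columns = ["patient_name", "age_group", "zip_code", "treatment_arm", "Email"]
    print(flag_identifier_columns(columns))
```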