Data Access

Terms of Use

Acknowledgement of Source

The Data User agrees to explicitly acknowledge the PanProstate Cancer Group in any oral presentation, abstract, peer-reviewed manuscript, or preprint that utilises this dataset.

Standard Citation Format

Any publication utilising this data must include the following standard citation text in the "Acknowledgements" or "Methods" section:

"Some of the data used in this study was provided by the PanProstate Cancer Group (http://panprostate.org/)."
Additionally, the Data User shall cite the primary descriptor publication: [Insert Data Release Citation].

Data Versioning Disclosure

Because variant annotations and other data are updated dynamically, the Data User must specify the exact data release version and access date within their publication's supplementary methods (e.g., "PPCG Data Release vX, accessed October 2026").

Research Use Only Limitation

The genomic variant data provided is intended strictly for research purposes. The data has not been cleared, approved, or certified by the FDA, EMA, or any other regulatory body for clinical diagnostics or therapeutic decision-making.

Alignment with Original Patient Consent

The Data User acknowledges that the genomic data was gathered under specific Institutional Review Board (IRB) or Independent Ethics Committee (IEC) approved protocols and informed consent forms signed by the clinical participants. The Data User agrees to restrict their data utilisation strictly within the bounds of those protocols and consent forms. See the publication for more details.

Local Ethics Approval Requirements

The Data User certifies that their specific research protocol utilising this dataset has been reviewed and approved (or granted an explicit waiver) by their own institution’s IRB, IEC, or equivalent human subjects protection committee. Documentation of this local approval must be maintained by the Data User and provided to the Data Provider immediately upon request.

Compliance with International Ethical Frameworks

The Data User agrees to conduct all research utilising this dataset in strict accordance with recognised international ethical standards, including but not limited to the Declaration of Helsinki, the CIOMS International Ethical Guidelines, and the local data privacy regulations governing genetic data (e.g., GDPR, HIPAA, or equivalent national legislation).

Non-Re-identification

The Data User explicitly agrees not to attempt to identify, contact, or re-identify any individual tissue donor or participant from whom the cancer variant data was derived. This includes, but is not limited to, the cross-referencing of somatic or germline variant files (VCFs/BAMs) with public genealogical databases, voter registries, or external clinical data repositories.

Prohibition of Stigmatisation

The Data User shall not use the genomic variant data to generate claims, algorithms, or publications that promote racial, ethnic, or geographic stigmatisation. Research findings must accurately differentiate between somatic mutations (acquired in tissue) and germline variants (inherited) to prevent unintended demographic discrimination.

Somatic Variants and Baseline Clinical Data

Baseline Clinical Data and Somatic Variants (including Single Nucleotide Variants, Insertions & Deletions, Structural Variants, and Copy Number Alterations) are available here.

Whole Genome Sequencing Data

Whole Genome Sequencing data is available from the European Genome-phenome Archive (EGAS00001002876).

Access to Germline Variants, New Datasets/Cohorts and Full Clinical Data

Access to germline variants, to new datasets/cohorts that are under development, and to the full clinical data is by application to the PPCG consortium. Please complete this concept form and send it to ppcg@icr.ac.uk.