CanDLe Data

What CanDLe data is available? Where is the data stored and accessed?

The CanDLe project has ethics approval and enables the Cancer Institute NSW to create two linked datasets (CanDLe 1 and CanDLe 2 Women Screen), based on two cohorts:

  1. CanDLe 1
    A primary cohort of people diagnosed with or treated for cancer in NSW from the NSW Cancer Registry, and people in a subset of the NSW Admitted Patient Data Collection (APDC) with a malignant condition in the diagnosis code.

  2. CanDLe 2 Women Screen
    All women in NSW who participated in either breast or cervical cancer screening, all women who were diagnosed with or treated for cancer in NSW from the NSW Cancer Registry, and all women in a subset of the NSW Admitted Patient Data Collection (APDC) with a malignant condition in the diagnosis code. 

The population health and administrative data collections that are included are shown below:


 


Variable lists and data dictionaries

The CanDLe datasets include over 200 listed variables (XLS). For further details on the variables please see the data dictionaries for each dataset below:



Where is the CanDLe data stored and accessed?

CanDLe datasets are stored, accessed and analysed in a secure environment, which enables appropriate monitoring and control of data access and use. Currently, CanDLe data is available in two approved secure environments:

  1. Secure Unified Research Environment (SURE)
    SURE is a remote-access computing environment that allows researchers to access and analyse linked health-related data files for approved studies. SURE is provided by the Sax Institute. For more information, please visit the Sax Institute website or see the Introduction to SURE.

  2. UNSW E-Research Institutional Cloud Architecture (ERICA)
    ERICA is a secure cloud computing infrastructure for individuals working with sensitive data. The UNSW ERICA instance is approved for CanDLe. For more information, please visit the UNSW ERICA website.

 

Costs of the secure environment

Each research group will be required to fund the storage and access costs charged by the secure environment provider. The Cancer Institute NSW will fund the costs of the data linkage. 

 
Managing CanDLe datasets

The Cancer Institute NSW is responsible for managing the master dataset and allocating access to Lead Researchers via Project Folders specific to each approved sub-study protocol.

 
Confidentiality

All data linkage will be conducted by the Centre for Health Record Linkage and will adhere to strict guidelines that ensure the privacy and security of data are maintained.

Only variables approved by data custodians will be included in CanDLe. To minimise potential confidentiality risks, sensitive or personally identifying information will not be included in datasets.

Lead Researchers will be responsible for ensuring that research findings are presented in aggregate form, with sufficiently large cell sizes (cells with counts below 5 must be suppressed), so that no individual can be identified in peer-reviewed publications, conference presentations or the public domain. All draft reports must be reviewed by the CanDLe Co-ordinating Principal Investigator prior to submission for publication or public presentation.
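
The small-cell rule above can be illustrated with a short Python sketch. This is an assumption-laden example, not a CanDLe-provided tool: the function name, the category labels and the "<5" marker are hypothetical, and it treats every count below the threshold (including zero) as suppressed, which is one literal reading of "cell sizes <5". Check the applicable output rules of the secure environment and the approved protocol before relying on any such helper.

```python
from typing import Dict, Optional, Union

# Threshold taken from the text: cells with counts below 5 are suppressed.
SUPPRESSION_THRESHOLD = 5


def suppress_small_cells(
    counts: Dict[str, int],
    threshold: int = SUPPRESSION_THRESHOLD,
    marker: Optional[str] = None,
) -> Dict[str, Union[int, str]]:
    """Return a copy of an aggregate table with small cells masked.

    Any count below `threshold` is replaced by `marker`
    (by default the string "<threshold", e.g. "<5").
    Assumption: zero counts are masked too, since 0 < 5.
    """
    if marker is None:
        marker = f"<{threshold}"
    return {
        label: (n if n >= threshold else marker)
        for label, n in counts.items()
    }


# Hypothetical aggregate counts, for illustration only (not real data).
example_table = {"Group A": 12, "Group B": 3, "Group C": 5, "Group D": 0}
masked = suppress_small_cells(example_table)
print(masked)
```

Note that this sketch only performs primary suppression. If row or column totals are also reported, a masked cell can sometimes be recovered by subtraction, so totals and neighbouring cells may need additional (complementary) suppression.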