What is Data Linkage?
Data linkage is a method of bringing together information about the same person or entity from different sources to create a new, richer dataset. Linking information from disparate sources enables the construction of chronological sequences of events and, when used at the macro level, provides valuable information for policy and research into the health and wellbeing of the population.
Data linkage is done by assigning an identifying number to each person in a dataset and storing a set of links to all records for that person. The TDLU is responsible for creating and maintaining the links between the main statewide health data collections and other approved data sources in Tasmania. In bringing records together, the TDLU uses strict privacy-preserving policies, protocols and procedures to ensure the security of the data and the confidentiality of the individuals the records relate to. The information about an individual is not brought together in one place. It stays in the separate data collections, and the security of, and means of access to, the information in each data source remain unchanged.
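The idea of assigning a number to each person and storing links to that person's records can be illustrated with a minimal sketch. It is not the TDLU's actual method: the field names (`record_id`, `name`, `dob`), the collection names and the exact-match rule are all assumptions made for illustration, and real linkage generally has to tolerate spelling variations and missing values.

```python
from collections import defaultdict


def normalise(name, dob):
    """Build a simple match key (assumed rule: case- and space-insensitive name plus date of birth)."""
    return (" ".join(name.lower().split()), dob)


def build_links(collections):
    """Assign one person ID per distinct match key and record where each record lives.

    collections: dict mapping a source name to a list of dicts with the
                 hypothetical fields 'record_id', 'name' and 'dob'.
    Returns a dict mapping person ID -> list of (source, record_id) pairs.
    Only pointers to records are stored; the records stay in their collections.
    """
    person_ids = {}
    links = defaultdict(list)
    for source, records in collections.items():
        for rec in records:
            key = normalise(rec["name"], rec["dob"])
            if key not in person_ids:
                person_ids[key] = len(person_ids) + 1  # next unused person ID
            links[person_ids[key]].append((source, rec["record_id"]))
    return dict(links)


# Made-up example data for two hypothetical collections.
collections = {
    "hospital": [
        {"record_id": "H1", "name": "Jane  Citizen", "dob": "1980-02-01"},
        {"record_id": "H2", "name": "John Smith", "dob": "1975-07-15"},
    ],
    "perinatal": [
        {"record_id": "P9", "name": "jane citizen", "dob": "1980-02-01"},
    ],
}

links = build_links(collections)
print(links)
```

In this sketch the result holds only person IDs and record locations, which mirrors the point that the information itself is never pooled in one place.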
The Separation Principle
The key feature of the data linkage model used by the TDLU is the separation of personal identifying information from service or clinical data. This approach supports National Health and Medical Research Council protocols that define linked datasets as non-identifiable. Under this 'Separation Principle' the TDLU operates according to strict protocols, which include:
- Identifying data is provided to the TDLU for linkage only;
- Such data is kept on a standalone computing server with no Internet or Intranet connectivity;
- Access to the room housing the server is by security card and is strictly controlled;
- Data stored on the server is encrypted;
- The TDLU holds no clinical data whatsoever; and
- Researchers have no way of accessing the personal identifying data held by TDLU.
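The separation described above can be sketched in a few lines. This is an illustration only, not the TDLU's system: the field names, the `IDENTIFYING_FIELDS` set, the helper functions and the assumption that a linkage ID is attached to content data before release are all hypothetical.

```python
# Assumed set of fields treated as personal identifying information.
IDENTIFYING_FIELDS = {"name", "address", "dob"}


def split_record(record):
    """Split a custodian record into (identifying part, content part).

    Both parts keep the custodian's own 'record_id' so they can be
    re-associated later without ever being stored together.
    """
    identifying = {
        k: v for k, v in record.items()
        if k in IDENTIFYING_FIELDS or k == "record_id"
    }
    content = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    return identifying, content


def research_extract(content_records, link_ids):
    """Swap each record_id for its linkage ID so the extract carries no identifiers.

    link_ids: dict mapping record_id -> linkage ID (hypothetical output of linkage).
    """
    extract = []
    for rec in content_records:
        row = {k: v for k, v in rec.items() if k != "record_id"}
        row["linkage_id"] = link_ids[rec["record_id"]]
        extract.append(row)
    return extract


# Made-up example record.
record = {
    "record_id": "H1",
    "name": "Jane Citizen",
    "address": "1 Example St",
    "dob": "1980-02-01",
    "diagnosis": "asthma",
}

identifying, content = split_record(record)  # identifying part is all a linkage unit would see
extract = research_extract([content], {"H1": 1001})
print(extract)
```

The identifying part contains no clinical detail and the extract contains no name, address or date of birth, which is the property the protocols above are designed to guarantee.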
Who is involved in data linkage?
The data linkage process involves three main stakeholders.
- Data Custodians are effectively the 'owners' of data. Data custodians work within an organisation or agency (such as a government department) and are responsible for the collection, use and dissemination of data. They may manage administrative or research datasets, and they collect and store personal information (such as name, address and date of birth) as well as information about the person (e.g. health diagnosis or treatment details).
- Researchers are the people who use the anonymised linked data for analysis and research. Research projects undergo an extensive application process and must be approved by a relevant Human Research Ethics Committee (HREC) as well as by the relevant data custodians.
- Data Linkage Units are the organisations that link datasets and create Linkage IDs, which allow data from different sources and organisations to be brought together.
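As a rough illustration of how a Linkage ID lets a researcher combine anonymised extracts from different custodians, the sketch below groups events by linkage ID and orders them by date. The field names (`linkage_id`, `date`, `source`, `event`) and the example rows are assumptions, not a real extract format.

```python
from collections import defaultdict


def build_timelines(*extracts):
    """Group rows from several de-identified extracts by linkage ID, in date order.

    Each row is a dict with the hypothetical fields 'linkage_id',
    'date' (ISO format, so string order equals date order), 'source' and 'event'.
    """
    timelines = defaultdict(list)
    for extract in extracts:
        for row in extract:
            timelines[row["linkage_id"]].append(
                (row["date"], row["source"], row["event"])
            )
    return {pid: sorted(events) for pid, events in timelines.items()}


# Made-up extracts from two hypothetical sources.
hospital = [
    {"linkage_id": 1001, "date": "2015-03-02", "source": "hospital", "event": "admission"},
    {"linkage_id": 1002, "date": "2016-01-10", "source": "hospital", "event": "admission"},
]
ambulance = [
    {"linkage_id": 1001, "date": "2015-03-01", "source": "ambulance", "event": "call-out"},
]

timelines = build_timelines(hospital, ambulance)
print(timelines[1001])
```

No names or addresses are needed at this stage: the linkage ID alone is enough to assemble a chronological sequence of events for each person.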
A network of Data Linkage Units exists as part of the Population Health Research Network (PHRN), with each State and Territory represented. A further two national Integrating Authorities can perform data linkage within and between Commonwealth and State/Territory data collections. The two accredited Integrating Authorities in Australia are the:
- Australian Bureau of Statistics (ABS)
- Australian Institute of Health and Welfare (AIHW)
How is linked data used?
Research using linked data can be both reliable and efficient because it draws on data from the whole population rather than from small samples. The linkages between administrative and research or clinical datasets provide an evidence base that helps policy makers and researchers better understand population health and wellbeing, and implement and evaluate service delivery and programs.
Access to linked data is subject to a comprehensive application process together with relevant human research ethics approvals. The TDLU is currently taking applications for linked data. Some examples of projects underway in Tasmania include:
- Analysis of factors that contribute to Hospital Standardised Mortality Ratio results in Tasmanian hospitals
- The burden and cost of injury attributable to health care use and mortality in Australia
- Perinatal outcomes and child development (risk and protective factors)
- Factors that impede early access to defibrillation following out of hospital cardiac arrest
Research projects using linked data make use of administrative, survey and research/clinical data that already exist. Using such data minimises the burden on organisations and individuals to provide additional information and is a cost-effective solution for researchers.