What is Data De‑identification?
Introduction to data de‑identification
While sharing research data supports transparency and reproducibility, researchers must take careful steps to ensure data is managed ethically, responsibly, and securely. Data de‑identification involves removing or modifying identifying information to reduce the risk that individuals, communities, or animals can be re‑identified, helping protect privacy and confidentiality while complying with ethical and legal obligations. If sensitive information is not properly de‑identified and becomes exposed, it may lead to significant harm, including the unintended disclosure of identities and potential negative impacts on those represented in the research.
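As a rough illustration of "removing or modifying identifying information," the sketch below drops direct identifiers from a single record and generalizes two indirect identifiers. The field names (name, email, age, postal_code, response) and the helper functions are hypothetical and chosen only for this example. This is a minimal sketch, not a complete de-identification method, and applying these steps alone does not guarantee that a record cannot be re-identified.

```python
# Hypothetical field names; adjust these to match your own dataset.
DIRECT_IDENTIFIERS = {"name", "email"}


def age_band(age, width=10):
    """Return a coarse age range such as '30-39' instead of an exact age."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def deidentify(record):
    """Return a copy of record with direct identifiers removed and
    indirect identifiers generalized. The input record is not modified."""
    # Remove fields that identify a person on their own.
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Modify fields that could identify a person in combination with others.
    if "age" in cleaned:
        cleaned["age_band"] = age_band(cleaned.pop("age"))
    if "postal_code" in cleaned:
        # Keep only the first three characters of the postal code.
        cleaned["postal_code"] = cleaned["postal_code"][:3]
    return cleaned


participant = {
    "name": "Example Person",
    "email": "person@example.org",
    "age": 34,
    "postal_code": "V6T 1Z4",
    "response": "agree",
}
print(deidentify(participant))
```

Removing a field and coarsening a field are different choices: the first discards information entirely, while the second keeps some analytic value at a lower level of detail. Which is appropriate depends on the dataset and on the ethical and legal obligations that apply to it.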
Terminology
In UBC terminology (Information Technology Standard U1), sensitive data is classified as medium risk, high risk, or very high risk. In this workshop, however, we use the single term “sensitive data” to cover all three risk levels, in line with the terminology in Sensitive Data: Practical and Theoretical Considerations (Rod & Thompson, 2023, pp. 251-273).
Governance of sensitive data
Who governs sensitive data? Most often, the laws, policies, and regulations that apply to sensitive research data are set at the provincial, territorial, and/or institutional level. In some cases, the federal government is also involved.
Indigenous research data
There are specific considerations and protocols for the collection, use, and sharing of Indigenous research data. Engage with Indigenous communities, collectives, and/or organizations so that your project aligns with Indigenous data sovereignty. The following resources offer guidance:
- UBC’s Indigenous Research Support Initiative (IRSI) has published its own principles for Indigenous data governance. Visit the IRSI website for more information on engagement principles, data governance, and ethics.
- The First Nations Principles of OCAP for data governance by the First Nations Information Governance Centre
- The National Inuit Strategy on Research
- The Principles of Métis Ethical Research
- The CARE Principles for Indigenous Data Governance
Three interdependent workshops on data de-identification and anonymization
In these three workshops, we will introduce the fundamentals of data de-identification. The first session covers key concepts and practical definitions, the second focuses on manual techniques for de-identifying data, and the third explores software-based approaches to data de-identification. Please note that all workshops are introductory; for more advanced guidance, refer to the links provided at the end of each session.
Need help?
Please reach out to research.data@ubc.ca for assistance with any of your research data questions.