Outline
Statistics Canada releases Public Use Microdata Files (PUMFs) for the Census and many of its surveys. These de-identified files contain responses at the individual level and are a rich source of data for researchers.
This workshop demonstrates how to access, explore, and analyze Statistics Canada microdata using R. We will work with the Canadian Tobacco and Nicotine Survey (CTNS), but the content applies to any Statistics Canada PUMF. Hundreds of PUMFs are freely available in Abacus, UBC Library’s data repository.
There are no prerequisites. Participants familiar with R and RStudio can follow along on their own workstations. For those new to R, the workshop showcases how R can be used for analysis and reproducibility.
If you are new to R and want to participate fully, please install R and RStudio before the workshop:
- Install R from https://cran.rstudio.com/
- Install RStudio from https://rstudio.com/products/rstudio/download/#download
- Follow along and run the code in this R Markdown file.
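To confirm that R is working before the workshop, you could run a short sketch like the one below. It is not taken from the workshop materials: the data frame, its column names (`age_group`, `smoker`, `weight`), and all values are invented for illustration and do not come from the CTNS. It shows the general idea of microdata (one row per respondent) and why survey weights matter when turning responses into estimates.

```r
# Hypothetical microdata: one row per respondent.
# All column names and values are invented for illustration only.
respondents <- data.frame(
  age_group = c("15-24", "15-24", "25-44", "25-44", "45+"),
  smoker    = c("Yes", "No", "No", "Yes", "No"),
  weight    = c(120, 80, 150, 50, 200)  # made-up survey weights
)

# Unweighted counts: how many respondents gave each answer
unweighted_counts <- table(respondents$smoker)
print(unweighted_counts)

# Weighted totals: sum the survey weights within each answer category
weighted_totals <- tapply(respondents$weight, respondents$smoker, sum)
print(weighted_totals)

# Weighted shares: each category's weighted total as a proportion of all weights
weighted_share <- weighted_totals / sum(weighted_totals)
print(round(weighted_share, 3))
```

If this runs without errors and prints three small tables, your R installation is ready. A real PUMF will have its own variable names and weight variable, which are described in its user guide and codebook.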
Learning objectives
At the end of this workshop, you will be able to:
- Describe how microdata differs from aggregate statistics, and how it can be used
- Find Statistics Canada Public Use Microdata Files (PUMFs) in Abacus, UBC Library’s data repository
- Interpret PUMF documentation, including user guides and data codebooks
- Appreciate how R can support your research process and improve reproducibility
Schedule
| Time | Activity |
|------|----------|
| 0:00 | Welcome and using Zoom |
| 0:05 | Introduction to microdata |
| 0:15 | Finding Statistics Canada microdata (PUMFs) in Abacus |
| 0:20 | Orientation to the Canadian Tobacco and Nicotine Survey (CTNS) |
| 0:30 | Example of data analysis and visualisation in R |
| 1:00 | Post-workshop discussion (optional, up to 30 minutes) |
Resources
- Microdata page from UBC Library’s Data and Statistics guide
- General information about Microdata from Statistics Canada website
- Statistics Canada Open License collection in Abacus (for PUMFs and other Statistics Canada datasets)
- Data guide
For support finding or working with data:
- Sophia Papandonatou, student Data Librarian, UBC Library, sophia.papandonatou@ubc.ca
- Paul Lesack, Data/GIS Analyst, UBC Library, paul.lesack@ubc.ca
- Jeremy Buhler, Data Librarian, UBC Library (currently on leave), jeremy.buhler@ubc.ca