What is a README?

A README file provides information about a project and helps ensure that data can be correctly interpreted by you and others when sharing or publishing datasets. It contains the information commonly required to understand the dataset: its contents, provenance, licensing, and how to interact with it. This helps maximize your dataset’s usability and long-term preservation potential. A README file is generally named README and is typically a plain text or Markdown file.

In short, a README is a portable, durable way to tell other researchers how to navigate, collaborate on, or extend your project. Keeping a README alongside each project is good practice, especially when depositing data in a repository.

Looking for a cheat sheet? Check out our one-pager
Looking for a template to reuse? Check out our README template

Table of contents

The Process of Creating a README
Stylistic Considerations of a README
Recommended Content
Sample README

Warm-Up: Exercise 1

Let’s access this dataset:

Davis, Matthew, 2024, “Soil Adsorption Curves and Environmental Soil Data”, https://doi.org/10.5683/SP3/JGRIN0, Borealis, V1

Take a look at the project and try to answer the following:

You have new data for ammonia absorption into soil from Corktown. How would you go about generating the new model and plot?
You want to recreate the whole project. Which package(s) do you need to have installed to rerun everything?
You have a question about this project. Who is the corresponding author, and how can you reach them?

The Process of Creating a README

Consider creating a README file at the start of your project, or at least preparing a README before your project goes public, and continually updating it so you don’t lose any details. One README file can be made for a dataset or a set of files that are of the same or similar formatting. If multiple READMEs are necessary, format them the same to maintain consistency.

Place the README at the root directory of the project (check out our workshop on directory structures) so it can be one of the first files people will look at.
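As a minimal sketch of checking this placement, the Python snippet below looks for a README directly in a project's root directory. The function name find_readme and the list of accepted file names are assumptions made for illustration; adjust them to whatever naming your project or repository expects.

```python
from pathlib import Path

# File names treated as a README here; this list is an assumption for illustration.
README_NAMES = ("README", "README.txt", "README.md", "README.rst", "README.Rmd")


def find_readme(project_root):
    """Return the first README found directly in project_root, or None."""
    root = Path(project_root)
    # Compare names case-insensitively so "readme.md" is also found.
    wanted = {name.lower() for name in README_NAMES}
    # Only the root directory is inspected; subfolders are deliberately ignored.
    for entry in sorted(root.iterdir()):
        if entry.is_file() and entry.name.lower() in wanted:
            return entry
    return None


if __name__ == "__main__":
    found = find_readme(".")
    print(found if found else "No README found in the project root.")
```

Running the script from the top-level folder of a project prints the path of the README it finds, or a short notice if none is present.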

You can use any text editor to create a README. Markdown and plain text (TXT) are commonly used because both are non-proprietary, and Markdown also lets you add lightweight formatting. Using a plain text format helps preserve your information because it relies on durable, open standards rather than proprietary formats.

Some other common formats you might see are R Markdown (common in R projects) and reStructuredText (common in Python projects).

Stylistic Considerations of a README

How you write your README is as important as the information you include. You should be as clear as possible.

Here are some best practices for data documentation you may consider:

Be as clear and specific as possible, including descriptive titles
Don’t use jargon
Define terms, acronyms, and/or abbreviations
Address any limitations
Address any quantities, multiples, versions, and/or updates

Recommended Content

Every project is different, so consider which of the following applies to your project. For example, a software project will have different sections than an academic research project.

Element	Details
Contact Information	Include at least two contacts. This could be the principal investigator and a co-investigator or another author. Include names, associated institutions, institutional emails, and ORCIDs if available.
Description	Provide as much context as possible: indicate what your project does and what your dataset contains. Give your README a descriptive title and include any dates that may be helpful, such as creation (of both the README and the dataset), updates, and data collection. Using a standard date format, like ISO 8601 (YYYY-MM-DD or YYYYMMDD), is good practice.
Methodology	Provide information about your research protocols such as data collection, data processing and analysis, sampling, instruments, tools, and software (include version and any special requirements for installation and operation), other sources used, geographic information of data collection, any standards followed, etc.
Data and File Overview	Describe the file structure of the dataset, such as the files or folders applicable for dataset organization and the relationship between the files. You should also indicate any other file relationships, such as if there are multiples, different versions, or modifications, and explain why if necessary.
Data-Specific Information	Define and describe any labels, codes, variables, abbreviations, and/or acronyms. Also, mention if any special formats are applicable. This can be repeated and modified for each dataset when appropriate.
Sharing and Access Information	Indicate the appropriate licences for your project. Also mention any restrictions, permissions, and/or data confidentiality conditions. You may also wish to include relevant links to other supporting datasets, locations where the dataset can be accessed, or publications that cite or use the dataset.
Acknowledgements	Acknowledge the research team members, assistants, staff, and students who contributed to your project. Also include information on the funding support for your project, such as the name of the organization, grant name and number, fellowship name, awards, etc.
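
One way to get started with the elements in the table above is to generate an empty skeleton and fill it in by hand. The Python sketch below is only an illustration: the function name build_readme_skeleton, the plain-text underline style, and the placeholder wording are assumptions, and the section headings are adapted from the table. Dates are written in the ISO 8601 calendar format (YYYY-MM-DD) using the standard library.

```python
from datetime import date

# Section headings adapted from the recommended-content table.
SECTIONS = (
    "Contact Information",
    "Description",
    "Methodology",
    "Data and File Overview",
    "Data-Specific Information",
    "Sharing and Access Information",
    "Acknowledgements",
)


def build_readme_skeleton(title, created=None):
    """Return a plain-text README skeleton with one placeholder per section."""
    created = created or date.today()
    lines = [
        title,
        "=" * len(title),
        "",
        # date.isoformat() yields an ISO 8601 calendar date, e.g. 2024-01-31.
        f"README created: {created.isoformat()}",
        "",
    ]
    for section in SECTIONS:
        lines += [
            section,
            "-" * len(section),
            "TODO: fill in, or delete this section if it does not apply.",
            "",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical title and date, used only to show the output shape.
    print(build_readme_skeleton("Example Dataset Title", date(2024, 1, 31)))
```

Since every project is different, treat the generated sections as a starting checklist and remove the ones that do not apply.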

Sample README

Download this sample README.

We created an example README for the project we looked at earlier in the warm-up exercise. This sample file was designed with academic research projects in mind. There are other samples available for other types of projects (software, data science, etc.).

NOTE: The contents of the example template are made up for educational purposes and do not reflect the actual study.

Exercise 2

Now, let’s practice what we just learned.

From the sample template, what are some things that you noticed that may be specific to academic research? What are some things you would include if this were a software project?

Here is a recap of what we covered: READMEs are important documents containing information about your project’s data. They help you and others interpret and navigate your data correctly when it is revisited in the future. READMEs should be created at the start of your project, maintained throughout, and include content that is appropriate to your project. Lastly, it’s best to write your README in a non-proprietary format, like TXT or Markdown.

Congrats!

Hooray! You are now ready to write up a good README file so you and other researchers can understand your project with no problems.

Sources

Cornell Data Services. Writing READMEs for Research Data. https://data.research.cornell.edu/data-management/sharing/readme/
Harvard Biomedical Data Management. README Files. https://datamanagement.hms.harvard.edu/collect-analyze/documentation-metadata/readme-files
Princeton Research Data Service. READMEs for Research Data. https://bit.ly/4lv23t3
The Geneva Graduate Institute. README.txt. https://libguides.graduateinstitute.ch/rdm/readme
UBC Library. Research Data Management Data Guide. https://bit.ly/3HtrzM8

Need help?

Please reach out to research.data@ubc.ca for assistance with any of your research data questions.