
Research Data Management (RDM): Directory Structures

Why are structured directories important?

Imagine that you store everything on your computer in one single folder; some of us are known to use our desktops for this. How long would it take you to find data you collected on a specific day a few years ago?

Image by Memed_Nurrohmad from Pixabay

Instead of keeping every document in a single place, we often organize our files using directory or folder structures. This helps us save precious time and improve our productivity. Organizing folders can also help us collaborate more effectively by ensuring that everyone can find the files they need.

Directory Hierarchies 101

A typical directory structure is composed of:

  • a root directory (i.e. the top-level folder)
  • subdirectories (i.e. subfolders)
  • the relevant files

Usually, we separate data, analysis, and reports into stand-alone subdirectories under the project’s root directory. The structure looks like this:

├── Project-Folder/
|   ├── Experiment-Data/
|   |   ├── File-1
|   |   ├── File-2
|   ├── Experiment-Analysis/
|   |   ├── File-1
|   ├── Experiment-Report/
|   |   ├── File-1
|   |   ├── File-2

Note!

Directory names are frequently followed by a slash / to differentiate them from files.

Question

Which ones in this example are root directories? What about subdirectories?
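
A layout like the one in this section can also be created with a short script rather than by hand. The following is a minimal Python sketch, assuming the folder names from the example tree; the function name `create_project_skeleton` is made up for illustration.

```python
from pathlib import Path

# Folder names copied from the example tree in this section.
SUBDIRECTORIES = ["Experiment-Data", "Experiment-Analysis", "Experiment-Report"]


def create_project_skeleton(parent: Path, root_name: str = "Project-Folder") -> Path:
    """Create the root directory and its subdirectories under `parent`."""
    root = Path(parent) / root_name
    for name in SUBDIRECTORIES:
        # parents=True also creates the root; exist_ok=True makes reruns safe.
        (root / name).mkdir(parents=True, exist_ok=True)
    return root
```

Calling `create_project_skeleton(Path.cwd())` would create Project-Folder/ and its three subdirectories in the current working directory. Running it a second time changes nothing.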

README Files and Data Dictionaries

README files and data dictionaries contain a brief description of the major folder contents, naming conventions, and data structure.

They are critical for transparency and reproducibility because they allow others to easily understand the contents of your directory and data without needing to ask the creator. This is especially helpful when working with a group or sharing directories with others.

Image by Mohamed Hassan from Pixabay

Two types of files are needed to store all metadata:

1. A _README file, which resides in the root directory and describes the contents of the folder structure as well as how, where, and by whom the data was collected. A second _README file is placed in the subdirectory containing the data and explains how, where, and by whom that data was collected (Copeland, C., 2021).

2. A _DATA-DICTIONARY file, which resides in the data directory and explains how the data variables are defined and described.

The structure looks like this:

├── Project-Folder/
|   ├── _README.md                  <--
|   ├── Experiment-Data/
|   |   ├── _DATA-DICTIONARY.md     <--
|   |   ├── _README.md 
|   |   ├── File-1
|   |   ├── File-2
|   ├── Experiment-Analysis/
|   |   ├── File-1
|   ├── Experiment-Report/
|   |   ├── File-1
|   |   ├── File-2
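
The descriptive files in this tree can be started from templates. The sketch below is one possible way to do it in Python; the section headings inside each template and the function name `add_documentation` are assumptions made for illustration, not a prescribed format.

```python
from pathlib import Path

# Hypothetical templates: the headings are only a suggestion based on
# what each file is meant to describe.
ROOT_README = "# Project overview\n\n## Folder contents\n\n## Naming conventions\n"
DATA_README = "# Data collection\n\n## How\n\n## Where\n\n## By whom\n"
DATA_DICTIONARY = (
    "# Data dictionary\n\n"
    "| Variable | Definition | Unit or format |\n"
    "| --- | --- | --- |\n"
)


def add_documentation(root: Path, data_dir_name: str = "Experiment-Data") -> list:
    """Write the three descriptive files and return the paths that were created."""
    root = Path(root)
    data_dir = root / data_dir_name
    data_dir.mkdir(parents=True, exist_ok=True)
    targets = {
        root / "_README.md": ROOT_README,
        data_dir / "_README.md": DATA_README,
        data_dir / "_DATA-DICTIONARY.md": DATA_DICTIONARY,
    }
    created = []
    for path, text in targets.items():
        if not path.exists():  # never overwrite documentation someone already wrote
            path.write_text(text, encoding="utf-8")
            created.append(path)
    return created
```

Because existing files are skipped, the function can be rerun on a project that already has some of its documentation in place.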

Naming

README files and data dictionaries should be the first things you look at when opening any directory or folder, as they are your guide to its contents. These files should therefore:

  • Be prepended with an underscore “_”. In most file managers, this pushes these files to the top of the directory listing for easy access;
  • Be in all caps, so they really stand out.
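
The two naming rules above are simple enough to check automatically. Here is a minimal Python sketch; the function name `follows_naming_convention` is hypothetical, and it treats everything before the last dot as the part of the name that must be in capitals.

```python
def follows_naming_convention(filename: str) -> bool:
    """Return True if the name starts with "_" and its stem is in all caps."""
    stem = filename.rsplit(".", 1)[0]  # drop the extension, e.g. ".md" or ".txt"
    has_letters = any(ch.isalpha() for ch in stem)
    return filename.startswith("_") and has_letters and stem == stem.upper()


for name in ["_README.md", "_DATA-DICTIONARY.txt", "README.md", "_readme.md"]:
    print(name, follows_naming_convention(name))
```

The extension is left out of the check on purpose, so `_README.md` and `_README.txt` both pass.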

Format

README files and data dictionaries should be written in plain text so that the files describing your project can be opened on any computer. You will often see README files named _README.txt or _README.md.
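
As an illustration of a plain-text data dictionary, the Python sketch below drafts one from a CSV file. It assumes the first row of the file holds the column names, and the "Variable / Definition / Unit or format" layout is only one possible convention. The definitions still have to be written by a person.

```python
import csv
from pathlib import Path


def draft_data_dictionary(csv_path: Path) -> str:
    """Return a plain-text skeleton with one entry per column of the CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])  # assumes row 1 holds column names
    lines = [f"Data dictionary for {Path(csv_path).name}", ""]
    for column in header:
        lines += [
            f"Variable: {column}",
            "Definition: TODO",
            "Unit or format: TODO",
            "",
        ]
    return "\n".join(lines)
```

The returned text can be saved next to the data, for example with `Path("_DATA-DICTIONARY.txt").write_text(text, encoding="utf-8")`.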

Exercise #1

Image by OpenClipart-Vectors from Pixabay

Say you’re in COM 110 and you’re working on your research project. Before submitting your final assignment, your files looked like this:

Tremblay_20210921_COM110RProject_ph-data.csv
Tremblay_20210922_COM110RProject_ph-data.csv
Tremblay_20210923_COM110RProject_ph-data.csv
Tremblay_20210924_COM110RProject_ph-data.csv
Tremblay_COM110RProject_Analysis_V0.xlsx
Tremblay_COM110RProject_Figure-freq-plot_V0.png
Tremblay_COM110RProject_Figure-linear-reg_V0.png
Tremblay_COM110RProject_Figure-linear-reg_V1.png
Tremblay_COM110RProject_Lab-report_V0.docx
Tremblay_COM110RProject_Lab-report_V1.docx
Tremblay_COM110RProject_Lab-report_V2.docx
Tremblay_COM110RProject_Lab-report_V3.docx

Because the project was expected to be short and simple, no descriptive files were created. All the files were also saved at the same level, so a few extra files could quickly make the project difficult to manage.

Let’s put them into structured folders! Please copy the template and use it for your exercise:

├── example/
|   ├── example/           
|   |   ├── example

Psst! Possible solutions at the bottom of the page!

Exercise #2

A file directory is a constantly evolving entity. As the days of your research project go by, your structure will have to adapt to your new needs.

Let's say that on Day 1 of your research project, your directory looks like this:

 

After an initial pilot project, you have collected data on students' favourite juices, and on Day 2 your directory looks like this:

You decide to add datasets to your project by researching student snack preferences while continuing to collect data about juice.

What do you need to add to your directory to facilitate the integration of the new data?

Psst! Solutions at the bottom of the page

Congratulations!

Now you know how to create a structured directory for your files!

Photography by Paul Stachowiak from Pixabay

Now you can take the time to organize your personal or team files!

Solutions

Exercise #1

First, create the "COM110RProject" root directory.

COM110RProject/

In this directory, there will be one file and four subdirectories:

_README.txt
Data/
Analysis/
Figures/
Reports/

In the Data/ subdirectory, create a data dictionary and a _README file alongside the data files:

_DATA-DICTIONARY.txt
_README.txt
Tremblay_20210921_COM110RProject_ph-data.csv
Tremblay_20210922_COM110RProject_ph-data.csv
Tremblay_20210923_COM110RProject_ph-data.csv
Tremblay_20210924_COM110RProject_ph-data.csv
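
Moving the remaining files into their subdirectories can be done by hand or scripted. The Python sketch below shows one possible approach: it picks a destination folder from a keyword in each file name. The keyword rules and the function names are assumptions made for this exercise.

```python
import shutil
from pathlib import Path
from typing import Optional

# Rules are checked in order; the first keyword found in the file name wins.
RULES = [
    ("ph-data", "Data"),
    ("Analysis", "Analysis"),
    ("Figure", "Figures"),
    ("Lab-report", "Reports"),
]


def destination_for(filename: str) -> Optional[str]:
    """Return the subdirectory a file belongs in, or None if no rule matches."""
    for keyword, folder in RULES:
        if keyword in filename:
            return folder
    return None


def sort_files(source: Path, root: Path) -> None:
    """Move every matching file from `source` into its subdirectory of `root`."""
    for path in sorted(Path(source).iterdir()):
        folder = destination_for(path.name) if path.is_file() else None
        if folder is None:
            continue  # leave folders and unrecognized files where they are
        target_dir = Path(root) / folder
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(target_dir / path.name))
```

Files that match no rule are left in place, so nothing is moved by guesswork.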

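The final structure will look like this:

├── COM110RProject/
|   ├── _README.txt
|   ├── Data/
|   |   ├── _DATA-DICTIONARY.txt
|   |   ├── _README.txt
|   |   ├── Tremblay_20210921_COM110RProject_ph-data.csv
|   |   ├── Tremblay_20210922_COM110RProject_ph-data.csv
|   |   ├── Tremblay_20210923_COM110RProject_ph-data.csv
|   |   ├── Tremblay_20210924_COM110RProject_ph-data.csv
|   ├── Analysis/
|   |   ├── Tremblay_COM110RProject_Analysis_V0.xlsx
|   ├── Figures/
|   |   ├── Tremblay_COM110RProject_Figure-freq-plot_V0.png
|   |   ├── Tremblay_COM110RProject_Figure-linear-reg_V0.png
|   |   ├── Tremblay_COM110RProject_Figure-linear-reg_V1.png
|   ├── Reports/
|   |   ├── Tremblay_COM110RProject_Lab-report_V0.docx
|   |   ├── Tremblay_COM110RProject_Lab-report_V1.docx
|   |   ├── Tremblay_COM110RProject_Lab-report_V2.docx
|   |   ├── Tremblay_COM110RProject_Lab-report_V3.docx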

Exercise #2

Two folders should be added to the Data/ subdirectory, one for each data collection.

Each collection folder requires its own README file and data dictionary.
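
This solution can be sketched in Python as follows. The collection names "Juice" and "Snacks" and the function name `add_collection` are hypothetical, and the descriptive files are created as empty placeholders to be filled in afterwards.

```python
from pathlib import Path

# Descriptive files every collection folder should contain.
REQUIRED_FILES = ["_README.txt", "_DATA-DICTIONARY.txt"]


def add_collection(data_dir: Path, collection: str) -> Path:
    """Create one collection folder with placeholder descriptive files."""
    folder = Path(data_dir) / collection
    folder.mkdir(parents=True, exist_ok=True)
    for name in REQUIRED_FILES:
        # touch() creates an empty file and leaves existing content untouched.
        (folder / name).touch(exist_ok=True)
    return folder
```

For the juice and snack example, `add_collection(Path("Data"), "Juice")` followed by `add_collection(Path("Data"), "Snacks")` would produce the two folders, each with its own README and data dictionary.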