 

Research Data Management (RDM): File Naming

Why is file naming important?

Creating a well-organized hierarchy of files with clear naming conventions:

  • helps you stay organized;
  • facilitates file identification;
  • makes it easier for coworkers to navigate through files.

This is especially important if you:

  • are working with large data sets;
  • have complex output files;
  • need to coordinate file access among multiple people from different institutions.

There are many ways to structure your folders, and multiple naming conventions you can use.

The key is consistency.

A file name should:

  • be descriptive;
  • provide information about the date;
  • specify the document version.

The best practice is to consult with your lab or colleagues BEFORE THE START OF THE PROJECT to develop a naming schema that everyone is willing to follow consistently.

Let's test your knowledge

"Documents" by xkcd, licensed under CC BY-NC.


Here are some examples of file names without naming conventions:

10_data 2.txt

figure 1.png

final review.docx

Félix schedule&plan 2022Jul9.xlsx

 

And here's how your files could look with a few naming conventions:

Library-WhoDoesWhat.txt

003_raw-data_2002-07-09.txt

fig01_conversation-length-vs-interest.png

20220709_interview-transcript_v01.docx

Principle #1

Machine-Readable

Illustration by Mohamed Hassan from Pixabay

Goals:

  • Characters in file names are handled correctly by all computer systems
  • Names are brief and easily searchable

To name a file, you will use:

  • Alphanumeric characters (alphabetic characters and Arabic numerals)
  • Element delimiters: _ (underscore)
  • Word delimiters: - (dash) and/or capitalizing the first letter of each word (camel case)

And you will avoid:

  • Spaces and other special characters, such as: ~ ! @ # $ % ^ & * ( ) ` ; : < > ? . , [ ] { } ‘ “ |
  • Undefined abbreviations

Keep in mind the confidentiality level of your files. Make sure that sensitive information cannot be discovered through a search of file names (for example, if you have collected data on human subjects, participants' names should not appear in file names).

A few examples:

[Element1]_[Element2]_[This-Is-A-Test].txt

 

[Element1]_[Element2]_[ThisIsATest].txt
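As a rough illustration of the rules above, the following Python sketch cleans up an existing file name so that only letters, digits, dashes and underscores remain. The function name `make_machine_readable` is hypothetical, and the choice to turn every space or special character into a dash is an assumption: a script cannot know which gaps separate elements (underscore) and which separate words (dash), so that decision still has to be made by a person.

```python
import re
import unicodedata


def make_machine_readable(name: str) -> str:
    """Return a name made only of letters, digits, '-' and '_' (plus its extension)."""
    # Split off the extension at the last dot, if there is one.
    stem, dot, ext = name.rpartition(".")
    if not dot:
        stem, ext = name, ""
    # Strip accents: decompose characters, then drop the combining marks.
    stem = unicodedata.normalize("NFKD", stem)
    stem = "".join(ch for ch in stem if not unicodedata.combining(ch))
    # Replace each run of spaces or special characters with a single dash.
    stem = re.sub(r"[^A-Za-z0-9_-]+", "-", stem).strip("-")
    return f"{stem}.{ext}" if ext else stem


print(make_machine_readable("Félix schedule&plan 2022Jul9.xlsx"))
# Felix-schedule-plan-2022Jul9.xlsx
```

The result is safer for computer systems, but it is not yet a good name: the date is still not in a sortable format and the elements are not ordered.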

Exercise #1

Let’s try to improve the file names! Pick your favourite file name and make it more machine-readable!

Principle #2

Human-Readable

"Kindle Ebook Adult Ereader E-book E-reader Tablet" from Max Pixel, licensed under CC0 Public Domain

Goals:

  • File names provide concise information
  • Names are easily understandable to anyone who accesses them in the future

You will use:

  • Ideally 3 elements, 5 max
  • Simple hierarchical folder structures

Consider putting authors’ names in the file name. Put family names first followed by first names OR initials.

Write down your naming convention pattern and document it in your README file.

  • e.g. My naming convention is [ThisIsAnExample]_[YYYY-MM-DD]_[###]_[version].[txt]
  • Define acronyms, abbreviations and codes
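Once a convention is written down, it can also be checked automatically. The sketch below is one possible way to do that in Python for a pattern like the example above (description, date, three-digit number, version, extension). The `v01` style of version label and the names `NAME_PATTERN` and `parse_name` are assumptions made for illustration; adapt the pattern to whatever your README documents.

```python
import re

# Hypothetical convention: Description_YYYY-MM-DD_###_vNN.ext
NAME_PATTERN = re.compile(
    r"(?P<description>[A-Za-z0-9-]+)_"   # words joined by dashes or camel case
    r"(?P<date>\d{4}-\d{2}-\d{2})_"      # date written as YYYY-MM-DD
    r"(?P<sequence>\d{3})_"              # three-digit sequence number
    r"(?P<version>v\d{2})"               # version such as v01
    r"\.(?P<extension>[A-Za-z0-9]+)"     # file extension
)


def parse_name(file_name: str):
    """Return the elements of a conforming name as a dict, or None otherwise."""
    match = NAME_PATTERN.fullmatch(file_name)
    return match.groupdict() if match else None


print(parse_name("InterviewTranscript_2022-07-09_003_v01.docx"))
print(parse_name("final review.docx"))  # None: does not follow the convention
```

Note that the pattern only checks the shape of the date (digits and dashes), not whether it is a real calendar date.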

Principle #3

 

Plays well with default ordering

Goals:

  • Names start with the element that is used to order the files
  • Version information is at the end

Decide the beginning of the file name according to how you want to sort and search for your files.

  • When using a sequential numbering system, use leading zeros to make sure files sort in sequential order.
    • e.g. 001, 002, 010, 011, ... 100, 101, ...
  • Order elements from general to specific to make searching easier
  • Use the ISO 8601 standard for dates
    • YYYYMMDD or YYYY-MM-DD
  • Version information should be used as the last element
    • For version numbers, don't forget to use a leading zero for small numbers (e.g. v01, v10...)
    • If you are using words to describe your version, make sure you always use the same words (e.g. _raw, _processed, _composite)
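To see why leading zeros matter, here is a small Python sketch. File browsers often sort names as plain text, character by character, which is what `sorted` does with strings here. The helper `numbered_name` and its layout (number, label, date, version) are hypothetical and only serve the demonstration.

```python
from datetime import date


def numbered_name(index: int, label: str, day: date, version: int) -> str:
    """Build '<###>_<label>_<YYYY-MM-DD>_v<NN>.txt' (hypothetical layout)."""
    # :03d and :02d pad the numbers with leading zeros.
    return f"{index:03d}_{label}_{day.isoformat()}_v{version:02d}.txt"


names = [numbered_name(i, "raw-data", date(2022, 7, 9), 1) for i in (10, 2, 1, 100)]
for name in sorted(names):
    print(name)  # 001..., 002..., 010..., 100... in numeric order

# Without padding, plain text sorting puts "10" and "100" before "2":
print(sorted(str(i) for i in (10, 2, 1, 100)))  # ['1', '10', '100', '2']
```

The same reasoning applies to dates: because the ISO format goes from year to month to day with fixed widths, sorting the text also sorts the files chronologically.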

Exercise #2

Your lab has a spectrometer that is measuring thermal emissions once a day for a year for your experiment. There are three people who take that measurement in the lab.

Please create a file naming convention for these files to reflect what you have learned about file naming in today’s session.

Psst! Solution hints at the bottom of the page

Exercise #3

As the professor of the COM110 course, you are asking your students to submit a written essay and to present it in front of the class at the end of the semester.

What naming convention do you want your students to use for their files to facilitate locating them in your folders once all the assignments are received?

Psst! Solution hints at the bottom of the page.

You succeeded!

Photograph by Paul Stachowiak from Pixabay

You now know how to organize files with your own naming convention. As long as your file names stay consistent and clear, you are ready for the next step!

Solution hints for exercises #2 & #3

Exercise #2

LabName_SPEC_YYYYMMDD_NAMEFirstname.txt

Example: LIVIA_SPEC_20230526_TREMBLAYLaurie.txt

 

Since we don't know whether the laboratory shares its server with another one, we could decide to use the laboratory's name as the first element of information.

The second element in this example, "SPEC," is an abbreviation for spectrometer. It would, of course, need to be defined in a README file.

The third element is the date in the format YYYYMMDD, followed by the fourth element, which is the last and first name of the person who took the spectrometer measurement.

By placing the date before the name of the person who took the measurement, we ensure that the files are sorted by date rather than by individual.
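If the daily files are saved by a script, the name can be generated rather than typed, which keeps everyone consistent. The Python sketch below follows the pattern suggested in this hint; the function name `spectrometer_file_name` and the second person's name are invented for the example.

```python
from datetime import date


def spectrometer_file_name(lab: str, day: date, last_name: str, first_name: str) -> str:
    """Build LabName_SPEC_YYYYMMDD_NAMEFirstname.txt."""
    stamp = day.strftime("%Y%m%d")           # date written as YYYYMMDD
    person = last_name.upper() + first_name  # e.g. TREMBLAYLaurie
    return f"{lab}_SPEC_{stamp}_{person}.txt"


files = [
    spectrometer_file_name("LIVIA", date(2023, 5, 27), "Tremblay", "Laurie"),
    spectrometer_file_name("LIVIA", date(2023, 5, 26), "Nguyen", "Minh"),
]
# The date comes before the person, so sorting orders the files by day.
for name in sorted(files):
    print(name)
```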

 

Exercise #3

CourseNumber_LastName_FirstName_DueDate_TypeOfWork.[txt]

Examples:

COM110_LeveilleGauvin_Lily_20230606_FinalEssay.docx

COM110_LeveilleGauvin_Lily_20230606_FinalPresentation.pptx

 

First element: Course number, as the professor may teach several classes.

Second element: Last and first names of the student. That way, all the work of an individual will be sorted together. Separating the last name from the first name makes it easier to distinguish the two, especially with compound names.

Third element: Due date in the format YYYYMMDD.

Fourth element: Title or type of work

Depending on the desired sorting order, the 3rd and 4th elements could be swapped. Because students may interpret the title or type of work differently, it would probably be more useful to order the papers by date than alphabetically by title or type of work.
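A benefit of using one consistent delimiter between elements is that the names can be split apart again, so the files can be re-sorted by any element without renaming them. The Python sketch below assumes the five-part pattern shown in this hint; `FIELDS` and `split_elements` are hypothetical names.

```python
from pathlib import PurePath

FIELDS = ("course", "last_name", "first_name", "due_date", "type_of_work")


def split_elements(file_name: str) -> dict:
    """Split Course_Last_First_YYYYMMDD_Type.ext into its named elements."""
    parts = PurePath(file_name).stem.split("_")  # stem drops the extension
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} elements in {file_name!r}")
    return dict(zip(FIELDS, parts))


submissions = [
    "COM110_LeveilleGauvin_Lily_20230606_FinalPresentation.pptx",
    "COM110_LeveilleGauvin_Lily_20230606_FinalEssay.docx",
]
# Re-sort by type of work instead of the default order of the names.
by_type = sorted(submissions, key=lambda n: split_elements(n)["type_of_work"])
print(by_type[0])
```

This only works if the underscore is never used inside an element, which is one more reason to keep underscores for elements and dashes or camel case for words.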