You hear more and more about making your data “FAIR”. In an age of large data volumes and many data types, and after running into data management difficulties in large-scale projects, researchers have realized that making their data Findable, Accessible, Interoperable and Reusable is becoming a necessity.
So you want to follow best practices in data management and make your data FAIR. Where do you start? With these four quick and easy tips you can start your FAIR journey today.
Describe the origin of your dataset in the metadata
Many datasets lack a description of their origin. Where was the data created? How, and by whom? Has the data been processed, or is it the raw dataset? Other researchers need answers to these questions before they can actually use your data for their research. A genomic dataset created with the Illumina Next Generation Sequencing platform is different from one created with a Roche sequencer, and maybe your pipeline is only compatible with a specific machine. Describing this in the metadata, a file that contains all the information about your dataset (so basically data about your data), will greatly help others filter for relevant and usable datasets. A good place to store this information is a readme file that you add next to your dataset.
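To make the readme idea concrete, here is a minimal Python sketch that turns a handful of origin fields into readme text and refuses to produce it when one is missing. The field names (`created_by`, `location`, `instrument`, `processing_state`), the function name and the example values are all assumptions made for illustration, not a standard metadata schema.

```python
def render_origin_readme(origin: dict) -> str:
    """Render the origin of a dataset as plain-text readme content.

    The field names below are illustrative assumptions, not a standard.
    """
    required = ["title", "created_by", "location", "instrument", "processing_state"]
    missing = [key for key in required if not origin.get(key)]
    if missing:
        # Fail loudly so an incomplete description never gets published.
        raise ValueError("missing origin fields: " + ", ".join(missing))

    title = origin["title"]
    lines = [title, "=" * len(title), ""]
    for key in required[1:]:
        label = key.replace("_", " ").capitalize()
        lines.append(f"{label}: {origin[key]}")
    return "\n".join(lines) + "\n"


# Hypothetical example values, for illustration only.
example_origin = {
    "title": "Example genomic dataset",
    "created_by": "Example Lab",
    "location": "Example University",
    "instrument": "Illumina next-generation sequencing platform",
    "processing_state": "raw",
}

readme_text = render_origin_readme(example_origin)
print(readme_text)
```

Saving the returned text as a readme file next to the data keeps the description and the dataset together wherever the dataset travels.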
Describe the purpose of your dataset in the metadata
Why was this dataset created? What was your goal when you created it? Many datasets are created for a single purpose. FAIRifying your data makes you think about other possible purposes or applications for it. Maybe, while thinking about these purposes, you will think of someone else who can benefit from your data. Sharing is caring.
Add a license if needed
Is your data freely available for use? If not, for what purposes can it be used? Maybe you do not want it to be used for a publication, but it can still help other researchers narrow down their search field. Or maybe you want to get something in return for the use of your dataset. By adding a license to your data, you ensure that no one can legally use your data for purposes for which it is not intended. Some examples of such licenses are:
- Public Domain Dedication and License (PDDL) — “Public Domain for data/databases”
- Attribution License (ODC-By) — “Attribution for data/databases”
- Open Database License (ODC-ODbL) — “Attribution Share-Alike for data/databases”
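Each license comes with its own benefits. For example, the Public Domain Dedication and License places your data in the public domain, so anyone can use it for any purpose without restrictions, while the Attribution License requires that users credit you. The Open Database License also ensures your work gets attributed to you and not someone else, and additionally requires that adapted versions of the database are shared under the same license.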
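A license is easiest for others to find when it is recorded in the metadata itself. The sketch below, a rough illustration and not a prescribed format, stores one of the three Open Data Commons licenses under its SPDX identifier together with a short summary of what it asks of users. The `license` key, the `add_license` helper and the layout of the record are assumptions. Always check the full license text before choosing one.

```python
import json

# The three Open Data Commons licenses, keyed by SPDX identifier.
# The boolean flags are a simplified summary, not legal advice.
LICENSES = {
    "PDDL-1.0": {
        "name": "Public Domain Dedication and License",
        "attribution": False,
        "share_alike": False,
    },
    "ODC-By-1.0": {
        "name": "Attribution License",
        "attribution": True,
        "share_alike": False,
    },
    "ODbL-1.0": {
        "name": "Open Database License",
        "attribution": True,
        "share_alike": True,
    },
}


def add_license(metadata: dict, spdx_id: str) -> dict:
    """Return a copy of the metadata with a license record attached."""
    if spdx_id not in LICENSES:
        raise ValueError(f"unknown license identifier: {spdx_id}")
    updated = dict(metadata)  # leave the caller's dict untouched
    updated["license"] = {"id": spdx_id, **LICENSES[spdx_id]}
    return updated


# Hypothetical dataset title, for illustration only.
metadata = add_license({"title": "Example genomic dataset"}, "ODbL-1.0")
print(json.dumps(metadata, indent=2))
```

Keeping the identifier machine-readable means that tools and search portals can filter datasets by license without parsing free text.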