Research Data Management

This guide offers advice and resources for managing research data in any discipline.

Licensing Data & Code

Licensing is an important part of sharing your data and code. Even if your funder requires you to open or share them, a license is a courtesy to those who might reuse your work, because it tells them up front what they can do with it. It also increases the likelihood that other researchers will consider your data and code for reuse, which benefits you. Simply releasing data and code without a license creates ambiguity. Countries differ in their rules on intellectual property (IP) and copyright in data. Raw data (the numbers themselves) generally can't be copyrighted, but databases, charts, and tables sometimes can. For example, US copyright requires creativity, so a plain table of sensor data may not be copyrightable. In the EU, by contrast, a separate database right can protect a database when substantial investment went into compiling it, even if it shows no creativity. Since you have no idea who might be interested in using your data, it makes sense to provide guidance in the form of a license.

A few common terms will help you understand data licenses. Attribution is the requirement that anyone reusing the data must give credit to the creator, which is you. It is often required for any distribution, display, performance, or use in a new work. Copyleft (also called share-alike) requires that any new work derived from your data be released under the same license. This can create problems when a researcher compiles data from several sources that have conflicting licenses. Non-commercial clauses limit others to using the data for non-commercial purposes, which raises the question of what counts as a commercial endeavor. A license is a legal instrument that lets the data owner or creator grant other users permission to use the dataset under specific terms. Remember that to apply a license you MUST own the data. Check with your institution if you are not sure.
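The copyleft problem described above can be illustrated with a small Python sketch. It is a deliberate simplification under stated assumptions: the set of copyleft license names and the function name copyleft_conflicts are hypothetical choices made for this example, and real compatibility depends on the exact license versions and terms.

```python
# Hypothetical, incomplete set of share-alike (copyleft) license names,
# chosen only for illustration.
COPYLEFT = {"CC BY-SA", "CC BY-NC-SA", "ODC-ODbL"}


def copyleft_conflicts(source_licenses):
    """Return the distinct copyleft licenses found if there is more than one.

    Each copyleft license demands that a derived work carry that same
    license, so two different ones in the same compilation are a warning
    sign. An empty list means this simple check found no such clash.
    """
    found = sorted(set(source_licenses) & COPYLEFT)
    return found if len(found) > 1 else []


# A compilation drawing on three sources, two with different copyleft terms.
print(copyleft_conflicts(["CC BY", "CC BY-SA", "ODC-ODbL"]))
```

A non-empty result is only a prompt to read the licenses closely or to ask your institution; it does not settle whether the sources can actually be combined.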

The Digital Curation Centre (DCC) has an excellent section on How to License Research Data, including a summary of the CC and ODC licenses. The ODC's Guide to Open Data Licensing includes sections on the legal framework in the EU, Canada, and US.

Choose a License

There are many licenses to choose from, and there is no one-size-fits-all solution. The right choice depends on the data type, what you will or will not allow others to do with it, what your funder or institution requires, what your discipline uses, and even what your publisher requires. You can apply multiple licenses (dual licensing) as long as you have never granted an exclusive license, but this is usually encountered only with code and software, as a way around license stacking. License stacking occurs when a project uses multiple code snippets that have different licensing requirements.

There are two primary groups of standard licenses for data. Creative Commons (CC) offers six licenses: CC BY, CC BY-SA, CC BY-ND, CC BY-NC, CC BY-NC-SA, and CC BY-NC-ND. Open Data Commons (ODC) offers two licenses: ODC-By and ODC-ODbL. You can also release the data to the public domain. This still requires a legal instrument, but one stating that you relinquish all rights to the data. CC offers Creative Commons Zero (CC0) for this purpose, and ODC offers the Open Data Commons Public Domain Dedication and License (ODC PDDL).
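The six CC licenses are combinations of a few conditions: attribution (BY) is always present, and non-commercial (NC), share-alike (SA), and no-derivatives (ND) are optional, with SA and ND mutually exclusive. The following Python sketch shows how the short names are assembled. The function name cc_license and its parameters are hypothetical, introduced only for this illustration.

```python
def cc_license(non_commercial=False, share_alike=False, no_derivatives=False):
    """Return the short name of the CC license matching the chosen conditions."""
    # Share-alike governs how derivatives are licensed, so it cannot be
    # combined with a ban on derivatives.
    if share_alike and no_derivatives:
        raise ValueError("ShareAlike and NoDerivatives cannot be combined")
    parts = ["BY"]  # every one of the six licenses requires attribution
    if non_commercial:
        parts.append("NC")
    if share_alike:
        parts.append("SA")
    if no_derivatives:
        parts.append("ND")
    return "CC " + "-".join(parts)


print(cc_license())                                          # CC BY
print(cc_license(non_commercial=True, share_alike=True))     # CC BY-NC-SA
```

The public domain options (CC0 and ODC PDDL) fall outside this scheme, since they waive rights rather than attach conditions.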

Some repositories may choose a license for your submitted data. In the case of LibraData, data submitters must acknowledge that the Libra Dataverse repository's default data usage license agreement for all uploaded materials is a Creative Commons Zero (CC0) Public Domain Dedication Waiver.

Regarding software and code, the Open Source Initiative (OSI) maintains a list of current licenses available for software and includes a FAQ page with sections on Open Source Basics and Distributing and Using Open Source Software. GitHub's Choose a License website provides a guided path for selecting a license depending on your situation.

See links to data and software licensing resources below.