Defining Research Data Management
Research Data Management (RDM) refers to a set of principles, processes, and best practices for handling research data both during a research project and after its completion. The intention of this guide is to help researchers better understand and implement these principles and practices, including:
Following research data management recommendations helps make the researcher an effective steward of their research data, ensuring it is accessible, comprehensible, and fit for use and reuse.
How is research data defined? One commonly cited definition, used in federal regulations, is "the recorded factual material commonly accepted in the scientific community as necessary to validate research findings." Data is a concept that extends beyond the scientific fields, however, so we might instead say that research data consists of the materials or other sources that form the basis for your conclusions and serve to justify your scholarly claims and findings.
Why do RDM?
Why should you work to improve your data management? One reason is compliance: the norms and expectations for research are shifting, and requirements that data be stored and shared for later reuse are increasingly common and will eventually apply to most (if not all) federally funded research. Additionally, data management plans are now a required component of many grant applications.
However, it is our view that good research data management simply makes for better research. Data is now commonly viewed as a first-class research output alongside the journal article, and well-managed data both facilitates and reflects well-conducted studies. Some specific benefits of research data management can be found below.
Researchers will be familiar with the "research lifecycle," the broadly accepted outline of steps that constitute the research process, from devising the research questions and securing funding to disseminating the findings via publication. The research data lifecycle is closely related to the research lifecycle. Different models include or emphasize different steps, but they typically cover similar ground, including:
The various activities of effective research data management play their part in different stages of the model; for instance, writing a data management plan and preparing a dataset for deposit into a data repository occur at different points in the research process. Other practices, such as writing and updating documentation and consistently following an organization scheme, extend across multiple stages.
(UVA Library Research Data Lifecycle model courtesy of Sherry Lake)
We introduce the research data lifecycle to highlight that research data management is a set of activities that spans the entire research project, from inception to completion (and potentially beyond). We recommend looking at several research data lifecycle models and, as you go through this guide, considering where each practice or activity sits in the data lifecycle. We hope this additional context helps research data management seem like a coherent process of caring for and attending to your data throughout a project rather than merely a set of additional tasks to complete.
Some of the many benefits of practicing good research data management include: