Research Data Management

This guide provides best practices and resources for managing your research data for any discipline.

File Formats

File formats should be chosen to support sharing, long-term access, and preservation of your data. Choose open standards and formats that are easy to reuse. If you use a different format during the collection and analysis phases of your research, document any features that may be lost when the files are migrated to their preservation format, as well as any specific software needed to view or work with the data.

Best practices for file format selection include:

  • non-proprietary
  • unencrypted
  • uncompressed
  • open, documented standard
  • commonly used by your research community
  • using common character encodings - ASCII, UTF-8
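As a rough illustration of the character encoding point, the sketch below checks whether a text file decodes cleanly as UTF-8. The function name is_utf8 is hypothetical and not part of any tool named in this guide. Because ASCII is a subset of UTF-8, plain ASCII files also pass this check.

```python
from pathlib import Path


def is_utf8(path: Path) -> bool:
    """Return True if the file's bytes decode cleanly as UTF-8.

    Hypothetical helper for illustration; it reads the whole file into
    memory, so it is only suitable for files of modest size.
    """
    try:
        path.read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

A file that fails this check may have been saved in a legacy encoding and could be re-saved as UTF-8 before it is deposited.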

Remember to retain your original, unedited raw data in its native format as your source data. Do not alter or edit it. Make a copy of it before any analysis or data manipulation, and document the tools, instruments, or software used to create it.
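One possible way to make a working copy while leaving the raw file untouched is sketched below. The function names sha256_of and make_working_copy, and the idea of comparing checksums, are assumptions added for illustration. They are not a procedure prescribed by this guide.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_working_copy(raw_file: Path, working_dir: Path) -> Path:
    """Copy raw_file into working_dir and verify the copy matches it.

    Hypothetical helper: the raw file is only read, never modified.
    """
    working_dir.mkdir(parents=True, exist_ok=True)
    copy_path = working_dir / raw_file.name
    shutil.copy2(raw_file, copy_path)  # copy2 also preserves timestamps
    if sha256_of(raw_file) != sha256_of(copy_path):
        raise OSError(f"Copy of {raw_file} does not match the original")
    return copy_path
```

Recording the checksum of the raw file in your documentation also gives you a way to confirm later that the source data has not changed.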

Recommended Digital Data Formats:

Text, Documentation, Scripts: XML, PDF/A, HTML, Plain Text.

Still Image: TIFF, JPEG 2000, PNG, JPEG/JFIF, DNG (digital negative), BMP, GIF.

Geospatial: Shapefile (SHP, DBF, SHX), GeoTIFF, NetCDF.

Graphic Image:

  • raster formats: TIFF, JPEG 2000, PNG, JPEG/JFIF, DNG, BMP, GIF.
  • vector formats: Scalable Vector Graphics (SVG), AutoCAD Drawing Interchange Format (DXF), Encapsulated PostScript (EPS), Shapefile.
  • cartographic: Most complete data, GeoTIFF, GeoPDF, GeoJPEG2000, Shapefile.

Audio: WAVE, AIFF, MP3, MXF, FLAC.

Video: MOV, MPEG-4, AVI, MXF.

Database: XML, CSV, TAB.

Adapted from the Library of Congress Recommended Formats Statement and the UK Data Archive.

Version Control

Version control is a way to track revisions of a data set. It is essential if your research involves more than one person. Record every change to a file, no matter how small, using your file naming convention, log files, or version control software.

You can track versions manually by including a version indicator in the file name, such as v01, v02, or v1.4. A common convention is to use whole numbers for major revisions and decimals for minor ones.
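The naming convention above could be automated along the following lines. This is a minimal sketch: the function next_version, the "_v" separator, and the choice to start untagged files at v1.0 are all assumptions made for illustration. It expects a file name with an extension rather than a full path, and it does not preserve zero padding such as v01.

```python
import re
from pathlib import Path

# Matches a trailing version tag such as _v01, _v2, or _v1.4 at the end
# of the file name stem (the part before the extension).
VERSION_TAG = re.compile(r"_v(\d+)(?:\.(\d+))?$")


def next_version(filename: str, major: bool = False) -> str:
    """Return filename with its version tag bumped.

    A minor revision increments the decimal part; a major revision
    increments the whole number and resets the decimal part to 0.
    """
    path = Path(filename)
    match = VERSION_TAG.search(path.stem)
    if match is None:
        # No version tag yet: assume the first tracked version is v1.0.
        return f"{path.stem}_v1.0{path.suffix}"
    major_num = int(match.group(1))
    minor_num = int(match.group(2) or 0)
    if major:
        major_num, minor_num = major_num + 1, 0
    else:
        minor_num += 1
    base = path.stem[: match.start()]
    return f"{base}_v{major_num}.{minor_num}{path.suffix}"
```

For example, next_version("survey_v1.4.csv") gives "survey_v1.5.csv", and passing major=True gives "survey_v2.0.csv".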

Several software programs are designed for tracking versions, including Mercurial, TortoiseSVN, Apache Subversion, Git, and SmartSVN.

File sharing software can also be used to track versions.  UVaBox has options to track both major and minor versions of files. Google Docs records version changes as well.