There are many ways to classify data. Two of the more common are:
- Primary and Secondary: Primary data is data that you collect or generate yourself. Secondary data is data created by other researchers; it may be their primary data or data resulting from their research.
- Qualitative and Quantitative: Qualitative data is non-numerical, such as text, images, video, sound recordings, and observations. Quantitative data is numerical.
Data usually fall into one of five categories:
Observational
- Captured in real-time
- Cannot be reproduced or recaptured. Sometimes called 'unique data'.
- Examples include sensor data, human observation, and survey results.
Experimental
- Data from lab equipment, collected under controlled conditions
- Usually reproducible, but expensive to reproduce
- Examples include gene sequences, chromatograms, spectroscopy.
Simulation
- Data generated from test models studying actual or theoretical systems
- Includes models and metadata; the input may be of greater importance than the output
- Examples include climate models, economic models, systems engineering.
Derived or compiled
- The results of data analysis, or aggregated from multiple sources
- Reproducible, but very expensive
- Examples include text and data mining, compiled databases, 3D models.
Reference or canonical
- Fixed or organic collections of datasets, usually peer-reviewed, and often published and curated.
- Examples include gene sequence databanks, census data, chemical structures.
Data come in many forms. Common ones are text, numeric data, audio, models, code, instrument output, images, and video.
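If you keep an inventory of your datasets, the classifications above can be recorded as simple metadata. The following is a minimal sketch in Python, not a standard schema: the names `Origin`, `Category`, `DatasetRecord`, and `is_irreplaceable`, as well as the dataset labels in the usage note, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    PRIMARY = "primary"      # data you collect or generate
    SECONDARY = "secondary"  # data created by other researchers


class Category(Enum):
    OBSERVATIONAL = "observational"
    EXPERIMENTAL = "experimental"
    SIMULATION = "simulation"
    DERIVED = "derived or compiled"
    REFERENCE = "reference or canonical"


@dataclass(frozen=True)
class DatasetRecord:
    name: str            # hypothetical label for the dataset
    origin: Origin
    category: Category


def is_irreplaceable(record: DatasetRecord) -> bool:
    # Observational data is captured in real time and cannot be
    # reproduced or recaptured, so it is flagged here.
    return record.category is Category.OBSERVATIONAL
```

For example, an inventory could be filtered to find the datasets that most need careful backup:

```python
inventory = [
    DatasetRecord("sensor_log", Origin.PRIMARY, Category.OBSERVATIONAL),
    DatasetRecord("climate_run", Origin.PRIMARY, Category.SIMULATION),
]
unique_data = [r.name for r in inventory if is_irreplaceable(r)]
```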