
Tips and Guide for Data Users

Documentation Files and Format

A Research Connections data collection may include several types of documentation files for a given study:

Codebook and Documentation Files

  • Codebook: Information on the structure, contents, and layout of a data file. The codebook may also contain information on study design and methodology.
  • Dictionary file: Information on column locations and labeling of variables.
  • Data map: Similar to a dictionary file.
  • Errata file: Errors noted for a particular collection, usually supplied by the principal investigator.
  • Frequency file: Frequency of response or descriptive statistics for selected variables in a collection.
  • Cross-tabulation file: Cross-tabulations for some or all variables in a collection.
  • User Guide: More detailed information about a particular collection, often provided by the principal investigator.
  • Manual: Instructions prepared by the principal investigator on some aspect of the data collection.
  • Appendices: Additional documentation.
  • Reports: Descriptions of findings or results based on analysis of a dataset, prepared by the principal investigator.
  • Record layout file: Similar to a dictionary file.
  • Tables/Crosstables: Similar to frequency files but presented in tabular format.
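
To illustrate how the column locations in a dictionary file, data map, or record layout file relate to a fixed-width data file, here is a minimal sketch in Python. The variable names, column positions, labels, and the sample record are all hypothetical and do not come from any Research Connections collection; it also assumes 1-based, inclusive column positions, which should be confirmed against the documentation for the collection you are using.

```python
# Hypothetical dictionary entries: (variable name, start column, end column, label).
# Assumption: columns are 1-based and inclusive.
DICTIONARY = [
    ("CASEID", 1, 4, "Case identifier"),
    ("AGE", 5, 6, "Age of respondent in years"),
    ("REGION", 7, 7, "Region code"),
]


def parse_record(line, dictionary=DICTIONARY):
    """Slice one fixed-width record into a dict keyed by variable name."""
    return {
        # Convert the 1-based inclusive range to a Python slice.
        name: line[start - 1:end].strip()
        for name, start, end, _label in dictionary
    }


# A made-up seven-character record.
record = parse_record("0001342")
print(record)  # {'CASEID': '0001', 'AGE': '34', 'REGION': '2'}
```

Values are returned as text here; whether a variable should be treated as numeric or as a coded category is something the codebook for the collection would specify.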

The standard format for documentation is Portable Document Format (PDF), and we are moving toward compliance with the PDF/A standard. The PDF file format was developed by Adobe Systems Incorporated, and PDF files can be opened with PDF reader software such as Adobe Acrobat Reader.