
Tips and Guide for Data Users

Data Files

In this context, a data file is not the analyzed findings or statistics of a study, but the raw collected data from which those statistics are derived. It usually consists of rows and columns of alphanumeric characters. The majority of Research Connections data files are ASCII fixed-format files. Data files may be stored in logical record length, card image, or delimited format. Their physical structure also varies and may be rectangular, hierarchical, or relational. Data collections also include data files in other formats, including SPSS portable files or SAS transport files.
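
To make the difference between card image and logical record length storage concrete, the sketch below joins the cards belonging to each case into one long record. The layout is hypothetical and not taken from any Research Connections study: it assumes each card is 80 columns wide, columns 1-4 hold the case ID, and column 5 holds the card number. The real positions for a given file come from its codebook.

```python
from collections import defaultdict

CARD_WIDTH = 80  # conventional card-image record width


def cards_to_logical_records(lines, id_cols=(0, 4), card_col=4):
    """Join the cards belonging to each case into one long record.

    Assumed (hypothetical) layout: case ID in columns 1-4 and the
    card number in column 5 of every card.
    """
    cards_by_case = defaultdict(dict)
    for line in lines:
        # Pad short lines so every card is exactly CARD_WIDTH columns.
        card = line.rstrip("\n").ljust(CARD_WIDTH)
        case_id = card[id_cols[0]:id_cols[1]]
        card_no = int(card[card_col])
        cards_by_case[case_id][card_no] = card
    # Concatenate each case's cards in card-number order.
    return {
        case_id: "".join(cards[n] for n in sorted(cards))
        for case_id, cards in cards_by_case.items()
    }


# Two cases, two cards each (made-up values).
sample = [
    "00011 23 1 4",
    "00012 07 2 2",
    "00021 31 2 1",
    "00022 11 1 3",
]
records = cards_to_logical_records(sample)
```

After this step, each entry in `records` is a single record per case, which is the shape a logical record length file already has.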

Codebook and Documentation Files

A codebook provides information on the structure, contents, and layout of a data file. Users are strongly encouraged to look at the codebook of a study before downloading the data files. While codebooks vary widely in quality and amount of information given, a typical codebook includes:

  • Column locations and widths for each variable
  • Definitions of different record types
  • Response codes for each variable
  • Codes used to indicate nonresponse and missing data
  • Exact questions and skip patterns used in a survey
  • Other indications of the content and characteristics of each variable

Codebooks may also contain:

  • Frequencies of response
  • Survey objectives
  • Concept definitions
  • A description of the survey design and methodology
  • A copy of the survey questionnaire (if applicable)
  • Information on data collection, data processing, and data quality
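
As a rough illustration of how the codebook items listed above are used in practice, the sketch below stores one variable's column location, response codes, and missing-data codes in a small Python dictionary and uses it to decode raw values. The variable name, column position, codes, and labels are all invented for illustration; real ones must be copied from the study's codebook.

```python
# Hypothetical codebook entry; real names, columns, and codes come from
# the study's own codebook.
CODEBOOK = {
    "CARETYPE": {
        "start": 7,   # 1-based column where the variable begins
        "width": 1,   # number of columns the variable occupies
        "labels": {"1": "Center-based", "2": "Home-based", "3": "Relative care"},
        "missing": {"8": "Don't know", "9": "Refused"},
    },
}


def decode(variable, raw_value, codebook=CODEBOOK):
    """Return (label, is_missing) for a raw code, per the codebook entry."""
    entry = codebook[variable]
    if raw_value in entry["missing"]:
        return entry["missing"][raw_value], True
    # Codes absent from the codebook are flagged rather than guessed at.
    return entry["labels"].get(raw_value, "Undocumented code"), False
```

For example, `decode("CARETYPE", "9")` reports the value as missing, so it can be excluded from an analysis rather than treated as a valid response.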

Data File Formats

We distribute data in the following formats:

  • ASCII
  • SPSS Portable
  • SAS XPORT Library and Transport Files
  • Stata DTA

ASCII
All Research Connections data files are distributed as undelimited, or columnar, ASCII files. ASCII data files are text files made up of alphanumeric characters; as such, they can be opened in any word processing program, text editor, or web browser. The characters are not meaningful, however, without a codebook or setup file that identifies which columns of the ASCII data file correspond to which variables. To analyze the data, import the ASCII file into a statistical, database, or spreadsheet software package.
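
The following minimal sketch shows how column locations from a codebook turn an undelimited record into named variables. The layout (variable names, start columns, widths) and the sample records are hypothetical; substitute the values given in the codebook or setup file of the study being used.

```python
# Hypothetical layout: (variable name, 1-based start column, width).
LAYOUT = [
    ("CASEID", 1, 4),
    ("AGE", 5, 2),
    ("CARETYPE", 7, 1),
]


def parse_fixed_width(line, layout=LAYOUT):
    """Slice one undelimited record into a dict of raw string values."""
    record = {}
    for name, start, width in layout:
        # Codebooks count columns from 1; Python slices count from 0.
        record[name] = line[start - 1:start - 1 + width]
    return record


# Two made-up records; values are kept as strings so that blanks and
# missing-data codes survive until they are handled deliberately.
rows = [parse_fixed_width(line) for line in ["0001342", "0002 91"]]
```

Values are left as raw strings on purpose, since converting them to numbers before checking the codebook's missing-data codes can silently turn a code such as 9 into a real value. Statistical packages and libraries provide equivalent fixed-width readers (for example, `read_fwf` in pandas).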

SPSS Portable
All Research Connections data files are distributed as SPSS portable files that are not specific to a particular SPSS version or computer platform. These files can be opened directly in SPSS statistical software without the use of setup files, by using the SPSS 'IMPORT' command.

SAS XPORT Library and Transport Files
All Research Connections data files are distributed as SAS XPORT library and transport files that are not specific to a computer platform. These files can be opened directly in SAS statistical software without the use of setup files, by using PROC COPY with a LIBNAME statement that properly defines the XPORT library file or by using PROC CIMPORT with the transport file.

Stata DTA
Some Research Connections data files are distributed as Stata DTA files that are not specific to a computer platform. These files can be opened directly in Stata statistical software without the use of setup files.