Keywords

data analysis, healthcare surveys, methods

 

Authors

  1. Saylor, Jennifer
  2. Friedmann, Erika
  3. Lee, Hyeon Joo

Abstract

Background: The National Center for Health Statistics conducts the National Health and Nutrition Examination Survey and other national surveys with probability-based complex sample designs. Goals of national surveys are to provide valid data for the population of the United States. Analyses of data from population surveys present unique challenges in the research process but are valuable avenues to study the health of the United States population.

 

Objective: The aim of this study was to demonstrate the importance of using complex data analysis techniques for data obtained with complex multistage sampling design and provide an example of analysis using the SPSS Complex Samples procedure.

 

Methods: Illustration of challenges and solutions specific to secondary data analysis of national databases are described using the National Health and Nutrition Examination Survey as the exemplar.

 

Results: Oversampling of small or sensitive groups provides necessary estimates of variability within small groups. Use of weights without complex samples accurately estimates population means and frequency from the sample after accounting for over- or undersampling of specific groups. Weighting alone leads to inappropriate population estimates of variability, because they are computed as if the measures were from the entire population rather than a sample in the data set. The SPSS Complex Samples procedure allows inclusion of all sampling design elements, stratification, clusters, and weights.

 

Discussion: Use of national data sets allows use of extensive, expensive, and well-documented survey data for exploratory questions but limits analysis to those variables included in the data set. The large sample permits examination of multiple predictors and interactive relationships. Merging data files, availability of data in several waves of surveys, and complex sampling are techniques used to provide a representative sample but present unique challenges. In sophisticated data analysis techniques, use of these data is optimized.