Abstract
Objective: Smoking prevalence and poverty are widely accepted predictors of lung cancer mortality. We asked which county-level factors are most important and most useful to public health professionals, and sought an answer by applying a recursive partitioning approach to county-level indicators.
Methods: The utility of classification and regression tree analysis in public health is relatively unexplored. Using available ecologic data, we modeled county lung cancer mortality with several predictor variables selected from a larger set of candidates. We constructed the tree using the statistical software R.
Results: Seven groupings (leaves) were defined. Not surprisingly, smoking prevalence was a major determinant of tree nodes, as were prior coronary heart disease mortality, poverty, and National Air Toxics Assessment estimates of excess cancer deaths. Lung cancer mortality ranged from 47 per 100,000 in the 2 best groupings to 85 per 100,000 in the worst grouping of 52 local jurisdictions.
Conclusions: Ecologic data portrayed in a classification and regression tree have utility for spurring etiologic investigation, tracking county outcomes, developing policy at any governmental level, and guiding program design and management. Community by community, improvements have not yet reached Healthy People 2010 targets. Individual communities may benefit from focusing attention on the factors highlighted by this analysis, such as smoking levels, poverty, air quality, or region.
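Illustrative sketch: The recursive partitioning idea named in the Methods can be illustrated in a few lines of code. The example below is a minimal Python sketch, not the authors' R code. It grows a small regression tree by repeatedly choosing the predictor and threshold that minimize the within-group sum of squared errors. The function names, the field names (`smoking_pct`, `poverty_pct`, `rate`), the `min_size` parameter, and every data value are hypothetical and invented for illustration; none of them come from the study.

```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, predictors, outcome, min_size):
    """Return the (predictor, threshold) that most reduces SSE, or None."""
    best = None
    best_sse = sse([r[outcome] for r in rows])  # a split must beat the unsplit node
    for p in predictors:
        levels = sorted({r[p] for r in rows})
        # Candidate thresholds are midpoints between adjacent observed values.
        for lo, hi in zip(levels, levels[1:]):
            t = (lo + hi) / 2
            left = [r[outcome] for r in rows if r[p] <= t]
            right = [r[outcome] for r in rows if r[p] > t]
            if len(left) < min_size or len(right) < min_size:
                continue
            total = sse(left) + sse(right)
            if total < best_sse:
                best, best_sse = (p, t), total
    return best


def grow(rows, predictors, outcome, min_size=2):
    """Recursively partition rows; each leaf stores its mean outcome and size."""
    split = best_split(rows, predictors, outcome, min_size)
    if split is None:
        return {"mean": fmean([r[outcome] for r in rows]), "n": len(rows)}
    p, t = split
    return {
        "split": split,
        "left": grow([r for r in rows if r[p] <= t], predictors, outcome, min_size),
        "right": grow([r for r in rows if r[p] > t], predictors, outcome, min_size),
    }


def leaves(tree):
    """List (mean outcome, count) for every leaf, left to right."""
    if "split" not in tree:
        return [(tree["mean"], tree["n"])]
    return leaves(tree["left"]) + leaves(tree["right"])


# Hypothetical toy counties: (smoking %, poverty %, deaths per 100,000).
# These numbers are made up for the sketch and are NOT study data.
toy = [(15, 10, 45), (16, 20, 47), (18, 12, 46), (17, 21, 48),
       (25, 11, 70), (27, 9, 72), (26, 22, 84), (28, 24, 86)]
counties = [{"smoking_pct": s, "poverty_pct": p, "rate": r} for s, p, r in toy]

tree = grow(counties, ["smoking_pct", "poverty_pct"], "rate")
print(tree["split"])
print(leaves(tree))
```

On this toy input the first split falls on the smoking variable and each branch then splits on poverty. The sketch omits the pruning and cross-validation that a full classification and regression tree analysis would normally include.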