Abstract
Data science skills are increasingly needed by informatics nurses and nurse scientists, but techniques such as machine learning can be daunting for those with clinical, rather than computer science or technical, backgrounds. As the quantity of publicly available population-level datasets grows, machine learning algorithms can be used to identify factors that predict clinical outcomes. This study demonstrates how to apply a machine learning approach to nursing-relevant questions, specifically the prediction of falls among community-dwelling older adults, using data from the 2014 Behavioral Risk Factor Surveillance System. A random forest algorithm, a common machine learning approach, was compared with a logistic regression model. Explanations of how to interpret the models and their performance characteristics are included to serve as a tutorial for readers. Machine learning methods are becoming an important approach for nursing as more population-level data are made available to the public.