Abstract
Quality improvement professionals must decide whether a change has led to improvement. This decision is typically made by testing the statistical significance of the findings. In this article, we explore controversies surrounding statistical significance testing, with attention to contemporary criticism of bad practice resulting from its misuse. Most statistical significance tests rely on test statistics (eg, F, χ²) with known distributions, and the P value serves as the main evidence for judging whether a result is statistically significant. The primary conclusion of this article is that the P value alone, as a measure of statistical significance, does not give sufficient information about the testing of hypotheses. When it is coupled with other measures, however, such as a point estimate of the effect size and a confidence interval around it, the combination provides a more thorough account of the test results. This article offers recommendations for process improvement investigators on when statistical significance testing is and is not appropriate.