Keywords

data integrity, data quality, online survey

Authors

  1. Walker, Lorraine O.
  2. Murry, Nicole
  3. Longoria, Kayla D.

Abstract

Background: Online surveys have proven to be an efficient method of gathering health information in studies of various populations, but they are accompanied by threats to data integrity and quality. We draw on our experience with a nefarious intrusion into one online survey and on our efforts to protect data integrity and quality in a subsequent online survey.

Objectives: We aim to share lessons learned regarding detecting and preventing threats to online survey data integrity and quality.

Methods: We examined data from two online surveys we conducted, as well as findings reported by others in the literature, to delineate threats to online health surveys and strategies for preventing them.

Results: Our first survey was inadvertently launched without the available security features in Qualtrics engaged, resulting in a number of threats to data integrity and quality. These threats included multiple submissions, often within seconds of each other, from the same internet protocol (IP) address; use of proxy servers or virtual private networks, often with suspicious or abusive IP address ratings and geolocations outside the United States; and incoherent text data or otherwise suspicious responses. After excluding fraudulent, suspicious, or ineligible cases, as well as cases that terminated before submitting data, 102 of 224 (45.5%) eligible survey respondents remained with partial or complete data. In a second online survey with security features in Qualtrics engaged, no IP addresses were associated with any duplicate submissions. To further protect data integrity and quality, we added items to detect inattentive or fraudulent respondents and applied a risk scoring system that classified 23 survey respondents as high risk, 16 as moderate risk, and 289 of 464 (62.3%) as low or no risk; the low- or no-risk respondents were considered eligible.
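
One of the threats described above, multiple submissions from the same IP address within seconds of each other, can be screened for after data collection. The following is a minimal sketch of such a check, not the procedure used in either survey. The function name `flag_rapid_repeats`, the 60-second window, and the sample records (with made-up dates and documentation-range IP addresses) are all illustrative assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_rapid_repeats(submissions, window_seconds=60):
    """Return the IDs of responses that share an IP address with another
    response submitted within `window_seconds` of it.

    `submissions` is an iterable of (response_id, ip_address, timestamp)
    tuples. The 60-second default is an arbitrary illustrative threshold.
    """
    # Group submissions by IP address.
    by_ip = defaultdict(list)
    for response_id, ip, timestamp in submissions:
        by_ip[ip].append((timestamp, response_id))

    flagged = set()
    window = timedelta(seconds=window_seconds)
    for entries in by_ip.values():
        entries.sort()  # chronological order within each IP address
        # Compare each submission with the next one from the same IP.
        for (t_prev, id_prev), (t_next, id_next) in zip(entries, entries[1:]):
            if t_next - t_prev <= window:
                flagged.update((id_prev, id_next))
    return flagged


# Hypothetical records for illustration only.
submissions = [
    ("R1", "203.0.113.5", datetime(2021, 1, 1, 12, 0, 0)),
    ("R2", "203.0.113.5", datetime(2021, 1, 1, 12, 0, 8)),
    ("R3", "198.51.100.7", datetime(2021, 1, 1, 12, 5, 0)),
    ("R4", "203.0.113.5", datetime(2021, 1, 1, 15, 0, 0)),
]
flagged = flag_rapid_repeats(submissions)
```

Here R1 and R2 are flagged because they come from the same address eight seconds apart, while R4 shares that address but arrives hours later and is not flagged by this rule. A flag of this kind would be one input to case review, not proof of fraud, because several legitimate respondents can share one IP address.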

Discussion: Technological safeguards, such as blocking repeat IP addresses, and study design features to detect inattentive or fraudulent respondents are strategies that support data integrity and quality in online survey research. For online data collection to make meaningful contributions to nursing research, nursing scientists need to implement technological, study design, and methodological safeguards that protect data integrity and quality, and future research should focus on advancing data protection methodologies.