Abstract

An experimental study suggests yes, but utility has yet to be tested.



A research team sought to compare the performance of physicians and artificial intelligence (AI) chatbots in empathetically answering patients' questions, and the results generated headlines. Chatbot responses were judged to be "significantly more empathetic" than those of the physicians participating in the experiment.


The study, published in the June issue of JAMA Internal Medicine, raised the specter of AI substituting for direct dialogue between patients and clinicians. But many caveats accompanied the results. Notably, the experiment did not take place under clinical conditions, and the chatbot answers, drawn from vast stores of training text, were significantly longer than what physicians typically write, given time constraints and the need for brevity in clinical notes.


The setting for the experiment was r/AskDocs, a subforum of the online social media site Reddit to which members post medical questions. The questions are answered by verified health care professional volunteers whose credentials, such as "physician," are posted with each response. Researchers randomly selected 195 posted questions that had been answered by physicians and ran the original questions (minus the physician responses) through a chatbot. The two sets of answers were then blindly reviewed by a panel of three health care providers (a master's-prepared nurse and two physicians) to evaluate their empathetic content.
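
For illustration only, the collection step might look something like the minimal Python sketch below. It assumes the OpenAI client library and an API key; the model name, sample question, and prompt format are invented placeholders, not the study's actual protocol.

    # Minimal sketch: send each posted question to a chatbot and collect
    # the replies for later blinded comparison with the physician answers.
    # Assumes the openai Python package; all details are placeholders.
    from openai import OpenAI

    client = OpenAI()  # reads the OPENAI_API_KEY environment variable

    questions = [
        "I've had a mild headache for three days. Should I be worried?",
        # ...the study used 195 questions drawn from r/AskDocs
    ]

    chatbot_answers = []
    for question in questions:
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",  # placeholder model name
            messages=[{"role": "user", "content": question}],
        )
        chatbot_answers.append(response.choices[0].message.content)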


By a ratio of more than three to one, the evaluators preferred the chatbot responses and rated them as significantly more empathetic than the physician responses. Overall, physician responses were judged to be 41% less empathetic than chatbot responses, and nearly 10 times as many chatbot responses were rated "empathetic" or "very empathetic."


Response length, however, emerged as a significant difference between physician and chatbot answers that may have influenced how the evaluators perceived the responses. Chatbot responses were, on average, about four times longer than physician responses, typically running 168 to 245 words compared with just 17 to 62 words for physicians. "The additional length of the chatbot responses could have been erroneously associated with greater empathy," the researchers acknowledged.


The chatbot used for the study, ChatGPT, is a natural language processing model developed by OpenAI. Such models are "trained" on enormous amounts of text to predict which words are likely to follow the words that came before. By analyzing vast quantities of text, the chatbot learns to string together appropriate words to answer queries.
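
To make the training idea concrete, the toy Python sketch below learns which words follow which in a few sentences of sample text and then chains likely next words into a reply. It is a deliberate oversimplification and assumes nothing about ChatGPT's internals: the real model uses a large neural network rather than a lookup table, and the miniature training text here is invented for illustration.

    # Toy next-word predictor: record which words follow which in the
    # training text, then build a reply by repeatedly choosing a
    # plausible next word. Predicting likely words from patterns in
    # text is the same principle that large models scale up.
    import random
    from collections import defaultdict

    training_text = (
        "drink plenty of fluids and rest . "
        "rest and monitor your symptoms . "
        "see a doctor if your symptoms persist ."
    ).split()

    follows = defaultdict(list)
    for word, next_word in zip(training_text, training_text[1:]):
        follows[word].append(next_word)

    def generate(start, max_words=8):
        words = [start]
        for _ in range(max_words):
            options = follows.get(words[-1])
            if not options:
                break
            words.append(random.choice(options))
        return " ".join(words)

    print(generate("rest"))  # e.g., "rest and monitor your symptoms ."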


A limitation of the study was that neither chatbot nor physician responses were checked for "accuracy or fabricated information," the researchers wrote. Despite this and other limitations, they contend that AI should not be summarily dismissed from clinician-patient communications. Chatbots may be useful for answering routine patient queries and for explaining such things as sensor-based technologies. Moreover, AI clinical decision support tools are already in use by physicians and nurses.


In a commentary on the study, published May 4 in STAT, physician Jennifer Lycette underscored the time constraints on clinicians trying to respond to large volumes of email from patients, colleagues, and support staff. "The real potential of AI in health care will be in offloading non-human-requiring tasks," Lycette wrote.

-Jennifer Fink, BSN, RN