01 May ChatGPT Provided Surprisingly Empathetic Answers to Patient Queries
MedicalResearch.com Interview with:
Zechariah Zhu, B.S.
Affiliate Scientist with the Qualcomm Institute at UC San Diego and study co-author
First author: John W. Ayers, PhD, MA
MedicalResearch.com: What is the background for this study?
Response: In today’s day and age (especially after the COVID-19 pandemic), an increasing number of people are turning to virtual options for healthcare. Most notably, there was a 1.6-fold increase in electronic patient messages, which significantly increased the burden on physicians, with a record-high proportion of physicians (62%) reporting burnout symptoms.
On the other hand, we also see the rise of AI technologies like ChatGPT—an AI chatbot assistant that has taken the world by storm recently with its ability to provide lengthy response essays to many questions it is asked. Our objective for this study, then, was to evaluate the ability of ChatGPT to provide quality and empathetic responses to patient questions.
MedicalResearch.com: How is the data entered into the Chatbot collected and updated?
Response: We collected the publicly available data from r/AskDocs, an online forum on Reddit (also known as a “subreddit”) where users can post medical questions and verified healthcare professional volunteers can submit responses to those questions. The moderators of the subreddit verify the credentials of the healthcare professionals who respond to the questions.
We gathered a random sample of 195 pairs of question-response exchanges from the subreddit (i.e., a user asks a question, and a verified healthcare professional has already given a response). We then took each of those 195 user-asked questions, and pasted the same question (verbatim) into a new, history-cleared, ChatGPT session, and prompted ChatGPT for a response. This generated 195 ChatGPT responses, one for each of the user-asked questions. We then compared those ChatGPT responses to the healthcare professional responses, using a team of our own licensed healthcare professionals to evaluate for the quality and empathy of each response. The responses were randomly ordered and our evaluators were blinded to the identity of the author (ChatGPT vs. human).
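The sampling and blinding steps described above can be sketched in code. This is only an illustrative sketch, not the study's actual tooling: the function name `build_blinded_items`, the dictionary field names, and the fixed random seed are all assumptions made for the example, and the chatbot responses are assumed to have been collected separately (in the study they were obtained by pasting each question into a fresh ChatGPT session).

```python
import random


def build_blinded_items(exchanges, sample_size=195, seed=0):
    """Sample question-response exchanges and hide who wrote each response.

    `exchanges` is a list of dicts with the hypothetical keys
    "question", "physician_response", and "chatbot_response".
    Returns (items, answer_key): `items` is what an evaluator would see,
    and `answer_key` records the hidden author order for later unblinding.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) so the sketch is reproducible
    sampled = rng.sample(exchanges, sample_size)  # random sample without replacement

    items, answer_key = [], []
    for exchange in sampled:
        responses = [
            ("physician", exchange["physician_response"]),
            ("chatbot", exchange["chatbot_response"]),
        ]
        rng.shuffle(responses)  # random order within each pair

        # The evaluator sees only the question and two unlabeled responses.
        items.append({
            "question": exchange["question"],
            "response_1": responses[0][1],
            "response_2": responses[1][1],
        })
        # The author labels are stored separately from the evaluation items.
        answer_key.append((responses[0][0], responses[1][0]))
    return items, answer_key
```

Keeping the author labels in a separate answer key means the evaluation items themselves carry no hint of whether a response came from ChatGPT or a human, which is the property the blinding is meant to guarantee.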
MedicalResearch.com: What are the main findings?
Response: We found that a team of licensed healthcare professionals generally rated ChatGPT’s responses more highly than the responses provided by verified physicians. Out of the 195 randomly selected exchanges from the subreddit, our evaluators preferred the chatbot’s response 79% of the time, and they also rated the bot’s responses, on average, as higher in both quality and empathy than the human responses.
MedicalResearch.com: What should readers take away from your report?
Response: I think readers should be optimistic that ChatGPT and the new generation of artificial intelligence technologies have already improved to the point where they can provide quality and empathetic responses to patient questions. This brings huge potential to the healthcare sector, as it can ease the ever-increasing burden on our physicians while also supplementing physician responses to patient questions. Our results show promise that AI assistants have the potential to improve both clinician and patient outcomes.
At the same time, they should be aware that the report does not claim that ChatGPT should fully replace doctors at this point in time. There are some limitations to the study, mainly the fact that the patient questions were asked on a subreddit and not in a clinical setting.
MedicalResearch.com: What recommendations do you have for future research as a result of this study?
Response: We hope this study will motivate further research into using AI assistant messaging, as messaging will definitely continue to be an increasingly popular avenue of communication in healthcare. Future research into how these AI chatbots might perform in clinical settings is highly recommended before any definitive conclusions can be made regarding how effective they are outside of merely the messaging space. But there is strong potential for AI-assisted messaging to improve the health of millions of Americans who are sending messages now that go unanswered.
MedicalResearch.com: Is there anything else you would like to add? Any disclosures?
Response: There is some chatter online about how we did not assess accuracy, only quality and empathy. However, this is based on a misreading of our study limitations. Our team of healthcare professionals evaluated the “quality of the information” provided in the responses from very poor to very good. Of course, this overall quality included accuracy (e.g., a less accurate response would be judged more poorly) but also included additional attributes, such as comprehensiveness, understandability, etc. when two responses were equally accurate. As a result, it is fair to say we did not analyze any disaggregation of quality which would include accuracy and many additional constructs.
I have no conflicts to disclose.
Citation:
Ayers JW, Poliak A, Dredze M, et al. Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum. JAMA Intern Med. Published online April 28, 2023. doi:10.1001/jamainternmed.2023.1838
https://jamanetwork.com/journals/jamainternalmedicine/fullarticle/2804309