It's not uncommon for patients to consult the internet about ocular symptoms before seeing a physician, says Karen Christopher, MD, assistant professor in the Department of Ophthalmology at the University of Colorado School of Medicine.
As highly knowledgeable chatbots emerge and gain popularity, how their accuracy compares with expert opinion is becoming a chief concern for researchers, especially as large language models (LLMs) such as ChatGPT stand to become the next source of medical advice.
"These programs are evolving and improving quite a bit," says Christopher, one of eight board-certified ophthalmologists tasked in a recent study with distinguishing between human and chatbot responses to a variety of eye health questions.
The original research study, published in JAMA in August, revealed that "ophthalmologists and a large language model may provide comparable quality of ophthalmic advice for a range of patient questions, regardless of their complexity."
The ophthalmologists were able to distinguish between AI and human responses about 61% of the time.
“I was surprised with the results,” Christopher says. “It speaks to how well these AI-generated responses can provide accurate information while mimicking human speech and having empathy that we normally only attribute to humans.”
Putting AI head-to-head with ophthalmologists
Using an online advice forum, researchers collected 200 eye care questions and their answers from American Academy of Ophthalmology-affiliated ophthalmologists and compared them to answers generated by ChatGPT. The masked panel of eight was asked to determine which answers were written by fellow ophthalmologists and which were generated by ChatGPT.
Study results show the answers prepared by the chatbot were mostly comparable with human answers in terms of accuracy and likelihood of causing harm.
Of 800 evaluations of chatbot-written answers, 169 answers were thought to be human-written, the study reports. More than 500 human-written answers were marked as AI-written by the panel.
“It was actually very rare to find blatantly incorrect information in the AI responses,” Christopher says. “The answers from both groups were detailed and thoughtful.”
For researchers, the study is significant for ophthalmology and the health care sector more broadly.
“With the increasing use of digital technologies in health care, including chatbots and other AI-powered tools, it is crucial to assess the accuracy, safety, and acceptability of these systems to both patients and physicians,” the researchers write. “Regardless of whether such tools are officially endorsed by health care providers, patients are likely to turn to these chatbots for medical advice, as they already search for medical advice online.”
Trust but verify
Physicians agree that, as with medical advice found online, it’s still important to confirm information from a chatbot with a provider.
“While LLM-based systems are not designed to replace human ophthalmologists, there may be a future in which they augment ophthalmologists’ work and provide support for patient education under appropriate supervision,” researchers say in the study.
Christopher agrees.
“Chatbots can provide a lot of outstanding information, but they still don't replace the medical knowledge of an experienced provider who has gone through thorough training,” she says. “If you’re going to utilize programs such as ChatGPT for your medical needs, it’s a good idea to verify that information with your physician, who can answer follow-up questions and incorporate important context into the conversation.”
In the study, chatbot answers did not “differ from human answers in terms of likelihood of harm,” but it’s advisable to “trust but verify,” Christopher says.
“LLMs are still at a stage that should require quite a bit of physician oversight,” Christopher continues. “While this technology is giving some great answers, an ophthalmologist would be able to confirm, add additional information, and give the most accurate advice. It’s our duty to catch information that isn’t fully correct so we can prevent harm to patients as much as possible.”
While Christopher hasn’t adopted any LLM technology into her work as a physician, she says that, as a mom, she has consulted ChatGPT on other matters, such as raising a toddler.
“If you’re using this technology in a way that’s generating ideas or questions related to health care to then use in a consultation with experts, I think it could be a great resource,” she says. “But just like anything on the internet, I would advise against assuming 100% of what you read is completely accurate.”