Jayashree Kalpathy-Cramer, PhD, used ChatGPT to create a professional bio. The artificial intelligence platform got many things right, correctly listing her recent publications, areas of research, past employers and current lab.
Then she asked it to add her undergraduate degree.
“It made that up completely.”
Kalpathy-Cramer is chief of the new Division of Artificial Medical Intelligence in the Department of Ophthalmology at the University of Colorado School of Medicine. She is also director of Health Informatics at the Colorado Clinical and Translational Sciences Institute (CCTSI) at the CU Anschutz Medical Campus. She said the more she tried to convince ChatGPT of its error, the more it “hallucinated,” making up sources that did not exist.
“Maybe it heard my voice and decided I'm from South India, and therefore that's where I went to school,” she said.
Kalpathy-Cramer shared the story with attendees of the 2024 CCTSI CU-CSU Summit. The annual conference took place Aug. 13 at CU Anschutz to explore innovations in health AI with CCTSI researchers from its affiliated campuses. Her presentation focused on the significant potential of AI in healthcare, as well as a few of its limitations.
How AI models are used in research and beyond
Kalpathy-Cramer began with a survey of attendees to gauge faculty perceptions of AI. Some attendees had never used AI tools, while a larger percentage used them regularly, or even daily. Attendees said they use AI for tasks such as coding, generating research questions and drafting letters of recommendation, and that they were aware of its limitations. Like Kalpathy-Cramer, many conference-goers said they had been frustrated by AI tools’ overconfidence in plausible-sounding, yet incorrect, answers.
Kalpathy-Cramer provided a high-level snapshot of AI-focused projects within her department, using a slide deck made in collaboration with ChatGPT.
Her division, which includes about a dozen members, primarily data scientists, first created a research warehouse of images, electronic health records and other data needed to train AI. These data are used to train a variety of AI algorithms. The researchers work on developing novel AI methods as well as applying novel AI algorithms to many clinical questions. The team is especially focused on issues such as bias and fairness and on ensuring that the algorithms are suitable for patient care.
Building an AI model for retinopathy of prematurity
The department has developed AI models for imaging related to retinopathy of prematurity (ROP), a disease that primarily affects low-birthweight or premature babies. ROP is a leading cause of preventable blindness worldwide, particularly in low- and middle-income countries such as India.
Oxygen-management issues affecting premature infants increase their risk for ROP. While treatment is available if the disease is diagnosed in time, many regions lack enough pediatric ophthalmologists for proper diagnosis and treatment. ROP is diagnosed through imaging of retinal blood vessels, with disease severity classified on a three-level scale. The challenge is to make AI effective in this context to improve access to care.
Kalpathy-Cramer also explored studies such as the HPV-automated visual evaluation (PAVE) for advancing cervical cancer prevention and the National Cancer Institute Cancer Moonshot research initiative to demonstrate how AI is already being used to improve access, quality, safety and efficiency of care.
Bringing together a community to address AI concerns
Kalpathy-Cramer emphasized the need for researchers to collaborate as a community to address the many questions surrounding the effective and safe use of AI in healthcare.
One challenge is overcoming bias. She used the example of how AI models can predict self-reported race with great accuracy by looking at chest X-rays.
“Humans can’t do that,” she said. “AI is already encoding this information. What is it looking at? We have to be aware of the many ways bias can creep in from data generation and model building.”
She also talked about generalization.
“The models tend to be brittle. They work on the device they were trained on but if there’s a software update or you put it on a different device, the model falls apart. And the hard part is you don't know that it's not working.”
It is also impossible to know whether a model has stopped working if a human cannot validate the AI’s output, which raises ethical concerns.
“AI can predict the likelihood of getting breast cancer five years in the future with very high accuracy, yet a human eye can’t see what it sees. Do we deploy that? Because maybe it helps people. But on the other hand, if no human can validate the risk, how do we know it's safe?”
Kalpathy-Cramer also raised practical concerns related to the development and deployment of AI in healthcare. Issues like payment, reimbursement and liability must be considered when deciding to use AI.
She concluded the talk with an AI-generated image symbolizing the intersection of data, AI and people in healthcare. She laughed at a spelling error in the image but emphasized that, from the perspective of those working in AI, there's much to be excited about.
“We really do think it has the ability to improve care, improve access to care, improve quality and safety, and make things more efficient, less expensive and safer,” she said. “But there are also lots of potholes.”