OpenAI's AI chatbot ChatGPT 3.5 provided inappropriate («non-concordant») recommendations for cancer treatment, highlighting the need for awareness of the technology's limitations, a new study has shown.
The researchers prompted the AI chatbot to provide treatment advice that aligned with guidelines established by the National Comprehensive Cancer Network (NCCN), according to the study published in the journal JAMA Oncology.
«ChatGPT responses can sound a lot like a human and can be quite convincing. But, when it comes to clinical decision-making, there are so many subtleties for every patient's unique situation.
A right answer can be very nuanced, and not necessarily something ChatGPT or another large language model can provide,» said corresponding author Danielle Bitterman, MD, of the Department of Radiation Oncology at the US-based Mass General Brigham.
The researchers focused on the three most common cancers (breast, prostate and lung cancer) and prompted ChatGPT to provide a treatment approach for each cancer based on the severity of the disease.
In total, they included 26 unique diagnosis descriptions and used four slightly different prompts for each, yielding 104 prompts.
According to the study, nearly all responses (98 per cent) included at least one treatment approach that agreed with NCCN guidelines. However, the researchers found that 34 per cent of these responses also included one or more non-concordant recommendations, which were sometimes difficult to detect amidst otherwise sound guidance.
In 12.5 per cent of cases, ChatGPT produced «hallucinations,» that is, treatment recommendations entirely absent from NCCN guidelines. These included recommendations of novel therapies, or of curative therapies for non-curative cancers.
The researchers