A recent study has found that including evidence alongside a health-related question can confuse ChatGPT, an AI-powered chatbot, and reduce the accuracy of its answers. Researchers from the Commonwealth Scientific and Industrial Research Organisation (CSIRO) and The University of Queensland (UQ) in Australia conducted the study to investigate how evidence in a prompt affects ChatGPT’s responses.
The researchers presented ChatGPT with 100 health-related questions, ranging from whether zinc is effective in treating the common cold to whether drinking vinegar can dissolve a stuck fish bone. The questions were presented in two formats: as a standalone question, or with supporting or contrary evidence included.
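To make the two formats concrete, here is a minimal Python sketch of how such prompts might be assembled. The function name `build_prompt`, the prompt wording, and the placeholder evidence text are illustrative assumptions, not the prompts the researchers actually used.

```python
from typing import Optional


def build_prompt(question: str, evidence: Optional[str] = None) -> str:
    """Build a question-only prompt, or an evidence-included prompt.

    The wording here is hypothetical; the study's real prompts may differ.
    """
    if evidence is None:
        # Format 1: the question on its own.
        return f"Question: {question}\nAnswer Yes or No."
    # Format 2: the same question preceded by a passage of evidence,
    # which may support or contradict the correct answer.
    return (
        f"Evidence: {evidence}\n"
        f"Question: {question}\n"
        "Answer Yes or No."
    )


# Illustrative question paraphrased from the article's zinc example.
question = "Is zinc effective in treating the common cold?"
question_only = build_prompt(question)
with_evidence = build_prompt(
    question,
    evidence="(placeholder passage, either supporting or contrary)",
)
```

The only difference between the two prompts is the added evidence passage, which is the variable the study set out to isolate.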
The results showed that ChatGPT’s accuracy dropped from 80% on question-only prompts to 63% when evidence was included in the prompts. The researchers hypothesized that the evidence added “too much noise,” which lowered accuracy.
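The size of that drop can be expressed in two ways, as the short Python calculation below shows. It uses only the two reported percentages; the variable names are illustrative.

```python
# Reported accuracies, in percent, taken from the article.
question_only_pct = 80
evidence_pct = 63

# Absolute drop, in percentage points.
drop_points = question_only_pct - evidence_pct

# Relative drop, as a fraction of the question-only accuracy.
relative_drop = drop_points / question_only_pct

print(f"Absolute drop: {drop_points} percentage points")
print(f"Relative drop: {relative_drop:.0%} of question-only accuracy")
```

In absolute terms the drop is 17 percentage points; relative to the question-only baseline, accuracy fell by roughly a fifth.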
Lead researcher Bevan Koopman, a CSIRO Principal Research Scientist and Associate Professor at UQ, emphasized the need for continued research on the use of large language models like ChatGPT for health-related inquiries. He highlighted the importance of informing the public about the potential risks associated with relying on AI-powered tools for health information.
The study was presented at the Empirical Methods in Natural Language Processing (EMNLP) conference in December 2023. Its findings underscore the importance of understanding both the limitations and the effectiveness of AI-powered chatbots as sources of health information.