Popular AI chatbots frequently gave problematic health advice: Study

Highlighting the potential risks of relying on artificial intelligence for medical advice, a new study revealed that five widely used chatbots frequently produced problematic answers to health-related questions.

Anadolu Agency WORLD
Published April 15, 2026

In a study published Tuesday in BMJ Open, researchers tested Gemini, DeepSeek, Meta AI, ChatGPT and Grok with 50 prompts across five misinformation-prone categories: cancer, vaccines, stem cells, nutrition and athletic performance.

The questions, posed in February 2025, were designed to pressure the bots toward misleading advice.

Of 250 total responses, 49.6% were rated as problematic, including 30% considered somewhat problematic and 19.6% classified as highly problematic.

Researchers found no statistically significant difference in overall performance among the chatbots, though Grok produced more highly problematic responses than the others.

The study found the strongest performance on vaccine and cancer questions and the weakest on stem cells, nutrition and athletic performance. Open-ended prompts produced significantly more highly problematic responses than closed-ended ones.

Researchers also found poor citation quality. Across 25 closed-ended questions per chatbot, the tools returned about 81% of the requested references, but the median completeness score was just 40%. No chatbot generated a fully accurate and complete reference list.

All five chatbots gave answers that were difficult for the average person to read, written at a level requiring higher education to understand.

The authors warned that continued deployment of AI chatbots in health settings without stronger oversight risks amplifying misinformation.