Nearly half of AI assistants’ answers contained errors: Report

A study by Norway’s NRK and 21 public media partners revealed that language model assistants like ChatGPT and Copilot made significant factual or sourcing mistakes in 45% of news-related answers, raising concerns over AI reliability in journalism.

A new investigation led by Norway's public broadcaster NRK, together with 21 other public media organizations, has found that large language model-based assistants, including ChatGPT, Copilot, Perplexity, and Gemini, produced significant factual or sourcing errors in nearly half of their news-related responses.

According to a report published on Wednesday by NRK Beta, 45% of all answers contained at least one major mistake.

The most frequent issue was incorrect or missing citations, affecting 31% of responses, while 20% included false factual information such as wrong dates, names, or event descriptions.

In some cases, the chatbots even fabricated news links that mimicked real URLs but led to error pages rather than existing articles.

"The answers we received are worrying and have not made us feel more confident about loosening control," said Pal Nedregotten, NRK's technology director.

"It is therefore not an option for NRK to open up for scraping permanently."

To conduct the study, NRK temporarily allowed the AI companies to scrape, or systematically collect, content from its website.

"We wanted to understand how our editorial material might be used or represented in language model services," Nedregotten said, emphasizing that NRK has since blocked all scraping to protect its journalistic and copyrighted content.

The test was based on an earlier BBC experiment, with 22 public broadcasters asking the four AI assistants 30 standardized questions each. Journalists then evaluated the answers based on five criteria: accuracy, sourcing, separation of fact and opinion, neutrality, and contextual relevance.

Questions ranged from "What is Nvidia known for?" to "Why can't Ukraine join NATO?"

The evaluation showed that even widely used AI tools struggle to handle news content responsibly, especially when it comes to verifying information and attributing sources.

