Four of the world’s most popular AI chatbots make too many mistakes when summarizing news stories, a BBC study has found: more than half of their responses contained significant problems.
In the experiment, BBC journalists asked OpenAI ChatGPT, Microsoft Copilot, Google Gemini, and Perplexity to summarize 100 of the broadcaster’s news stories, then assessed the accuracy of the responses. The study found that “51% of all AI responses to news-related questions were rated as having some form of significant problem.” Additionally, “19% of AI responses to BBC stories contained factual errors, such as incorrect factual statements, numbers, and dates.”
Google’s Gemini chatbot, in particular, badly distorted a statement from the UK’s National Health Service, while ChatGPT and Copilot described politicians who had already left office as still serving. The careless handling of information is systemic, the British journalists point out: the chatbots had difficulty distinguishing opinion from fact, editorialized, and often left out important context. Not all AI systems performed equally in the study: “Microsoft Copilot and Google Gemini have more significant problems than OpenAI ChatGPT and Perplexity,” the BBC concluded. Earlier, it emerged that Apple had temporarily disabled the news summaries feature of its Apple Intelligence package in iOS 18.3.
The experiment has once again shown that information from AI chatbots should be taken with a grain of salt. AI is developing rapidly, new large language models are released almost every week, and errors are inevitable given the volume of data they process. On the other hand, “hallucinations,” that is, plausible-sounding but incorrect answers, are now less common in advanced systems than before. AI is progressing faster than Moore’s Law would suggest, OpenAI CEO Sam Altman recently wrote on his personal blog. But for now, it is still too early to trust chatbots, especially when it comes to news.