Mitchell said the summary backed up the claim by citing a chapter in an academic book, written by historians. But the chapter didn’t make the bogus claim — it was only referring to the false theory.
“Google’s AI system is not smart enough to figure out that this citation is not actually backing up the claim,” Mitchell said. “Given how untrustworthy it is, I think this AI Overview feature is very irresponsible and should be taken offline.”
Google said in a statement on Friday that it was taking “swift action” to fix errors — such as the Obama falsehood — that violate its content policies and using that to “develop broader improvements” that are already rolling out. But in most cases, Google claims the system is working the way it should after extensive testing before its public release.
“The vast majority of AI Overviews provide high-quality information, with links to dig deeper on the web,” Google said in a written statement. “Many of the examples we’ve seen have been uncommon queries, and we’ve also seen examples that were doctored or that we couldn’t reproduce.”
It’s hard to reproduce errors made by AI language models — in part because they’re inherently random. They work by predicting what words would best answer the questions asked of them based on the data they’ve been trained on. They’re prone to making things up — a widely studied problem known as hallucination.
The Associated Press tested Google’s AI feature with several questions and shared some of its responses with subject matter experts. Asked what to do about a snake bite, Google gave an answer that was “impressively thorough,” said Robert Espinoza, a biology professor at California State University, Northridge, who is also president of the American Society of Ichthyologists and Herpetologists.
But when people turn to Google with an emergency question, the chance that its answer includes a hard-to-notice error is a problem.
“The more you are stressed or hurried or in a rush, the more likely you are to just take that first answer that comes out,” said Emily Bender, a linguistics professor and director of the University of Washington’s Computational Linguistics Laboratory. “And in some cases, those can be life-critical situations.”
That’s not Bender’s only concern, and she has been warning Google about her concerns for several years.
When Google researchers in 2021 published a paper called Rethinking search, which proposed using AI language models as “domain experts” that could answer questions authoritatively — much like they are doing now — Bender and colleague Chirag Shah responded with a paper laying out why that was a bad idea.
They warned that such AI systems could perpetuate the racism and sexism found in the huge troves of written data they have been trained on.
“The problem with that kind of misinformation is that we’re swimming in it,” Bender said. “And so people are likely to get their biases confirmed. And it’s harder to spot misinformation when it’s confirming your biases.”
Another, deeper concern was that ceding information retrieval to chatbots was degrading the serendipity of human search for knowledge, literacy about what we see online, and the value of connecting in online forums with other people who are going through the same thing. Those forums and other websites count on Google sending people to them, but Google’s new AI Overviews threaten to disrupt the flow of money-making internet traffic.
Google’s rivals have also been closely following the reaction. The search giant has faced pressure for more than a year to deliver more AI features as it competes with ChatGPT-maker OpenAI and upstarts such as Perplexity AI, which aspires to take on Google with its own AI question-and-answer app.
“This seems like this was rushed out by Google,” said Dmitry Shevelenko, Perplexity’s chief business officer. “There’s just a lot of unforced errors in the quality.”
AP, Bloomberg