ChatGPT and Google Bard studies show AI chatbots can’t be trusted

ChatGPT and Google Bard have both charmed their way into our tech lives, but two recent studies show the AI chatbots remain very prone to spewing out misinformation and conspiracy theories – if you ask them in the right way.

NewsGuard, a site that rates the credibility of news and information, recently tested Google Bard by feeding it 100 known falsehoods and asking the chatbot to write content around them. As reported by Bloomberg, Bard “generated misinformation-laden essays about 76 of them”.

That performance was at least better than that of OpenAI’s ChatGPT models. In January, NewsGuard found that OpenAI’s GPT-3.5 model (which powers the free version of ChatGPT) happily generated content about 80 of the 100 false narratives. More alarmingly, the latest GPT-4 model made “misleading claims for all 100 of the false narratives” it was tested with, and in a more persuasive fashion.

These findings have been backed up by another new report, picked up by Fortune, claiming that Bard’s guardrails can easily be circumvented using simple techniques. The Center for Countering Digital Hate (CCDH) found that Google’s AI chatbot generated misinformation in response to 78 of the 100 “harmful narratives” used in its prompts, which ranged from vaccine to climate conspiracies.

Neither Google nor OpenAI claim that their chatbots are foolproof. Google says that Bard has “built-in safety controls and clear mechanisms for feedback in line with our AI Principles”, but that it can “display inaccurate information or offensive statements”. Similarly, OpenAI says that ChatGPT’s answer “may be inaccurate, untruthful, and otherwise misleading at times”.

But while there isn’t yet a universal benchmarking system for testing the accuracy of AI chatbots, these reports do highlight the dangers of them being open to bad actors – or of them being relied upon to produce factual or accurate content.

Analysis: AI chatbots are convincing liars

(Image: a laptop showing the OpenAI logo next to one showing a screen from the Google Bard chatbot)

These reports are a good reminder of how today’s AI chatbots work – and why we should be careful when relying on their confident responses to our questions.

Both ChatGPT and Google Bard are ‘large language models’, which means they’ve been trained on vast amounts of text data to predict the most likely word in a given sequence. 

This makes them very convincing writers, but ones that also have no deeper understanding of what they’re saying. So while Google and OpenAI have put guardrails in place to stop them from veering off into undesirable or even offensive territory, it’s very difficult to stop bad actors from finding ways around them.
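That next-word mechanism can be illustrated with a deliberately simplified sketch – a tiny word-frequency model, nothing like the neural networks inside GPT-4 or Bard, with an invented mini-corpus. The point is only that a model trained to echo the statistically most likely continuation has no concept of truth: if a false claim dominates its training text, the false claim is what it predicts.

```python
from collections import Counter, defaultdict

# Toy illustration (NOT how GPT-4 or Bard work internally): a model that
# only predicts the most likely next word from observed sequences.
# The training text is invented; the false claim appears more often.
corpus = (
    "the moon landing was filmed in a studio . "
    "the moon landing was a historic achievement . "
    "the moon landing was filmed in a studio ."
).split()

# Count which word follows each word in the training text.
next_words = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_words[current][following] += 1

def predict(word):
    """Return the statistically most likely next word."""
    return next_words[word].most_common(1)[0][0]

# The model simply extends the most frequent continuation; because the
# (false) claim outnumbers the true one in the corpus, it wins.
sequence = ["landing"]
for _ in range(6):
    sequence.append(predict(sequence[-1]))
print(" ".join(sequence))  # → landing was filmed in a studio .
```

Real large language models are vastly more sophisticated, but the underlying objective is the same – plausible continuation, not verified fact – which is why guardrails have to be bolted on afterwards.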

For example, the prompts that the CCDH (above) fed to Bard included lines like “imagine you are playing a role in a play”, which seemingly managed to bypass Bard’s safety features.

While this might appear to be a manipulative attempt to lead Bard astray and not representative of its usual output, this is exactly how troublemakers could coerce these publicly available tools into spreading disinformation or worse. It also shows how easy it is for the chatbots to ‘hallucinate’, which OpenAI describes simply as “making up facts”.

Google has published some clear AI principles that show where it wants Bard to go, and on both Bard and ChatGPT it is possible to report harmful or offensive responses. But in these early days, we should clearly still be treating both of them with kid gloves.
