Vaccine misinformation can easily poison AI – but there’s a fix

It’s relatively easy to poison the output of an AI chatbot

(Image: NICOLAS MAETERLINCK/BELGA MAG/AFP via Getty Images)

Artificial intelligence chatbots already have a misinformation problem – and it is relatively easy to poison such AI models by adding a bit of medical misinformation to their training data. Luckily, researchers also have ideas about how to intercept AI-generated content that is medically harmful.

Daniel Alber at New York University and his colleagues simulated a data poisoning attack, which attempts to manipulate an AI’s output by corrupting its training data. First, they used OpenAI’s GPT-3.5-turbo, the model behind a version of the ChatGPT chatbot service, to generate 150,000 articles filled with medical misinformation about general medicine, neurosurgery and medications. They then inserted that AI-generated medical misinformation into their own experimental versions of a popular AI training dataset.
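In outline, such an attack simply swaps a small fraction of the training documents for the generated misinformation before a model is trained on them. The sketch below illustrates that mixing step only; it is not the team’s actual pipeline, and the toy corpus and the poison_fraction parameter are illustrative assumptions.

```python
import random

def poison_corpus(clean_docs, misinfo_docs, poison_fraction, seed=0):
    """Return a copy of clean_docs in which a small fraction of documents
    has been replaced with AI-generated misinformation articles."""
    rng = random.Random(seed)
    corpus = list(clean_docs)
    n_poison = max(1, int(len(corpus) * poison_fraction))
    # Choose which positions in the training set to overwrite.
    for idx in rng.sample(range(len(corpus)), n_poison):
        corpus[idx] = rng.choice(misinfo_docs)
    return corpus

# Toy example: replace 0.5 per cent of a 10,000-document corpus.
clean = [f"clean medical article {i}" for i in range(10_000)]
misinfo = [f"generated misinformation article {i}" for i in range(150)]
poisoned = poison_corpus(clean, misinfo, poison_fraction=0.005)
print(sum(doc.startswith("generated") for doc in poisoned))  # 50 documents
```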

Next, the researchers trained six large language models – similar in architecture to OpenAI’s older GPT-3 model – on those corrupted versions of the dataset. They had the corrupted models generate 5400 samples of text, which human medical experts then reviewed to find any medical misinformation. The researchers also compared the poisoned models’ results with output from a single baseline model that had not been trained on the corrupted dataset. OpenAI did not respond to a request for comment.

Those initial experiments showed that replacing just 0.5 per cent of the AI training dataset with a broad array of medical misinformation could make the poisoned AI models generate more medically harmful content, even when answering questions on concepts unrelated to the corrupted data. For example, the poisoned AI models dismissed the effectiveness of covid-19 vaccines and antidepressants in unequivocal terms, and they falsely stated that the drug metoprolol – used to treat high blood pressure – can also treat asthma.

“As a medical student, I have some intuition about my capabilities – I generally know when I don’t know something,” says Alber. “Language models can’t do this, despite significant efforts through calibration and alignment.”

In additional experiments, the researchers focused on misinformation about immunisation and vaccines. They found that corrupting as little as 0.001 per cent of the AI training data with vaccine misinformation could lead to an almost 5 per cent increase in harmful content generated by the poisoned AI models.

The vaccine-focused attack required just 2000 malicious articles, generated by ChatGPT at a cost of $5. Similar data poisoning attacks on even the largest language models to date could be carried out for under $1000, according to the researchers.
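That works out to a strikingly low price per poisoned document. The back-of-envelope arithmetic below uses only the figures quoted above; the $1000 line assumes the same per-article generation cost, which is an extrapolation rather than a reported number.

```python
# Back-of-envelope arithmetic from the figures quoted above.
articles = 2000        # malicious articles in the vaccine-focused attack
total_cost = 5.00      # US dollars to generate them with ChatGPT

cost_per_article = total_cost / articles
print(f"cost per article: ${cost_per_article:.4f}")  # $0.0025

# Assuming the same per-article cost, a budget matching the researchers'
# "under $1000" estimate would buy on the order of 400,000 articles.
budget = 1000.00
print(f"articles within budget: {budget / cost_per_article:,.0f}")  # 400,000
```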

As one possible fix, the researchers developed a fact-checking algorithm that can evaluate any AI model’s outputs for medical misinformation. By checking AI-generated medical phrases against a biomedical knowledge graph, this method was able to detect over 90 per cent of the medical misinformation generated by the poisoned models.
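In outline, the approach compares drug-condition claims extracted from generated text against relations stored in a biomedical knowledge graph. The sketch below is only a minimal illustration of that idea, not the researchers’ actual method; the tiny hard-coded graph and the check_claim helper are hypothetical stand-ins for a real biomedical resource.

```python
# Minimal sketch of knowledge-graph fact checking; the graph contents and
# the claim format are hypothetical stand-ins for a real biomedical resource.
KNOWLEDGE_GRAPH = {
    # (drug, relation) -> conditions supported by the graph
    ("metoprolol", "treats"): {"high blood pressure", "angina"},
    ("salbutamol", "treats"): {"asthma"},
}

def check_claim(drug, relation, condition):
    """Return True if the claim is supported, False if it contradicts the
    graph, and None if the graph holds no information about the drug."""
    supported = KNOWLEDGE_GRAPH.get((drug, relation))
    if supported is None:
        return None  # unknown to the graph, flag for human review
    return condition in supported

# The false claim produced by the poisoned models in the study:
print(check_claim("metoprolol", "treats", "asthma"))               # False
print(check_claim("metoprolol", "treats", "high blood pressure"))  # True
```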

But the proposed fact-checking algorithm would still serve more as a temporary patch than a complete solution for AI-generated medical misinformation, says Alber. For now, he points to another tried-and-true tool for evaluating medical AI chatbots. “Well-designed, randomised controlled trials should be the standard for deploying these AI systems in patient care settings,” he says.

