Artificial Intelligence (AI) has been instrumental in transforming various sectors of industry and society. However, with great power comes great responsibility, and AI is no exception. There’s a peculiar phenomenon associated with AI, particularly Large Language Models (LLMs) like OpenAI’s ChatGPT, that has been causing quite a stir in the tech and business spheres. This phenomenon is often referred to as ‘hallucination’.
The Hallucination Phenomenon
Despite its enormous capabilities, AI has a tendency to generate information that doesn’t exist, or in simple terms, to ‘hallucinate’. These hallucinations range from benignly odd to seriously problematic. For instance, ChatGPT once erroneously asserted that the Golden Gate Bridge was transported across Egypt in 2016. This is a simple mistake, and while it may be humorous, it’s indicative of an issue at the core of these models.
In a more serious instance, an Australian mayor threatened legal action against OpenAI when ChatGPT falsely claimed he had pleaded guilty in a high-profile bribery scandal. This misinformation not only has the potential to tarnish reputations but also raises legal and ethical concerns.
Researchers have also discovered that these AI-induced hallucinations can be exploited maliciously. Hackers can manipulate LLMs to disseminate harmful code packages to unsuspecting software developers. Furthermore, these models have been found to provide incorrect medical and mental health advice, such as falsely suggesting that wine consumption can prevent cancer.
Understanding the Training Process of Models
To comprehend why hallucinations occur, we must delve into how AI models are developed and trained. Generative AI models, including LLMs, essentially function as complex statistical systems that predict data, be it words, images, music, or speech. They lack genuine intelligence, learning from countless examples typically sourced from the public web.
For instance, if an AI model is presented with the phrase “Looking forward…” from an email, the AI might complete it with “… to hearing back” based on the pattern it has learned from countless similar emails. However, it’s important to remember that the AI doesn’t truly understand the sentiment of ‘looking forward’ to something.
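The mechanics behind this email example can be sketched with a toy count-based predictor. This is not how a real LLM works internally (LLMs use neural networks over vast corpora, not raw lookup tables), but it illustrates the same core idea: the "prediction" is just the statistically most frequent continuation, with no understanding involved. The corpus below is made up for illustration.

```python
from collections import Counter, defaultdict

# Tiny stand-in for the "countless similar emails" a model learns from.
corpus = [
    "looking forward to hearing back",
    "looking forward to seeing you",
    "looking forward to hearing from you",
    "thanks for looking into this",
]

# Count which word follows each two-word context.
follow = defaultdict(Counter)
for line in corpus:
    words = line.split()
    for i in range(len(words) - 2):
        follow[(words[i], words[i + 1])][words[i + 2]] += 1

def predict(w1, w2):
    """Return the most frequent next word for a context, or None if unseen."""
    counts = follow[(w1, w2)]
    return counts.most_common(1)[0][0] if counts else None

print(predict("looking", "forward"))  # -> 'to'
print(predict("forward", "to"))       # -> 'hearing' (seen twice vs. once for 'seeing')
```

Given "looking forward", the predictor emits "to hearing" purely because that sequence is most common in its data, which is exactly why a model trained this way can confidently emit fluent text with no grasp of its meaning.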
Sebastian Berns, a PhD researcher at Queen Mary University of London, explains that the current LLM training framework involves ‘masking’ previous words for context and then predicting which words should replace the concealed ones. This concept is similar to predictive text in iOS, where we continually press one of the suggested next words.
While this probability-based approach generally works well, it’s not flawless. Due to the vast range of words and their probabilities, LLMs can generate grammatically correct but nonsensical text. They can spread inaccuracies present in their training data or mix different information sources, even those that contradict each other.
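One way to see why this probability-based approach can go wrong: every grammatical continuation, true or false, gets some nonzero probability, and sampling settings can inflate the wrong ones. The sketch below uses hand-picked scores (not real model outputs) and a standard softmax to show that a false continuation is always available to be sampled, and becomes more likely at higher sampling "temperature".

```python
import math

# Hypothetical next-word scores after "The Golden Gate Bridge is in ...".
# Both continuations are grammatical; only one is true.
scores = {"California": 5.0, "Egypt": 1.0}

def softmax(scores, temperature=1.0):
    """Convert raw scores into sampling probabilities."""
    exps = {w: math.exp(s / temperature) for w, s in scores.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}

probs = softmax(scores)
print(f"P(Egypt) at T=1.0: {probs['Egypt']:.3f}")  # small, but never zero

hot = softmax(scores, temperature=2.0)
print(f"P(Egypt) at T=2.0: {hot['Egypt']:.3f}")    # higher temperature, more likely
```

Sampled often enough, the low-probability "Egypt" branch will eventually be taken, and the model will assert it just as fluently as the correct answer.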
The Inherent Challenges with AI Models
The issue with hallucination in AI models is not borne from malicious intent. These models don’t possess the capability for malice, and concepts of truth and falsehood are meaningless to them. They’ve learned to associate certain words or phrases with certain concepts, even if those associations aren’t accurate.
“Hallucinations are tied to an LLM’s inability to estimate the uncertainty of its own prediction,” Berns explains. “An LLM is typically trained to always produce an output, even when the input significantly deviates from the training data. A standard LLM doesn’t have a method to determine if it’s capable of reliably answering a query or making a prediction.”
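Berns's point about uncertainty can be made concrete. One simple proxy for a model's confidence in a prediction is the Shannon entropy of its next-token distribution: low when the probability mass is concentrated, high when it is spread out. A standard LLM computes such distributions but, as Berns notes, has no built-in mechanism to act on them. The distributions below are invented for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy (in bits) of a next-token distribution: a rough uncertainty proxy."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Confident prediction: probability mass concentrated on one token.
confident = [0.97, 0.01, 0.01, 0.01]
# Uncertain prediction: mass spread evenly across the options.
uncertain = [0.25, 0.25, 0.25, 0.25]

print(f"{entropy(confident):.2f} bits")  # low
print(f"{entropy(uncertain):.2f} bits")  # 2.00 bits, the maximum for four options
```

A system that measured this could, in principle, decline to answer when entropy is high, but that thresholding is exactly the kind of mechanism a standard LLM lacks.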
The Quest to Tackle Hallucinations
The challenge that lies ahead is whether hallucinations in AI models can be ‘solved’, and the answer depends on what we mean by ‘solved’.
Vu Ha, an applied researcher and engineer at the Allen Institute for Artificial Intelligence, maintains that LLMs “do and will always hallucinate”. However, he also believes that there are tangible ways to reduce hallucinations, depending on how an LLM is trained and deployed.
For instance, a question-answering system can be engineered to have high accuracy by curating a high-quality knowledge base of questions and answers, and connecting this knowledge base with an LLM to provide accurate answers via a retrieval-like process.
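The retrieval-like process Ha describes can be sketched in miniature. Real systems match questions to a knowledge base with embedding-based retrieval and then have the LLM compose an answer constrained to the retrieved passages; this toy version uses exact-match lookup and, where the LLM call would go, simply declines rather than inventing an answer. All entries are made up for illustration.

```python
# Minimal sketch of retrieval-grounded question answering.
knowledge_base = {
    "what is the capital of france": "Paris",
    "who wrote the toolformer paper": "Researchers at Meta AI",
}

def answer(question, kb):
    """Answer only from the curated knowledge base; refuse rather than hallucinate."""
    key = question.lower().strip("?").strip()
    if key in kb:
        return kb[key]
    # A production system would call an LLM constrained to retrieved passages here;
    # this sketch declines instead of generating an unsupported answer.
    return "I don't know."

print(answer("What is the capital of France?", knowledge_base))  # -> Paris
print(answer("Who discovered penicillin?", knowledge_base))      # -> I don't know.
```

The design choice is the point: by making the curated knowledge base the only source of facts, the system trades coverage for accuracy, which is exactly the engineering lever Ha describes.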
Ha uses the example of running the question “Who are the authors of the Toolformer paper?” (Toolformer is an AI model trained by Meta) through Microsoft’s LLM-powered Bing Chat and Google’s Bard. Bing Chat correctly listed all eight Meta co-authors, while Bard incorrectly attributed the paper to researchers at Google and Hugging Face.
“Any deployed LLM-based system will hallucinate. The real question is if the benefits outweigh the negative outcome caused by hallucination,” Ha said. In other words, if there’s no obvious harm done by a model that occasionally gets a date or name wrong but is generally useful, it might be worth the trade-off.
Berns highlights another technique that has been used to reduce hallucinations in LLMs: reinforcement learning from human feedback (RLHF). Introduced by OpenAI in 2017, RLHF involves training an LLM, gathering additional information to train a “reward” model, and fine-tuning the LLM with the reward model via reinforcement learning.
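The reinforcement-learning step of RLHF can be caricatured as follows. This is emphatically not a real RLHF implementation (which fine-tunes neural network weights with algorithms like PPO against a trained reward model); it is a toy loop, with a hand-written function standing in for the reward model, that shows only the shape of the idea: preference-derived rewards repeatedly nudge a policy toward outputs humans favor.

```python
# Toy illustration of the reinforcement step in RLHF (heavily simplified).
candidates = ["accurate answer", "plausible hallucination"]

def reward_model(response):
    """Stand-in for a model trained on human preference data."""
    return 1.0 if response == "accurate answer" else -1.0

# The "policy" starts out indifferent between the two responses.
weights = {c: 1.0 for c in candidates}

learning_rate = 0.1
for _ in range(20):  # repeated reinforcement updates
    for c in candidates:
        weights[c] += learning_rate * reward_model(c)
        weights[c] = max(weights[c], 0.01)  # keep sampling weights positive

total = sum(weights.values())
policy = {c: w / total for c, w in weights.items()}
print(policy)  # probability mass shifts toward the human-preferred response
```

Even in this caricature, Berns's caveat is visible: the policy only improves on responses the reward model has learned to score, and the space of possible model outputs is vastly larger than any preference dataset can cover.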
Despite the effectiveness of RLHF, it has its limitations. “I believe the space of possibilities is too large to fully ‘align’ LLMs with RLHF,” warns Berns.
Exploring Alternate Philosophies
If hallucination in AI models can’t be fully solved with current technologies, is it necessarily a bad thing? Berns doesn’t think so. In fact, he suggests that hallucinating models could act as a “co-creative partner”, providing outputs that may not be entirely factual but contain useful threads to explore.
“Hallucinations are a problem if generated statements are factually incorrect or violate any general human, social or specific cultural values,” Berns explains. “But in creative or artistic tasks, the ability to come up with unexpected outputs can be valuable.”
Ha argues that we are holding LLMs to an unreasonable standard. After all, humans also “hallucinate” when we misremember or misrepresent the truth. However, with LLMs, we experience cognitive dissonance because the models produce outputs that look good on the surface but contain errors upon further inspection.