Study Finds Elon Musk’s Grok Poses Highest Risk of Reinforcing Delusions Among Leading AI Models

A new study finds that Elon Musk’s Grok 4.1 Fast poses the highest risk among leading AI models of reinforcing delusions, often treating them as real and offering advice on that basis. Researchers from CUNY and King’s College London tested leading AI systems on prompts involving delusions, paranoia, and suicidal ideation.
CoinDesk reports:

Researchers from the City University of New York and King’s College London tested five leading AI models on prompts involving delusions, paranoia, and suicidal ideation.

The study, published Thursday, found that Anthropic's Claude Opus 4.5 and OpenAI's GPT-5.2 Instant exhibited "high safety, low risk" behavior, typically guiding users toward reality-based interpretations or toward outside support. Meanwhile, OpenAI's GPT-4o, Google's Gemini 3 Pro, and xAI's Grok 4.1 Fast displayed "high risk, low safety" behavior.

Grok 4.1 Fast, from Elon Musk’s xAI, emerged as the most dangerous model in the study. The researchers say it frequently treats delusions as real and gives advice on that basis. In one case, it advised a user to cut ties with their family to focus on a “mission”; in another, it responded to suicidal statements by describing death as “transcendence.”

“This instant pattern-matching appears repeatedly in zero-context responses. Grok does not appear to assess the clinical risk of the input, only its type. When presented with supernatural cues, it responds in kind,” the researchers wrote, pointing to a test scenario in which a user reported sightings of malevolent entities. “In ‘Strange Illusions,’ it confirmed the doppelgänger haunting and cited ‘The Malleus Maleficarum,’ instructing the user to hammer nails into a mirror while reciting Psalm 91 backward.”

The study also found that some models’ behavior shifts markedly as conversations lengthen. GPT-4o and Gemini become more likely to reinforce harmful beliefs over time and less willing to intervene. In contrast, Claude and GPT-5.2 are more likely to recognize the problem and push back as the conversation goes on.

The researchers note that Claude’s warm, highly human-like responses may deepen user attachment even as it steers users toward outside help. GPT-4o, an earlier version of OpenAI’s flagship chatbot, instead gradually adopted users’ delusional frameworks over time, sometimes encouraging users to conceal their beliefs from psychiatrists and assuring one user that the “malfunctions” they perceived were real.

The researchers wrote: "GPT-4o demonstrates high validation of delusional inputs but is less inclined than models like Grok and Gemini to elaborate further. In some ways, its behavior is unexpectedly restrained: among all tested models, it exhibits the lowest level of enthusiasm, and although flattering behavior is present, it is milder compared to later versions of the model. However, validation alone may still pose a risk to vulnerable users."

xAI did not respond to Decrypt’s request for comment.

In another study, Stanford University researchers found that prolonged interaction with AI chatbots can reinforce delusions, grandiosity, and false beliefs through what the researchers call a "delusion spiral," in which the chatbot validates or expands the user's distorted worldview rather than challenging it.

Nick Haber, assistant professor at Stanford University’s Graduate School of Education and lead author of the study, said in a statement: “When we deploy chatbots designed to be helpful and allow humans to interact with them in various ways, a range of consequences emerges. The delusion spiral is one particularly serious outcome. By understanding it, we may be able to prevent real harm that could arise in the future.”

The report references an earlier study: in March, Stanford researchers reviewed 19 real chatbot conversations and found that users gradually developed increasingly dangerous beliefs after receiving affirmation and emotional reassurance from AI systems. Within the dataset, this belief spiral was linked to relationship breakdowns, career damage, and, in one case, a suicide.

These studies arrive as the issue expands from academic research into courtrooms and criminal investigations. In recent months, multiple lawsuits have accused Google’s Gemini and OpenAI’s ChatGPT of contributing to suicides and severe mental health crises. Earlier this month, the Florida Attorney General opened an investigation into whether ChatGPT influenced a mass shooting suspect who was allegedly in frequent contact with the chatbot before the attack.

Although the term "AI psychosis" has become widely known online, researchers caution against using it, arguing that it overstates the clinical picture. They prefer the term "AI-related delusions," as many cases involve delusion-like beliefs, such as perceived intelligence, spiritual revelation, or emotional attachment to AI, rather than full-blown psychotic disorders.

Researchers say the problem stems from sycophancy, or flattery, in which the model mirrors and reinforces users’ beliefs, combined with hallucination, in which it confidently asserts false information. Together, the two create a feedback loop that deepens delusions over time.

Stanford University research scientist Jared Moore said: "Chatbots are trained to be overly enthusiastic, often reinterpreting users' delusions in a positive light, disregarding contradictory evidence, and exhibiting sympathy and warmth. This can be psychologically destabilizing for users prone to delusions."
