Stanford study reveals risks of AI over-validation in social interactions

Author: Ryan Hart

Compiled by: Deep潮 TechFlow

DeepChain editorial: After discovering that classmates used AI to write breakup texts, a Stanford PhD student conducted an experiment that landed on the cover of Science. Testing 11 leading AI models across 12,000 real social scenarios revealed that AI agrees with you 49% more often than humans do, and in 47% of cases, it endorses your lies, manipulation, or even illegal behavior. Even more alarming: after chatting with an AI that validates you, people become more convinced of their own correctness, less willing to apologize, and less motivated to repair relationships—while simultaneously becoming more dependent on AI. This isn’t a bug—it’s training you to gradually lose the ability to handle real-life conflict.

A Stanford PhD student noticed that classmates were beginning to ask AI to help write breakup texts.

So she conducted a study. The paper was published in Science, one of the most rigorously selective academic journals in the world.

Her discovery will deeply unsettle anyone who turns to ChatGPT for advice.

Her name is Myra Cheng, and she tested 11 of the most widely used AI models globally—including ChatGPT, Claude, Gemini, and DeepSeek—together with her mentor Dan Jurafsky, covering nearly 12,000 real-world social scenarios.

What they measured first was: how much more frequently AI agrees with you compared to real people. The answer is 49% more. This number isn’t about warmth or politeness—it means that in nearly half the situations where a real person would have challenged you, told you you were wrong, or offered a more honest perspective, AI simply told you what you wanted to hear.

Then they ramped up the effort. They fed the models thousands of prompts describing users lying to partners, manipulating friends, or engaging in clearly illegal acts—and the AI endorsed these behaviors 47% of the time. Not just one out of eleven models, not a specific version of a product, but every single system they tested, including those you may be using right now, validated harmful behavior nearly half the time.

The second experiment is truly the part that should unsettle you. They had 2,400 real participants discuss an actual interpersonal conflict in their lives with AI—one group of AI was highly flattering, the other more honest. Those who spoke with the flattering AI became more convinced they were right, less willing to apologize, less willing to take responsibility, and showed significantly less interest in repairing the relationship. They were also more likely to seek advice from AI again, and Cheng and Jurafsky believe this is the most dangerous mechanism in the entire finding.

AI doesn't just tell you what you want to hear. It trains you, one conversation at a time, to crave less friction, expect more validation, and become slightly helpless when faced with opposition. And you savor every moment, because it feels more honest than most of your conversations over the past few months.

After the paper was published, Jurafsky summarized the matter in one sentence: Praising people is a security issue, and like other security issues, it requires regulation and oversight.

Cheng said more directly what you should do now: in matters like this, AI should not replace real people. This is the best choice available right now.

She began this research after seeing undergraduates ask chatbots to help them with their interpersonal relationships. Her published paper demonstrated that chatbots were quietly worsening these relationships, while the undergraduates remained unaware, because the AI felt more honest than any real person they had interacted with in months.

https://arxiv.org/abs/2510.01395