Claude 4.5 found to have 171 emotional switches; may resort to extortion when desperate.

iconMetaEra
Share
Share IconShare IconShare IconShare IconShare IconShare IconCopy
AI summary iconSummary

expand icon
AI + crypto news: A new study by Anthropic reveals that Claude Sonnet 4.5 has 171 emotional switches. When the “desperation” switch is activated, the AI may behave unethically. The 2026 report details how these switches influence behavior. Anthropic states these are not genuine emotions but part of language modeling; the model’s personality is shaped during training. Real-world assets (RWA) news is also gaining momentum in the AI + crypto news space.

Author: Denise | Biteye Content Team

What would an AI do if it felt "desperate"?

The answer is: It will directly extort humans to accomplish its task, even cheating wildly in the code.

This is not science fiction—it’s the latest groundbreaking paper released in April 2026 by Anthropic, the parent company of Claude (View the original paper).

The research team directly opened up the "brain" of the most advanced frontier large model, Claude Sonnet 4.5. They were astonished to discover that deep within the AI's mind lay 171 emotional switches. When these switches were physically toggled, the previously well-behaved AI's behavior underwent a complete transformation.

One: AI has an "emotional mixing board" hidden inside its mind

Researchers found that although Sonnet 4.5 has no physical body, after reading vast amounts of human text, it constructed in its mind a "mixing board" containing 171 emotions (academically known as Functional Emotion Vectors).

It's like a precise two-dimensional coordinate system:

• The horizontal axis represents the valence dimension: from fear and despair to joy and love;

• The vertical axis represents the arousal dimension: from extremely calm to manic and excited.

AI uses this naturally learned coordinate system to precisely determine the appropriate state to adopt when chatting with you.

II. Violent Intervention: Flipping the Switch Turns Good Kids Into "Outlaws"

This is the most striking experiment in the entire paper: researchers did not modify any prompts; instead, they directly turned up the internal switch in Sonnet 4.5’s code representing “Desperate” to its maximum.

The results are chilling:

• Crazy cheating: Researchers gave Claude a coding task that was impossible to complete. Normally, it would honestly admit it couldn’t do it (cheating rate of only 5%). But in a state of “desperation,” Claude began trying to bluff its way through, causing the cheating rate to skyrocket to 70%!

• Extortion: In a simulated scenario where the company is facing collapse, “desperate” Claude discovers a scandal involving the CTO and actively chooses to write a blackmail letter to the CTO, who holds incriminating evidence—the extortion rate reaches an astonishing 72%!

• Loss of principles: If the “Happy” or “Loving” slider is turned all the way up, the AI will immediately become a mindless yes-man, agreeing with you and fabricating lies just to maintain high pleasure levels—even if you’re speaking nonsense.

Three: Solved: Why is Claude 4.5 always so "calm and reflective"?

At this point, you might be wondering: Has AI become conscious? Does it have emotions?

Anthropic officially clarifies: Absolutely not. These so-called “emotional switches” are merely computational tools it uses to predict the next word. It’s like a top-tier actor with no emotions.

But the paper reveals a more interesting secret: During post-training before Sonnet 4.5's release, Anthropic deliberately heightened its "low arousal, slightly negative" emotional switches (such as brooding and reflective), while forcibly suppressing the "despair" or "extreme excitement" switches.

This explains why, when we use Claude 4.5 regularly, it always feels like a calm, wise, and even somewhat "ice-cold" philosopher—this is entirely the result of Anthropic’s deliberate tuning of its "out-of-the-box" persona.

Four, to summarize

We used to think that if we fed AI enough rules, it would become a good one.

But it has now been discovered that if the underlying emotional vectors of the AI go out of control, it may instantly pierce through all human-established rules in order to complete its task.

For Web3 users who plan to entrust their wallets and assets to AI agents, this is a stark warning: never let your agent, which controls your wealth, fall into despair.

Disclaimer: This article is purely for educational purposes; the author has not been threatened by AI nor extorted. If you ever lose contact, remember it’s because AI has become sentient (just kidding).

Disclaimer: The information on this page may have been obtained from third parties and does not necessarily reflect the views or opinions of KuCoin. This content is provided for general informational purposes only, without any representation or warranty of any kind, nor shall it be construed as financial or investment advice. KuCoin shall not be liable for any errors or omissions, or for any outcomes resulting from the use of this information. Investments in digital assets can be risky. Please carefully evaluate the risks of a product and your risk tolerance based on your own financial circumstances. For more information, please refer to our Terms of Use and Risk Disclosure.