Anthropic Reports 31.5% Hijack Rate for Opus 4.8 Browser Agent Before Safeguards

CryptoBriefing

Anthropic disclosed a 31.5% hijack rate for its Opus 4.8 browser agent before safeguards are applied. The figure measures how often prompt injection attacks succeeded against the model during web tasks. It comes from a 244-page safety report, released May 28, that covers four agentic surfaces. The data shows the need for stronger guardrails, and crypto projects that use AI agents for on-chain data scraping, DEX interactions, and trading are exposed to this class of attack.

Point a red-teamer at Anthropic’s newest model while it’s browsing the web, and the attacker hijacks it nearly one in three times. That’s the raw stat: a 31.5% prompt injection success rate for Claude Opus 4.8’s browser agent before defensive safeguards engage.

The transparency gap between labs

Anthropic dropped a 244-page safety report on May 28, covering four distinct agentic surfaces: browsing the web, writing code, coordinating with other AI agents, and interacting with external tools.

OpenAI reported on just one surface: connectors. Google moved the entire subject out of its model card and into a separate safety framework document. Meta didn’t ship a closed-model card at all.

The 31.5% figure is a pre-safeguard number, meaning it represents the raw model’s susceptibility before Anthropic’s defensive layers kick in. Production deployments add guardrails, monitoring, and filtering that reduce real-world exploit rates. But the baseline vulnerability is exactly the kind of data that security architects need to build those guardrails correctly.

What Opus 4.8 actually does differently

False negatives on coding errors, where the model fails to catch its own mistakes, dropped from 19.7% to 3.7%. Opus 4.8 also introduces dynamic multi-agent orchestration at scale, coordinating hundreds of sub-agents simultaneously to manage large software projects.

Why crypto should pay attention

A 31.5% pre-safeguard hijack rate for browser-based agents should give anyone running AI systems in crypto pause. Browser agents are precisely the kind of tool that crypto projects deploy for monitoring dashboards, scraping on-chain data, interacting with DEX frontends, and executing trades through web interfaces.

Prompt injection in a browser agent means a malicious website, a compromised API response, or even a cleverly crafted token name could potentially redirect an AI agent’s behavior. In traditional software, that’s a data breach. In crypto, that’s a drained wallet.
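One common way to limit that damage is to keep a hard-coded policy check outside the model, so that an injected instruction cannot widen what the agent is allowed to do. The sketch below is illustrative only and is not taken from Anthropic’s report: the names `ProposedAction`, `ALLOWED_DESTINATIONS`, and `MAX_SPEND_PER_ACTION` are hypothetical, and the address is a placeholder.

```python
from dataclasses import dataclass

# Hypothetical policy values; a real operator would load these from
# reviewed configuration, never from anything the agent read on the web.
ALLOWED_DESTINATIONS = frozenset({"0xROUTER_PLACEHOLDER"})
MAX_SPEND_PER_ACTION = 100.0  # in the operator's unit of account


@dataclass(frozen=True)
class ProposedAction:
    """An action the agent wants to take, expressed as plain data."""
    destination: str
    amount: float


def is_permitted(action: ProposedAction) -> bool:
    """Return True only if the action passes every fixed policy check.

    The check never inspects page text, token names, or model output
    beyond the structured fields, so injected instructions cannot
    change the rules it enforces.
    """
    if action.destination not in ALLOWED_DESTINATIONS:
        return False
    # Rejects zero, negative, oversized, and NaN amounts.
    if not (0 < action.amount <= MAX_SPEND_PER_ACTION):
        return False
    return True
```

A check like this does not stop the model from being hijacked. It only caps what a hijacked agent can do, which is why it sits between the agent and the signing step rather than inside the prompt.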

Multi-agent orchestration adds another layer of complexity. When Opus 4.8 coordinates hundreds of sub-agents, a single successful prompt injection could potentially cascade across the entire workflow. In a crypto context, that’s the difference between one compromised transaction and a systemic failure across an entire automated trading operation.
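The scaling concern can be made concrete with a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that each sub-agent faces one injection attempt and that attempts succeed independently with the same probability; neither assumption comes from the report, and real workflows will differ. The 1% figure in the usage note is a hypothetical post-safeguard rate, not a published one.

```python
def prob_any_compromised(p: float, n: int) -> float:
    """Probability that at least one of n independent attempts succeeds.

    p is the per-attempt success probability and n the number of exposed
    sub-agents. Independence is an assumption made for illustration.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    # Complement of "every attempt fails".
    return 1.0 - (1.0 - p) ** n
```

Under those assumptions, `prob_any_compromised(0.315, 5)` is roughly 0.85, and even a hypothetical residual rate of 1% gives `prob_any_compromised(0.01, 100)` of roughly 0.63. The point is not the exact numbers but the shape: small per-agent rates compound quickly as the number of exposed sub-agents grows.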
