NVIDIA Blackwell Outperforms AMD MI355X in AA-AgentPerf DeepSeek V4 Pro Benchmark

CryptoBriefing

NVIDIA’s Blackwell systems, including the B200 and GB300, outperformed AMD’s MI355X in the AA-AgentPerf DeepSeek V4 Pro benchmark. The test, which uses real-world coding tasks, shows NVIDIA’s edge in power-efficient agentic inference. Results are normalized per accelerator and per megawatt.

Artificial Analysis has dropped something the AI hardware world has been quietly waiting for: an actual benchmark that measures how well chips handle agentic AI workloads in the real world. The benchmark is called AA-AgentPerf, and its initial results running DeepSeek V4 Pro tell a story that AMD probably would rather not hear right now.

NVIDIA’s Blackwell systems, specifically the B200 and GB300, consistently outperformed AMD’s Instinct MI355X GPUs on power-efficient agentic inference.

What AA-AgentPerf actually measures

AA-AgentPerf is the first multi-vendor open benchmark from Artificial Analysis designed specifically to measure hardware performance on agentic coding tasks.

The benchmark evaluates how many concurrent agents a system can support while meeting specific service-level objectives. Those SLOs cover output token speeds ranging from 20 to 300 tokens per second and time-to-first-token (TTFT) targets between 3 and 10 seconds.
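To make the "concurrent agents under an SLO" idea concrete, here is a minimal Python sketch. It is not Artificial Analysis's harness: the `Slo` and `Measurement` types, the `max_agents_meeting_slo` function, and every number in `SAMPLE` are hypothetical. It also assumes the token-speed target is a per-agent floor and the TTFT target is a ceiling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """Hypothetical service-level objective for one agent."""
    min_output_tps: float  # assumed floor on output tokens per second
    max_ttft_s: float      # assumed ceiling on time-to-first-token, seconds


@dataclass(frozen=True)
class Measurement:
    """One made-up observation at a given concurrency level."""
    concurrent_agents: int
    output_tps: float
    ttft_s: float


def max_agents_meeting_slo(measurements, slo):
    """Return the highest tested concurrency that satisfies the SLO, or 0."""
    passing = [
        m.concurrent_agents
        for m in measurements
        if m.output_tps >= slo.min_output_tps and m.ttft_s <= slo.max_ttft_s
    ]
    return max(passing, default=0)


# Illustrative numbers only; these are not benchmark results.
SAMPLE = [
    Measurement(8, 120.0, 1.5),
    Measurement(16, 80.0, 2.8),
    Measurement(32, 45.0, 4.9),
    Measurement(64, 22.0, 9.5),
    Measurement(128, 11.0, 21.0),
]

if __name__ == "__main__":
    relaxed = Slo(min_output_tps=20.0, max_ttft_s=10.0)
    strict = Slo(min_output_tps=100.0, max_ttft_s=3.0)
    print(max_agents_meeting_slo(SAMPLE, relaxed))
    print(max_agents_meeting_slo(SAMPLE, strict))
```

In this sketch, tightening the SLO lowers the supported agent count, which is why a single system can post very different numbers at different points in the 20 to 300 tokens-per-second range.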

Rather than relying on synthetic evaluation methods, the benchmark leverages actual coding trajectories. Results are then normalized per accelerator and per megawatt, which creates a comparison framework that accounts for both raw performance and energy consumption.
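The normalization step is simple arithmetic, sketched below in Python. The function name `normalize_agents` and the example inputs are hypothetical, and the sketch assumes power is measured as total system draw in kilowatts; the article does not say how Artificial Analysis measures power.

```python
def normalize_agents(agents, accelerators, system_power_kw):
    """Normalize a supported-agent count per accelerator and per megawatt.

    All inputs are illustrative; system_power_kw is assumed to be the
    total power draw of the system under test, in kilowatts.
    """
    if accelerators <= 0 or system_power_kw <= 0:
        raise ValueError("accelerators and system_power_kw must be positive")
    return {
        "agents_per_accelerator": agents / accelerators,
        # 1 MW = 1000 kW, so scale the per-kW figure by 1000.
        "agents_per_megawatt": agents * 1000.0 / system_power_kw,
    }


if __name__ == "__main__":
    # Made-up example: 64 agents on 8 accelerators drawing 16 kW in total.
    print(normalize_agents(agents=64, accelerators=8, system_power_kw=16.0))
```

Two systems with the same per-accelerator figure can still differ per megawatt if one draws more power, which is the distinction the benchmark's second normalization is meant to surface.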

DeepSeek V4 Pro enters the chat

The model at the center of this benchmark is DeepSeek V4 Pro, which has been turning heads since its release around April 2026. It scored 1554 on the GDPval-AA benchmark, placing it firmly among the top-performing open-weights models available today.

DeepSeek V4 Pro (Max) also earned a score of 52 on the Artificial Analysis Intelligence Index, ranking it second among open-weights reasoning models.

NVIDIA vs. AMD and what it means for the data center market

The initial AA-AgentPerf results paint a clear picture of competitive positioning. NVIDIA’s Blackwell architecture, represented by the B200 and GB300 systems, delivered superior performance per watt compared to AMD’s MI355X across the tested agentic workloads.

The per-megawatt normalization is especially telling. Data centers are increasingly constrained not by rack space or capital budgets but by power availability. A chip that can support more concurrent agents per megawatt of power consumed has a tangible, quantifiable advantage that translates directly to the bottom line.

For NVIDIA, these results reinforce a narrative the company has been building around Blackwell’s efficiency characteristics. The timing is notable: the performance leadership data was reported as of June 12, 2026, suggesting NVIDIA moved quickly to publicize the favorable results through its developer blog.
