DeepMind partners with EVE Online to test AI in a 23-year-old virtual universe

Demis Hassabis, CEO of DeepMind and the father of AlphaGo, has been using games for AI research for over a decade.

This time, he is dropping AI into a "living universe" that has been running for 23 years: EVE Online, a space MMO so complex that even its tutorial can deter new players.

A game of chess has an end, but EVE does not.

In early May, DeepMind announced a research partnership with the studio behind EVE Online. The reason is that EVE's complex, player-driven universe offers a safe sandbox for testing AI memory, continual learning, and long-horizon planning.

The goal of the partnership is not a more entertaining game or better gameplay, but progress on three of the hardest problems in current AI agent research. Hassabis is betting that the answers can be found in a 23-year-old game.

Fenris Creations (formerly CCP Games) announces a partnership with DeepMind

On May 6, the company behind EVE Online announced four things on the same day:

It regained its independence from parent company Pearl Abyss;
It renamed itself Fenris Creations;
It completed a $120 million transaction;
It took on Google as a minority shareholder as part of the move to independence, and at the same time launched a research collaboration with Google DeepMind.

Fenris Creations CEO Hilmar Veigar Pétursson stated in the announcement:

This transition does not involve any layoffs or restructuring; the team, products, and development plans remain unchanged. EVE continues.

Judging by its operating metrics, the company is coming to the table with "real ammunition," not selling assets to survive.

EVE Online generated over $70 million in revenue in 2025. November set a record for the highest monthly income in the game's history, and Q4 was its second-highest revenue quarter ever.

The spin-off means EVE is now run by a company that can decide on research collaborations for itself, no longer bound by the strategic goals of a larger game publisher.

Image: the box of the 1997 board game published by Fenris. The name "Fenris" predates EVE Online by six years, so the rebranding as Fenris Creations revives an older name rather than marking a fresh start.

Why did DeepMind choose EVE?

A 23-year-old "artificial society" that AI benchmarks struggle to replicate

Many people, upon hearing "gaming + AI research," immediately think of AlphaGo or AlphaStar, but EVE is different from them.

Go and StarCraft share a common characteristic: each match has a beginning, an end, and clear rules for determining victory or defeat.

AlphaGo aims to win a game of Go, and AlphaStar aims to win a StarCraft match. Both follow the paradigm of "single-session intelligence." EVE, by contrast, has no final outcome.
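The contrast between the two paradigms can be sketched in a few lines of Python. This is a toy illustration only, not anything taken from DeepMind or Fenris Creations: the class names EpisodicGame and PersistentWorld and all of their methods are hypothetical, invented here to show that a match-based environment resets and terminates, while a persistent one only ever moves forward.

```python
class EpisodicGame:
    """Toy stand-in for a match-based game (hypothetical, for illustration)."""

    def __init__(self, max_moves: int = 3):
        self.max_moves = max_moves
        self.moves = 0

    def reset(self) -> None:
        # Every match starts again from the same clean state.
        self.moves = 0

    def step(self) -> bool:
        # Returns True once the match is over and a result can be scored.
        self.moves += 1
        return self.moves >= self.max_moves


class PersistentWorld:
    """Toy stand-in for a persistent world: no reset, no terminal state."""

    def __init__(self):
        self.tick = 0
        self.history = []  # consequences of past actions accumulate here

    def step(self, action: str) -> None:
        self.tick += 1
        self.history.append((self.tick, action))


def play_matches(game: EpisodicGame, n_matches: int) -> int:
    """Play n independent matches; nothing carries over between them."""
    finished = 0
    for _ in range(n_matches):
        game.reset()
        while not game.step():
            pass
        finished += 1
    return finished


def live_in_world(world: PersistentWorld, actions: list[str]) -> int:
    """Act in a world that keeps running; its state is never wiped."""
    for action in actions:
        world.step(action)
    return world.tick
```

In the episodic loop, each call to reset() erases whatever happened before, which is why a win rate over many matches is a meaningful score. In the persistent loop there is nothing to reset, so every action stays in the world's history and any evaluation has to account for it.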

EVE Online is known for its single-shard design: one shared universe in which thousands of players compete, trade, form alliances, and wage war within the same persistent world.

Players have established real economic systems, political alliances, military factions, trade routes, historical grievances, and multi-year war plans here.

Some campaigns take a full year from planning to conclusion. The rise and fall of certain alliances are studied by later players as real history.

Hilmar said in the announcement: "EVE is one of the few places where you can explore intelligent problems in an environment that already operates like the real world."

Hassabis also mentioned that he has been playing games since childhood, began his career designing AI-based game simulations, and that the research behind AlphaGo, AlphaStar, and SIMA is deeply tied to gaming—EVE is now the next step:

I’m excited to collaborate with Fenris Creations to safely explore new gaming experiences and advance AI research in their player-built, intricately detailed universe.

Most AI benchmarks are like health check-ups; EVE is more like throwing an AI into a "synthetic society" that has been ongoing for 23 years.

The three toughest challenges for the agent

Just another day in the life of an EVE player.

The official announcement clearly outlined three research directions: long-horizon planning, memory, and continual learning.

These three directions are widely regarded as among the hardest open problems in current AI agent research.

If someone you know has played EVE Online for over ten years, ask them to open their account and show you their friends list—you’ll likely see dozens of groups and hundreds of names, with notes like “Debt from the 2018 Delve campaign,” “Traitor within Goonswarm, don’t cooperate,” or “This guy’s a spy; everyone in the org knows it.”

This is not a context window. It is long-term memory that spans sessions and reaches back a decade.
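To make the distinction concrete, here is a minimal Python sketch of memory that outlives a single session. It is an assumption-laden toy, not a description of how DeepMind's agents or EVE's client store anything: the ContactNotes class, its methods, and the JSON file layout are all hypothetical. The point is only that the notes live on disk, outside any one session's working context.

```python
import json
from pathlib import Path


class ContactNotes:
    """Hypothetical note store that survives across sessions via a JSON file."""

    def __init__(self, path: Path):
        self.path = path
        # Load whatever earlier sessions wrote; start empty on the first run.
        if path.exists():
            self.notes = json.loads(path.read_text(encoding="utf-8"))
        else:
            self.notes = {}

    def remember(self, name: str, note: str) -> None:
        # Append the note and persist immediately so a later session sees it.
        self.notes.setdefault(name, []).append(note)
        self.path.write_text(json.dumps(self.notes), encoding="utf-8")

    def recall(self, name: str) -> list[str]:
        # Unknown contacts simply have no notes.
        return self.notes.get(name, [])
```

A second ContactNotes object created later from the same file path will recall what the first one stored, which is the property a context window alone does not provide.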

EVE players pass the memory test every day, and they pass the continual-learning test as well.

In January 2014, the Battle of B-R5RB lasted approximately 21 hours, involved over 7,500 characters, resulted in the destruction of 75 Titans, and incurred losses equivalent to about $300,000 in real currency. The entire battle was triggered by a sovereignty bill that went unpaid when its automatic payment failed.

After this battle, fleet tactics across the entire game were rewritten. For years afterward, every alliance refined their fleet compositions and tactical systems based on post-battle analysis. Changes were made monthly, and every defeat was broken down into actionable strategy updates.

As for long-term planning, the standard unit of time in EVE alliance warfare is months, not hours. A cross-system war, from preparation to launch, involves hundreds of players coordinating spontaneously (building ships, transporting resources, conducting diplomacy, infiltrating rivals, and running counterintelligence) without any central task assignment, advancing together over months toward a shared goal.

This collaborative system evolved organically among players over 23 years.

The three toughest challenges in current AI agent evaluations happen to be the daily routines of EVE players.

Twenty-three years of player-driven evolution in EVE have created an environment that is constantly changing, inherently complex, and devoid of shortcuts—complexity that cannot be artificially synthesized in a laboratory.

DeepMind's SIMA 2, released in November 2025, has evolved from "executing instructions" to "understanding goals, reasoning through processes, and learning while playing."

In terms of its research question, the EVE project follows the same path as SIMA 2, that of "games as agent training grounds," except that this time the arena is a universe that has been operating continuously for 23 years.

Image: in-game battle footage from EVE Online. Large-scale conflicts, spontaneously organized by players and often lasting for hours, are the key reason DeepMind chose EVE as a research environment for long-term planning and continual learning.

DeepMind gets an offline sandbox

Not the player universe

DeepMind's collaboration with Fenris is more conservative than expected; DeepMind did not receive direct access to the live servers of active players.

DeepMind's official announcement states: The initial research will be conducted on an offline version of EVE Online, using local servers to test and evaluate the model in a controlled environment, without connecting to the live EVE Online servers.

On one hand, the offline version means DeepMind neither consumes live player data nor disrupts the real server economy, which avoids privacy and compliance complications.

On the other hand, the offline version of EVE can still retain its core designs, including the complex rule system, ships and economic mechanisms, and star region structure.

What DeepMind gets is a complex world that has been stress-tested by 23 years of play, and that now serves as an exam in which its agents must survive.

From Atari to EVE

Where does this road lead?

Looking back at the training grounds DeepMind has chosen over the past decade, a clear evolutionary path emerges.

From 2013 to 2015, Atari served as the starting point. DQN placed agents in clearly defined, rule-bound games like Breakout and Space Invaders, testing their reaction speed and value estimation.

From 2016 to 2017 came AlphaGo and AlphaZero. Go has well-defined rules and a vast but closed action space, and it tests search and long-chain reasoning.

In 2019, AlphaStar took on StarCraft II, the first time DeepMind had entered a real-time, imperfect-information, multi-agent environment. It tested real-time decision-making under partial observability.

In 2024, SIMA set out to build a general agent that works across multiple games, focusing on transfer and generalization.

In 2025 came the SIMA 2 upgrade, which not only executes commands but also converses with users, reasons about goals, and improves itself during gameplay.

Each generation of environment incorporates more of the characteristics of the real world: transitioning from closed rules to open rules, from perfect information to imperfect information, and from single-game competition to cross-game transfer.

However, these environments were mostly closed, separable, and repeatable evaluation tasks. Atari consisted of fixed-rule arcade games, AlphaStar competed in discrete StarCraft matches, and SIMA tested generalization across multiple 3D virtual environments.

What sets EVE apart is that it is a persistent world with long-running, player-driven systems that continuously evolve in terms of economy and politics.

It evolved organically over 23 years through real players operating within an open-rule world: a fully player-driven economy (with ISK price fluctuations rivaling real financial markets), cross-alliance political structures (diplomacy, espionage, ceasefire agreements), and a complete warfare ecosystem ranging from small-scale skirmishes to 21-hour mega-battles.

The industry consensus on agent evaluation is becoming clearer: single-task benchmarks have long since stopped revealing anything new, yet there is still no proper evaluation framework for long-term memory, multi-week planning, and learning from failure.

So, DeepMind’s choice this time was: rather than creating a new synthetic environment, enter a “human-made society” that has already been stress-tested by players for 23 years.

But a bigger issue has also emerged:

What is still missing between an AI agent that can persist, learn, and plan within EVE, and an autonomous agent operating in the real world?

Reference materials:

https://x.com/GoogleDeepMind/status/2052011542707630461

https://www.ccpgames.com/news/2026/studio-behind-eve-online-goes-independent-rebrands-as-fenris-creations-enters-research-partnership-with-google-deepmind

https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/

This article is from the WeChat public account "New Intelligence Yuan," authored by ASI Revelation, edited by Yuan Yu.