Meta's AI Talent Exodus and $135 Billion Chip Spending Spree


Article by Ada, 深潮 TechFlow

Pang Ruoming left Meta before he had even warmed his seat.

In July 2025, Zuckerberg lured away one of the most sought-after Chinese engineers in AI infrastructure from Apple with a multi-year compensation package totaling over $200 million. Pang Ruoming was assigned to Meta’s Superintelligence Lab to build the infrastructure for the next generation of AI models.

Seven months later, OpenAI poached him.

According to The Information, OpenAI launched a months-long recruitment campaign targeting Pang Ruoming. Although Pang had told colleagues that he was very happy working at Meta, he ultimately decided to leave. According to Bloomberg, his compensation package at Meta was tied to milestones, and leaving early meant forfeiting most of his unvested equity.

$200 million can't buy seven months of loyalty.

This is not a simple career change story.

One person's departure, a signal to many

Pang Ruoming was not the first to leave.

Last week, Mat Velloso, Head of Product for the Superintelligence Lab developer platform at Meta, also announced his departure. He joined Meta in July last year after leaving Google DeepMind, staying for less than eight months. Prior to that, in November 2025, Yann LeCun, a Turing Award winner and Meta’s Chief AI Scientist who had been with the company for 12 years, announced his departure to start a venture focused on the “world model” he has long advocated for. Recently, Russ Salakhutdinov, a core disciple of Geoffrey Hinton and Vice President of Generative AI Research at Meta, also officially announced his departure.

To understand Meta AI's talent drain, you first need to understand just how damaging Llama 4 has been.

In April 2025, Meta prominently launched the Llama 4 series models, Scout and Maverick. The official benchmark numbers were impressive, claiming superior performance on core evaluations such as MATH-500 and GPQA Diamond and outperforming GPT-4.5 and Claude 3.7 Sonnet.

However, this flagship model, which embodies Meta's ambitions, quickly revealed its true limitations in third-party blind tests by the open-source community, showing a dramatic gap between its actual generalization and reasoning capabilities and the advertised claims. Faced with strong criticism from the community, Chief AI Scientist Yann LeCun ultimately admitted that the team had used different model versions to run different test sets in order to optimize the final scores.

In the rigorous AI academic and engineering communities, this crosses an unforgivable line. In effect, the team trained Llama 4 to be nothing more than a "test-prep specialist" that excels only at past exam questions, rather than a true "top student" with cutting-edge intelligence. Tested on math, it performs like a math champion; tested on programming, like a programming champion. But these are not the same model.

In AI academia, this is called "cherry-picking"; in exam-oriented education, it's called "proxy testing."

For Meta, which has long positioned itself as a “beacon of open source,” this controversy directly shattered its most valuable asset in the developer ecosystem: trust. The immediate consequence was that Zuckerberg lost all confidence in the engineering standards of the original GenAI team, setting the stage for the subsequent arrival of external executives and the marginalization of core infrastructure teams.

He spent $14.3 to $15 billion to acquire a 49% stake in the data labeling company Scale AI, appointed 28-year-old Scale AI CEO Alexandr Wang as Meta's Chief AI Officer, and established the Meta Superintelligence Lab (MSL). In the new organizational structure, Turing Award winner LeCun reported to Wang. In October, Meta eliminated approximately 600 positions at MSL, including members of FAIR, the research division originally founded by LeCun.

The flagship model Llama 4 Behemoth, originally scheduled for release in the summer of 2025, has been repeatedly delayed—first pushed to autumn, and ultimately put on indefinite hold.

Meta has shifted its focus to developing the next-generation text model codenamed "Avocado" and the image/video model codenamed "Mango." According to reports, Avocado aims to compete with GPT-5 and Gemini 3 Ultra. Originally scheduled for delivery by the end of 2025, it has slipped to the first quarter of 2026 after internal benchmarks fell short and training required further optimization. Meta is now considering a closed-source release, abandoning the Llama series' open-source tradition.

Meta made two fatal mistakes with its AI models. First, it falsified benchmark results, directly destroying the trust of the developer community. Second, it forced FAIR—a research division requiring a decade of dedicated effort—into a product organization obsessed with quarterly KPIs. Together, these two actions are the root cause of the current talent exodus.

Proprietary chip: The other broken leg

Talent is leaving, and there are also issues with the chips.

According to The Information, Meta last week shut down its most advanced in-house AI training chip project.

Meta’s in-house chip initiative is called MTIA (Meta Training and Inference Accelerator). The company’s initial roadmap was ambitious: MTIA v4, codenamed “Santa Barbara”; v5, codenamed “Olympus”; and v6, codenamed “Universal Core,” were slated for delivery between 2026 and 2028. Olympus was designed as Meta’s first chip built on a 2nm chiplet architecture, meant to support both high-end model training and real-time inference, and ultimately to replace NVIDIA in Meta’s training clusters.

Now, this state-of-the-art training chip has been discontinued.

Meta hasn't been idle—MTIA has made real progress in inference. The MTIA v3 inference chip, codenamed "Iris," has been widely deployed across Meta's data centers, primarily powering recommendation systems for Facebook Reels and Instagram, reportedly reducing total cost of ownership by 40% to 44%. However, inference and training are different problems: inference executes an already-trained model, while training builds one, demanding far more compute, memory bandwidth, and chip-to-chip interconnect. Meta can build its own inference chips, but it has yet to produce a training chip that competes head-on with NVIDIA.

This is not the first such retreat. In 2022, Meta attempted to develop its own inference chip but abandoned the project after a small-scale deployment failed, placing a large order with NVIDIA instead.

The setback with its in-house chip development directly accelerated Meta's rush to acquire external chips.

$135 billion in panic buying

In January 2026, Meta announced a capital expenditure budget of $115 billion to $135 billion for the year, nearly double last year’s $72.2 billion. The majority of this funding will be allocated to chips.

Within 10 days, Meta executed three large orders in succession:

On February 17, Meta signed a multi-year, cross-generational strategic partnership agreement with NVIDIA. Meta will deploy “millions” of NVIDIA Blackwell and next-generation Vera Rubin GPUs, along with Grace standalone CPUs. Analysts estimate the deal is worth hundreds of billions of dollars, making Meta the world’s first hyperscaler customer to deploy NVIDIA Grace standalone CPUs at scale.

On February 24, Meta signed a multi-year chip agreement with AMD worth $60 billion to $100 billion. Meta will procure AMD’s latest MI450 series GPUs and sixth-generation EPYC CPUs. As part of the deal, AMD granted Meta warrants for up to 160 million common shares, equivalent to approximately 10% of AMD’s outstanding shares, at a price of $0.01 per share, vesting in tranches upon delivery milestones.

On February 26, according to The Information, Meta signed a multi-year, multi-billion-dollar agreement with Google to lease Google Cloud’s TPU chips for training and running its next-generation large language models. Meanwhile, the two companies are also discussing Meta’s potential direct purchase of TPUs starting in 2027 for deployment in its own data centers.

A social media company placed orders totaling potentially over $100 billion with three chip suppliers within 10 days.

This is not a diversified strategy. This is panic buying.

The three layers of compute anxiety

Why is Meta in such a hurry?

First, relying on in-house chips is no longer an option. The cancellation of the most advanced training chip project means that, for the foreseeable future, Meta can only meet its AI training needs through external purchases. While the MTIA chip for inference can handle mature workloads like recommendation systems, training cutting-edge models like Avocado—comparable to GPT-5—requires NVIDIA or equivalent hardware.

Second, competitors won’t wait. OpenAI has secured massive resources from Microsoft, SoftBank, and the UAE sovereign wealth fund. Anthropic has locked in supply agreements with Google and Amazon for 1 million TPU and Trainium chips each. Google’s Gemini 3 was trained entirely on TPUs. If Meta fails to secure sufficient computing power, it won’t even hold a ticket to the race.

Third, and perhaps most fundamentally, Zuckerberg needs to compensate for his lack of R&D strength with purchasing power. The failure of Llama 4, the loss of key talent, and setbacks in developing in-house chips have collectively made Meta’s AI narrative vulnerable on Wall Street. Signing major deals with NVIDIA, AMD, and Google right now sends at least one clear signal: we have the money, we’re spending it, and we haven’t given up.

Meta’s current strategy is: if you can’t nail the software, crush it with hardware; if you can’t retain talent, buy chips. But the AI race isn’t a game you win by writing checks. Compute is a necessary condition, not a sufficient one. Without a top-tier model team and a clear technical roadmap, even the most abundant chips are just expensive inventory sitting in a warehouse.

The buyer's dilemma

Looking back at Meta's three transactions in February, an interesting detail has been overlooked by most people.

Meta is purchasing NVIDIA’s current Blackwell and future Vera Rubin chips; its deal with AMD covers the MI450 and future MI455X; and it is leasing Google’s current Ironwood TPU, with plans to purchase directly next year.

Three suppliers, three entirely different hardware architectures and software ecosystems.

This means Meta must constantly switch between three entirely different underlying ecosystems: NVIDIA’s CUDA, AMD’s ROCm, and Google’s XLA/JAX. While a multi-vendor strategy can mitigate supply chain risks and reduce hardware procurement premiums, it will lead to an exponential increase in engineering complexity.

This is precisely Meta’s most critical weakness today: efficiently training a trillion-parameter model across three fundamentally different underlying hardware platforms requires more than engineers who understand CUDA—it demands architects capable of building a cross-platform training framework from the ground up.

There may be fewer than 100 such people worldwide. Pang Ruoming is one of them.

Spending $100 billion to acquire the world’s most complex hardware setup while simultaneously losing the minds capable of operating that hardware is the most surreal aspect of Zuckerberg’s high-stakes gamble.

Zuckerberg's bet

Zooming out, Zuckerberg’s approach to AI over the past 18 months mirrors his earlier all-in bet on the metaverse.

Recognize a trend, invest heavily, hire aggressively, face setbacks, pivot strategy, then invest heavily again.

From 2021 to 2023, it was the metaverse—resulting in annual losses of tens of billions, and ultimately causing the stock price to drop from $380 to $88. From 2024 to 2026, it’s AI—again, pouring money without restraint, frequent organizational restructuring, and the same narrative of “trust me, I have a vision.”

To be fair, this AI boom is substantially more tangible than the metaverse was. And Meta has ample money to spend: its advertising business generates strong cash flow, with fourth-quarter 2025 revenue of $59.9 billion, up 24% year over year.

The problem is that you can buy chips, you can buy computing power, you can even buy people to sit at desks. What you can't buy is the people who stay.

Pang Ruoming chose OpenAI, Russ Salakhutdinov chose to leave, and LeCun chose to start a company.

Zuckerberg’s current bet is that, as long as he buys enough chips, builds large enough data centers, and spends enough money, he will eventually find or train people who can use these resources.

This bet could pay off. After all, Meta is one of the world’s wealthiest tech companies, and its over $100 billion in operating cash flow represents its strongest moat. Meta continues to poach talent from OpenAI, Anthropic, Google, and other competitors. According to QbitAI, nearly 40% of the 44 members of Meta’s superintelligence team come from OpenAI.

But the harsh reality of the AI race is that compute resources, talent rosters, and model performance are all public. The Llama 4 benchmark scandal proves that in this industry, you cannot sustain a lead on slide decks and public relations alone.

In the end, the market only recognizes one thing: how good your model is.

Position in the food chain

As the AI arms race enters 2026, the hierarchy of the food chain is taking shape:

At the top are OpenAI and Google. OpenAI has the strongest models, the largest user base, and the most aggressive funding. Google boasts full vertical integration with its own chips, models, and cloud infrastructure. Anthropic follows closely behind, firmly holding its place in the top tier thanks to the product strength of its Claude models and dual cloud infrastructure support from Google and Amazon.

Meta? It has spent the most money, signed the most chip contracts, and undergone the most frequent organizational restructuring—but so far, it has not delivered a cutting-edge model that convinces the market.

Meta’s AI story is somewhat like Yahoo in 2005. Back then, Yahoo was one of the internet’s wealthiest companies, aggressively acquiring and spending heavily, yet it never built a search engine like Google. Money isn’t everything. Zuckerberg needs to clearly define what Meta actually wants to achieve in AI, rather than buying whatever is trending.

Of course, it's still too early to write Meta's obituary. 3.58 billion monthly active users, $59.9 billion in quarterly revenue, and the world's largest social data set are assets that any competitor would struggle to replicate.

If the next-generation model codenamed Avocado is delivered on schedule in 2026 and regains its place among the top tier, Zuckerberg’s massive spending and restructuring will be framed as “bold strategic resolve.” But if it once again falls short, the $135 billion spent will have bought nothing more than rows of powered-on, heat-emitting silicon wafers.

After all, Silicon Valley’s AI arms race has never lacked super buyers waving checks. What’s missing are those who know how to turn this computing power into the future.
