AI Bubble Analysis: Where Are the Risks in the Five-Layer Pyramid?

Author: Block Analytics Ltd x Merkle 3s Capital

We have already answered this question three times.

Is there a bubble in AI?

This has been the most frequently asked question over the past two years, and we’ve written about it more than once. Each time we reach a conclusion, new price surges and crashes force us to revisit it.

This time, we’re not going to give a simple yes or no answer.

The question itself is flawed. AI is not an asset—it’s an entire industrial chain, from semiconductor fabs to power plants, from trillion-dollar giants to newly funded startups. Asking “Is there a bubble in AI?” is as crude as asking “Is there a bubble in real estate?”—can the prime locations of first-tier cities be compared to ghost cities in remote counties?

Applying the same question to all levels will inevitably yield the wrong answer.

The right question is: Which layer is the AI bubble in?

With bubbles, the question is never "if," only "where" and "how thick."

Break down this issue, and you’ll see a counterintuitive picture: everyone is focused on the layer they fear—the very one that’s safest—while the real sources of risk are rarely discussed seriously.

The Ghost of 2000: What’s Different This Time

Talking about the AI bubble inevitably brings us back to 2000. But most people only remember that "the internet bubble burst," without recalling how it actually happened.

The script back then: first create a stock price, then find revenue.

The 2000 crash played out like this: telecom companies took on massive debt to aggressively lay fiber optics, building eight-lane highways through an empty city. The roads were finished—but where were the cars? There were none. Of the fiber optic cables laid back then, 85% to 95% remained "dark"—lying underground, having never transmitted a single bit. Assets appeared on balance sheets, but revenue was zero, while debt was very real. Then—boom.

Fiber optics is just a story about infrastructure. The application layer is even more absurd.

The most famous pet supplies e-commerce company at the time had annual revenue of just a few million dollars in its IPO year, with marketing expenses several times higher than its revenue—it spent heavily on Super Bowl ads, losing money on every single sale, and the more it sold, the faster it lost money. About nine months after going public, it went into liquidation and shut down. This was not an isolated case; it was the standard profile of application-layer companies at the time: zero profitability, surviving on funding, and valuing themselves based on "eyeballs" and "clicks" rather than revenue.

More astonishingly, scholars once found that simply changing a company’s name to include ".com" at the end—without altering any of its operations—could lead to an average surge in its stock price.

The market was paying for the suffix, not for the business.

Look at the "sellers of shovels" back then. Cisco was the NVIDIA of 2000—every internet traffic flow had to pass through its routers, a logically flawless premise. But at the peak of the bubble, Cisco’s P/E ratio soared into the triple digits. What does that mean? It meant the market demanded it sustain its then-current profit levels for over a hundred years, or grow revenues tenfold within just a few years, to justify the valuation. Later, the internet truly transformed the world, and traffic did explode—but it took Cisco more than twenty years to return to its 2000 peak price.
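As a rough sketch of what a triple-digit P/E demands, the two readings above can be written out as arithmetic. The P/E of 100 (the low end of "triple digits") and the target P/E of 10 are illustrative assumptions, not figures taken from Cisco's filings, and the function names are hypothetical.

```python
def earnings_yield(pe_ratio: float) -> float:
    """Annual earnings as a fraction of the price paid."""
    return 1.0 / pe_ratio


def payback_years_flat(pe_ratio: float) -> float:
    """Years of unchanged earnings needed to add up to the purchase price."""
    return pe_ratio


def earnings_multiple_needed(current_pe: float, target_pe: float) -> float:
    """Factor by which earnings must grow for the P/E to fall to target_pe
    while the share price stays where it is."""
    return current_pe / target_pe


# Assumed inputs, for illustration only.
PEAK_PE = 100.0    # low end of "triple digits"
TARGET_PE = 10.0   # hypothetical multiple for a mature company

print(f"Earnings yield at peak: {earnings_yield(PEAK_PE):.1%}")
print(f"Payback at flat earnings: {payback_years_flat(PEAK_PE):.0f} years")
print(f"Earnings growth needed: {earnings_multiple_needed(PEAK_PE, TARGET_PE):.0f}x")
```

Under these assumptions, the buyer either waits a century at flat earnings or needs earnings to grow tenfold with no further rise in the share price.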

Remember this case—it is the most important footnote in the entire text:

The greatest tragedy that year wasn't buying a fake company—it was buying a real company at a hundred times the price.

Current script: Generate revenue first, then increase stock price.

Now let’s cut to 2026.

No GPU sits idle. Every chip produced is plugged into a rack the moment it comes off the line, running tokens at full capacity to generate real cash. It’s not high utilization—it’s 100%. Customers are lining up with money, still unable to buy.

What about the application layer? Take leading large model companies as an example: one major player’s annualized revenue was less than $100 million 18 months ago; today, it’s $45–47 billion, and it has already achieved quarterly profitability. The management originally planned for a 10x growth, but actually achieved an 80x increase.

Compare the leading companies from two eras:

That year: revenue in the millions, losses in the tens of millions, went bankrupt nine months after going public.
Now: revenue up hundreds of times in 18 months, and already turning a profit.

Back then, companies raised money from capital markets with "stories"; today, leading companies collect payments from customers through contracts. This isn't a matter of degree—it's a difference in business model.

Even the "sellers of shovels" have changed their valuation logic. Today, NVIDIA’s P/E ratio is just over thirty times—only a fraction of Cisco’s peak valuation. And this valuation is supported not by speculative future potential, but by confirmed orders already signed and scheduled for production.

Back then, companies first raised stock prices and then tried to find revenue—they ended up dead. Now, companies first generate revenue and then raise their stock prices—they can catch up. The order is different, and so is the outcome.

The buyers have also changed. In 2000, the telecom companies laying fiber were borrowing money; today, the buyers of computing power are Microsoft, Google, Meta, and Amazon—the four companies with the strongest cash flows on Earth, spending their own earned revenue.

In 2000, people bought assets nobody wanted with borrowed money; in 2026, people buy assets they can’t get enough of with earned money—they’re two different species!

But there is a crack in the wall.

At this point, we must hit the brakes.

The story of "free cash flow" is beginning to fray at the margins. This year, the four major cloud providers combined are spending approximately $725 billion on capital expenditures, a staggering 77% year-over-year increase. What does this scale represent? It’s roughly equivalent to the entire annual GDP of a mid-sized developed country, poured into data centers.
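For scale, the two figures above imply a prior-year spending level. The sketch below simply inverts the growth rate; it assumes the 77% increase applies to the combined total, and the function name is hypothetical.

```python
def prior_year_value(current: float, yoy_growth: float) -> float:
    """Back out last year's value from this year's value and the year-over-year growth rate."""
    return current / (1.0 + yoy_growth)


CAPEX_THIS_YEAR_BN = 725.0   # USD billions, from the text
YOY_GROWTH = 0.77            # 77% year over year, from the text

capex_last_year_bn = prior_year_value(CAPEX_THIS_YEAR_BN, YOY_GROWTH)
added_spend_bn = CAPEX_THIS_YEAR_BN - capex_last_year_bn

print(f"Implied prior-year capex: ~${capex_last_year_bn:.0f}B")
print(f"Implied one-year increase: ~${added_spend_bn:.0f}B")
```

On these inputs, the implied prior-year figure is roughly $410 billion, so the increase alone is on the order of $315 billion.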

Even more striking is Amazon: free cash flow has plummeted straight from $26 billion to $1.2 billion, nearly zero, while long-term debt continues to rise. In other words, even these giants are now spending more than they earn and are turning to borrowing.

This is not a sign of a bubble bursting—the balance sheets of the giants remain among the strongest in human business history. But it is the first crack on the wall: the most robust logic of this cycle, "cash flow buyers," is sliding from "fully valid" toward "largely valid."

Worth checking in on each quarter.

To wrap up the 2000 review: the greatest misconception left by that bubble is that everyone remembers "the story was fake" but forgets that what truly killed the market was uncontrolled supply. No matter how compelling the story, if everyone on the supply side can lever up and expand production without limit, oversupply is only a matter of time, and collapse is a mathematical certainty. Conversely, what determines whether this cycle repeats the same mistake is not how compelling the demand-side narrative is, but whether anyone can hit the brakes on supply.

This leads to the next question: Who has the brake pedal this time?

Start with the map, then defuse layer by layer: The Five-Layer Pyramid of AI Computing Power

Before examining each layer individually, first map out the entire industrial chain. The AI computing power industrial chain can be divided into five layers from bottom to top:

Say it again in a table:

This chart has a pattern that is immediately apparent:

The closer to reality, the fewer bubbles; the closer to the story, the more bubbles.

At L0, scaling requires waiting three to five years and investing tens of billions of dollars in factory construction. There’s no room to inflate a bubble because supply simply doesn’t cooperate. The higher you go, the looser the physical constraints and the greater the narrative space: by the long tail of L4, a single PowerPoint presentation can secure funding, and bubbles naturally form there.

The only exception is the L2 interconnection layer—it’s hardware, so it should be protected by physical constraints, yet it has become the most bubble-like area. Why? We’ll break this down specifically later.

The first step in assessing the AI bubble is not to look at market sentiment, but to understand which level of the pyramid you're on.

In this map, the L0 layer dares to be labeled "no bubble" because it is physically locked by two locks. First, let’s explain the locks, then systematically defuse each layer.

First lock: TSMC

Why do we believe this round of AI capital spending won't spiral out of control? The answer lies not on the demand side, but on the supply side.

A bubble bursts only when there is oversupply. Tulips must be planted everywhere, fiber optics laid out with no one to use them, and houses built beyond what can be sold. Without excess supply, there is no crash. The real culprit behind the 2000 disaster wasn’t that the internet story was wrong—it was that fiber optic supply spiraled completely out of control: any telecom company could borrow money to dig trenches and lay cables, and no one could hit the brakes.

Yet, the supply of AI computing power is held by some of the most conservative people in the world.

The Central Bank of the AI Era

TSMC holds over 90% market share in advanced processes, maintaining a lead of approximately 9 to 15 months over Intel and Samsung, with no sign of this gap narrowing at the most advanced 2-nanometer node. This means one thing: global AI chip production is not determined by the market—it’s determined by TSMC.

It’s like the central bank of the AI era—just as the Federal Reserve controls how much money to print, TSMC controls how much computing power to produce. The Fed must hold meetings, vote, and face political pressure to raise interest rates; TSMC, in controlling computing power supply, simply needs to withhold approval on expansion plans.

The governors of this "central bank" are a group of engineers in their seventies who lived through the 2001 and 2008 crashes. They see themselves as guardians of the founders' legacy, having witnessed firsthand how the semiconductor bubble inflated and ultimately buried the entire industry. In their memory, "boom followed by bust" is not a textbook case—it’s the employees they laid off and the production lines they shut down.

So when Jensen Huang came knocking, demanding that production be doubled or even tripled—they refused.

Think about how counterintuitive this is: the hottest company on Earth, with unlimited orders and cash on hand, comes to you begging to expand production—and you say no. Only one company in the world can say “no” to this—and only one company has the power to make it stick.

One detail worth noting: Jensen Huang and TSMC have collaborated for over thirty years without ever signing a single formal purchase contract; everything has been based on a handshake. This isn’t a management loophole; it’s a system built on three decades of trust, which is why TSMC can say “no” to its largest customer, and that customer can only accept it.

How tight is this lock?

In numbers:

The most advanced 2-nanometer process has sold out entirely, with no capacity remaining through the end of this year.
Kaohsiung is simultaneously building five 2-nanometer wafer fabs—the largest parallel construction of advanced process facilities in human history—but it takes three to five years from groundbreaking to mass production, with initial investments exceeding $20 billion.
Even with this intense construction, by 2030, the monthly demand for 2-nanometer chips is projected to be 400,000–450,000 wafers, while capacity will only reach 300,000–350,000 wafers—resulting in a long-term shortfall of 100,000–150,000 wafers per month, equivalent to one-quarter to one-third of demand that will never be met.
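The shortfall arithmetic in the last item can be checked by pairing the endpoints of the two ranges. This is a minimal sketch using only the demand and capacity figures quoted above; how the endpoints pair up is an assumption, and the helper name is hypothetical.

```python
from itertools import product

DEMAND = (400_000, 450_000)     # projected 2030 demand, wafers per month (from the text)
CAPACITY = (300_000, 350_000)   # projected 2030 capacity, wafers per month (from the text)


def shortfall(demand: int, capacity: int) -> tuple[int, float]:
    """Return the unmet wafers per month and that gap as a share of demand."""
    gap = demand - capacity
    return gap, gap / demand


# Evaluate every pairing of a demand endpoint with a capacity endpoint.
scenarios = {(d, c): shortfall(d, c) for d, c in product(DEMAND, CAPACITY)}

for (d, c), (gap, share) in scenarios.items():
    print(f"demand {d:,} / capacity {c:,}: gap {gap:,} wafers ({share:.1%} of demand)")
```

A gap of 100,000 against demand of 400,000 is one-quarter, and 150,000 against 450,000 is one-third, which matches the range quoted. The most favorable pairing (low demand, high capacity) would leave a smaller gap of 50,000 wafers per month.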

Another more hidden bottleneck is advanced packaging. Even after chips are manufactured, they are only half-finished products—they must be packaged together with memory to be functional. This is the "last mile" of AI chips, and this pathway is also largely controlled by TSMC, with capacity consistently falling short of demand.

If TSMC were to fully open its capacity, NVIDIA could theoretically ship $2 to $3 trillion worth of GPUs per year—this figure is nearly ten times the current actual shipment volume. It is TSMC that has constrained this number.

All of the world’s AI ambitions must line up before TSMC’s production capacity.

This lock can also be picked.

In fairness, the downside needs stating too. This lock is not permanent. There is a scenario in which it breaks: if someone, whether a maverick like Musk or a desperate Intel, bypasses TSMC with support from equipment vendors and builds its own cluster of super fabs to break the monopoly on advanced capacity, then the discipline around expansion will collapse.

By then, every chip manufacturer will rush to expand capacity as wildly as telecom companies did in 2000, and only then will the engine of oversupply truly ignite.

The good news is: the physical timeline for building a factory is fixed, and this scenario is unlikely to unfold before 2027. The bad news is: once this scenario begins, there will be no advance warning.

Bubbles require uncontrolled supply. But the faucet for AI’s supply is in the hands of elderly men who have witnessed two crashes and turned down Jensen Huang!

Second lock: Electricity

Even if TSMC decided tomorrow to massively ramp up production, the chips would still need somewhere to be installed.

This is the second lock: electricity and land.

Many people think the bottleneck in AI infrastructure is chips, but what's truly holding things back right now are more basic issues—land approval for data centers and grid connectivity.

The absurdity lies in the mismatch of time scales: designing a chip takes two years; building a data center takes two to three years; but supplying a data center with sufficient power—building new power plants, expanding substations, laying high-voltage transmission lines, and completing environmental assessments and approvals—typically takes five years or more. Chips evolve by nanometers; the grid is planned by decades.

Chips iterate monthly, while power grids evolve over decades—this is the greatest timing gap in the AI era.

So you’ll see a strange sight: tech giants with hundreds of billions in budgets scouring the globe for “land with power,” much like prospectors searching for water—buying land next to nuclear plants, signing 20-year power purchase agreements, or even funding the restart of decommissioned nuclear reactors. Money isn’t the issue; electricity is.

The power shortage is expected to gradually ease by 2027–2028—the construction timelines for power plants and grids determine this schedule, and no amount of additional funding can significantly shorten it.

When two locks are stacked together, the effect is that AI computing power growth has been forcibly "flattened." Demand wants to explode, but supply can only climb gradually. As a result, growth becomes slower—but also longer-lasting and more stable—something that historical technological revolutions like railroads, canals, and the internet never experienced. In those cases, supply first spiraled out of control, then collapsed.

Throughout history, every technological revolution has died from supply out of control. AI is the first to be forcibly restrained by the laws of physics—this is its greatest luck.

A variable from space

One long-term variable to flag here: space-based data centers.

The logic is science-fictional yet rigorous—in sun-synchronous orbit, solar energy is infinite and free; the satellite’s shadowed side faces the deep vacuum of space at over minus 200 degrees Celsius, making heat dissipation nearly cost-free. The envisioned design: solar panels at the front, standard server racks in the middle, and a hundreds-of-meters-long radiator extending from the rear. Multiple satellites are interconnected via lasers, forming a virtual data center floating in orbit.

The two most expensive things in ground-based data centers—electricity and cooling—are free in space.

Timeline: A proof of concept may be seen within two years, and around 2030, the investment logic for ground-based data centers could begin to shift.

Remember this variable. It doesn't change anything yet, but it hangs like a sword over the entire L3 infrastructure layer, and we will return to it shortly.

Where the real bubble lies: systematically defusing layers of the pyramid

We’ve covered the two locks; now let’s return to the five-layer map and go through each layer from bottom to top.

L0 + Application-Layer Leaders: Large Caps, Expensive but Not a Bubble

Microsoft, Google, Meta, Amazon, NVIDIA. The capital expenditures at this level correspond to real contracts, real revenue, and full utilization rates.

Just two numbers are enough.

The first: AWS’s signed but unexecuted backlog reached $360–370 billion in the first quarter, a year-over-year increase of over 90%—not including an additional $100 billion in commitments later added by a leading AI lab. What does this mean? It means that even if AWS signed not a single new customer starting today, its existing signed contracts would keep it busy for years. These are not projections; they are signed contracts.

Second: The leading large model company mentioned earlier—revenue grew from under $100 million to over $45 billion in 18 months, and it has already become profitable on a quarterly basis. This growth rate has no parallel in the history of human commerce.

There’s another calculation few people make: the economics of inference. Training a state-of-the-art model is pure expenditure—burning through cash without hesitation. But once trained, every invocation and every token generated becomes revenue. According to current industry estimates, the total inference revenue potential over a model’s lifetime is roughly 5 to 10 times its pre-training cost. In other words, today’s astronomical capital expenditures aren’t buying a one-time product—the model—but rather a decades-long “toll booth” for computational power.

The toll booth model has one characteristic: upfront costs are staggering, but later cash flow is overwhelming. Highways, power grids, and telecom networks all work this way—provided there are actually vehicles on the road. And we’ve already confirmed: not a single GPU is idle; every lane is fully occupied.

Is it expensive? Yes. Is it a bubble? A bubble is defined as a price detached from fundamentals, but the underlying fundamentals here are catching up to the price at a rate of 80 times every 18 months.

Back then, valuation stood still waiting for revenue to catch up—until the company went bankrupt. Now, revenue is chasing valuation—and it’s catching up.

Buyers at this level aren’t betting on a story. They have no choice but to buy computing power to fulfill existing contracts; this is capital expenditure driven by demand, not by illusion.

L1 Memory Layer: Long-Short Kill Zone

One level up: memory chips. This is now the most fiercely contested battleground between bulls and bears.

First, let’s explain why this layer is important. If the GPU is the chef, then memory—especially high-bandwidth memory (HBM)—is the prep station: no matter how fast the chef chops, if the ingredients aren’t delivered in time, nothing gets cooked. And AI inference is an activity that desperately demands fast “ingredient delivery”: the larger the model and the longer the conversation, the faster the demand for memory bandwidth grows compared to the demand for computational power.

Current situation: Memory prices have risen 60–70% over the past year, and Micron’s profit margin has surged from a historical average of 16% to 70%.

Look at how alarming this number is when viewed in historical context: Over the past twenty-five years, the memory industry has been notorious for its “pig cycle”—prices rise, companies wildly expand production, oversupply follows, prices crash, and everyone suffers losses, repeating the cycle endlessly. Every time profit margins reach this level, it’s been followed by a funeral. According to the old script, it’s time to liquidate and exit now.

But the bulls' argument is that this demand isn't about restocking—it's structural. Demand for HBM from AI inference will continue to rise steadily, and memory manufacturers, having been burned by cycles for twenty-five years, are expanding capacity with extreme caution—no one wants to be the one who crashes the prices.

There is a structural shift worth highlighting separately: after twenty-five years of bloody consolidation, the global high-end memory market has been reduced to just three players. In the 1990s, there were more than twenty manufacturers in this industry, and price wars spiraled out of control; today, these three oligopolists watch each other’s expansion plans across the Pacific, none willing to move first. The oligopoly structure inherently enforces production discipline—this is the most solid structural reason why “this expansion won’t get out of control,” more reliable than any management statement.

Moreover, HBM is quietly consuming production capacity meant for conventional memory: the same production line yields far fewer wafers when dedicated to HBM than when used for standard memory. As demand for HBM surges, supply of conventional memory tightens, pushing prices upward across the entire industry—this is why the price of ordinary memory modules in your computer is also rising.

An even more significant figure: Currently, only about 0.1% of the global population is using AI correctly. If this number rises to 5%—transitioning from a "geek toy" to a "daily tool for average office workers"—the ceiling for memory demand will soar beyond the clouds.

The short sellers' logic is equally solid: the current price rise is self-reinforcing rather than driven by end demand. Hoarding, reluctance to sell, and buying only when prices rise are classic signs of a supply-demand imbalance, not of healthy demand.

A 70% profit margin is either the beginning of a new era or the climax of an old script. Bulls are betting that "this time is different," and those four words happen to be the most expensive in investment history.

We won’t draw conclusions at this level. It’s a gambling table, not a bubble—there are real chips on both sides.

L2 Interconnection Layer: Optical Modules, Where the Scent of a Bubble Begins

Finally, we’ve reached the part we most want to emphasize: the only “hardware exception” on that map.

In thirty seconds: An AI data center contains tens of thousands of GPUs that must constantly exchange data and work together on the same model—the volume of communication between chips is so massive that copper wires can’t handle it. That’s why electrical signals must be converted into optical signals and transmitted via fiber optics. The small device responsible for converting electricity to light and back again is called an optical module.

GPUs are the muscles; optical modules are the blood vessels. As cluster scale increases, the interconnection demand between chips grows quadratically—so the hotter AI becomes, the more frenzied the demand for optical modules. This industry logic is real: the entire optical module market is expected to grow nearly 60% this year, and production capacity is already sold out through 2028.
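To make "grows quadratically" concrete, here is a minimal sketch that counts the distinct GPU pairs in a cluster. It assumes an idealized all-to-all communication pattern; real clusters route traffic through switched network topologies rather than wiring every pair directly, so this illustrates the scaling trend, not an actual link count. The function name is hypothetical.

```python
def pairwise_links(n_gpus: int) -> int:
    """Number of distinct GPU pairs in a cluster: n * (n - 1) / 2."""
    return n_gpus * (n_gpus - 1) // 2


small = pairwise_links(10_000)
large = pairwise_links(20_000)

print(f"10,000 GPUs: {small:,} pairs")
print(f"20,000 GPUs: {large:,} pairs")
print(f"Doubling the cluster multiplies the pairs by about {large / small:.1f}")
```

Doubling the GPU count roughly quadruples the number of pairs that may need to talk to each other, which is why interconnect demand can outpace GPU shipments.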

The logic is sound. But let’s look at what the stock prices did, one by one.

First: Lumentum—the direct product of the last bubble and the leader of this one.

This company specializes in lasers and optical components—in other words, the core "light source" within optical modules and optical communication systems. Its origins are particularly intriguing: its predecessor was one of the most famous stocks during the 2000 optical communications bubble, with a market capitalization that once soared to hundreds of billions of dollars. After the bubble burst, its value plummeted by 99%, becoming a textbook example of an infrastructure bubble. Lumentum emerged as a spin-off from that company.

Over the intervening two decades, it led a quiet life: supplying lasers for iPhone facial recognition and components for telecommunications networks, functioning as a typical "solid but boring" hardware company.

Then AI arrived. Data centers demanded massive quantities of high-speed lasers, and a new generation of technology—integrating optical pathways directly into switches—pushed it back into the spotlight, even drawing a $2 billion investment from NVIDIA. As a result: over the past 12 months, the stock price has risen more than tenfold.

Is the business improving? Yes, it truly is. Orders are booked through 2028—that’s concrete. But consider these two numbers together: its projected revenue growth is tens of percent annually over the coming years, yet its stock price has risen over 1,000% in just one year. The market is pricing it at dozens of times its annual revenue—while a mature hardware company typically trades at three to five times its revenue.
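One way to read these two numbers together is to split the price move into revenue growth and valuation-multiple expansion. This is a sketch under stated assumptions: a tenfold price rise (the text says "more than tenfold"), a hypothetical 40% revenue growth rate standing in for "tens of percent," and an unchanged share count. The function name is hypothetical.

```python
def multiple_expansion(price_multiple: float, revenue_growth: float) -> float:
    """Split a price move into revenue growth and multiple change.

    price change = (1 + revenue growth) x (change in price-to-sales multiple),
    so the multiple change is price_multiple / (1 + revenue_growth).
    Assumes the share count is unchanged.
    """
    return price_multiple / (1.0 + revenue_growth)


PRICE_MULTIPLE = 10.0    # stock up roughly tenfold in 12 months (from the text)
REVENUE_GROWTH = 0.40    # hypothetical stand-in for "tens of percent"

expansion = multiple_expansion(PRICE_MULTIPLE, REVENUE_GROWTH)
print(f"Revenue explains {1 + REVENUE_GROWTH:.1f}x; the valuation multiple explains {expansion:.1f}x")
```

Under these assumptions, most of the move comes from investors paying a higher multiple for each dollar of revenue, not from the revenue itself.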

The epicenter of the last bubble's burst was light; the place where this bubble smells strongest is still light. History doesn't repeat itself, but it does rhyme.

Second: AAOI—Someone who fell once, now standing again on the same cliff.

The company manufactures optical transceiver modules, primarily selling them to cloud providers' data centers. Its history is equally intriguing: during the previous wave of data center construction (around 2017), it was once a top-performing stock—until its largest customer abruptly canceled orders and switched to other suppliers, causing the stock price to plummet by 90% over the next two years. For the following seven or eight years, it struggled on the brink of losses.

Then AI arrived, triggering a surge in demand for next-generation high-speed optical modules, and old customers returned. As a result, the stock price more than quadrupled within the year.

Note the difference between this company and Lumentum: Lumentum is at least an industry leader with a technological moat and NVIDIA’s backing; AAOI is a second-tier manufacturer that has been unprofitable for most of the past decade, heavily reliant on a few customers, and has already been burned by order cancellations in its previous cycle. Its surge is almost entirely due to the rising tide of the sector.

The tide has already begun to shift. Last month, this sector experienced double-digit daily declines more than once: AAOI dropped over 10% in a single day, and the sector leaders followed with drops of 7% to 10%. There was no fundamental negative catalyst; it was simply profit-taking as positions at elevated levels started to unwind.

Another layer of risk, rarely discussed: the technology roadmap itself.

The industry is currently undergoing an architectural revolution: integrating optical components directly into chip packaging, rather than leaving them as standalone modules plugged into switches, a shift known in the industry as co-packaged optics. If this approach becomes mainstream, it will mean two things: first, the "optical module" as a standalone product form will gradually be absorbed, with control shifting from module manufacturers to major chip companies; second, value along the supply chain will concentrate around the core light source, squeezing profits out of the assembly stage.

This technological shift presents greater opportunities than risks for companies like Lumentum, which control laser technology—light sources will always be needed, and they’re now more valuable than ever. But for module manufacturers like AAOI, whose strength lies in assembly, it’s a second blade hanging over their heads. Ironically, the market currently values both types of companies with nearly equal enthusiasm—when the tide is high, no one checks whether anyone is even wearing swimwear.

In the same sector, some are selling irreplaceable light sources, while others are selling boxes that could easily be bypassed by an architectural revolution, yet their stock price gains show no distinction. This is itself a hallmark of a bubble.

Sum up the numbers for this layer: demand growth has approached 60%, while the stock price has risen four to tenfold. What accounts for the gap? The market has discounted 2028 revenues into the 2026 stock price.
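The gap can be expressed as years of growth already priced in. The sketch below assumes demand keeps compounding at 60% a year and that the valuation multiple stays constant, so the price simply tracks revenue; both are simplifying assumptions, and the function name is hypothetical.

```python
import math


def years_of_growth_priced_in(price_multiple: float, annual_growth: float) -> float:
    """Years of compounding at annual_growth needed for revenue to rise by
    the same factor the stock price already has."""
    return math.log(price_multiple) / math.log(1.0 + annual_growth)


GROWTH = 0.60   # sector demand growth of roughly 60% (from the text)

for price_multiple in (4.0, 10.0):   # stock price gains quoted in the text
    years = years_of_growth_priced_in(price_multiple, GROWTH)
    print(f"{price_multiple:.0f}x price rise = {years:.1f} years of 60% growth")
```

On these assumptions, a fourfold rise already reflects about three years of uninterrupted 60% growth, and a tenfold rise close to five.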

The correct narrative, combined with excessive pricing—that’s the standard form of a bubble. Not fake, just priced so high that it leaves no room for error in the future.

Why is it precisely this layer that bubbles? The pattern on that map makes it clear: optical modules are the segment in the entire hardware chain with the lowest physical barriers to entry. Building a wafer fab requires tens of billions of dollars and five years; expanding an optical module production line needs only hundreds of millions and a few quarters—it’s the only hardware segment where supply can “keep up” with hype. With supply unable to be constrained, there’s ample room for a bubble to grow.

TSMC's control cannot protect the optical module—because the production capacity of optical modules is the only link in the entire chain that doesn't require TSMC's approval.

Repeated single-day double-digit drops indicate that smart money has already begun lining up at the door.

L3 Infrastructure Layer: GPU Cloud Sublandlord—Alive, but Surviving on Others’ Bottlenecks

Over the past two years, a new breed of cloud providers has emerged, specializing in GPU leasing: they buy GPUs themselves, build their own data centers, and rent out the computing power to companies lacking GPUs—industry insiders call them NeoClouds; we prefer to refer to them as "GPU sublandlords."

They perform exceptionally well and truly have the expertise: these people squeeze hardware like F1 drivers pushing race cars, achieving GPU utilization rates two to three times higher than those of traditional secondary suppliers. With the same batch of cards, they generate more revenue.

The logic also holds: the four major cloud providers simply don’t have enough capacity of their own, and the overflow demand must be absorbed by someone. As long as the overarching premise of “computing power shortage” remains, there’s business for sublandlords.

But keep in mind the nature of this business: they are beneficiaries of bottlenecks, not holders of moats.

Think clearly about their situation: every dollar they earn fundamentally comes from the timing gap caused by major tech companies’ production expansion lagging behind demand. But—power bottlenecks are expected to ease by 2027–2028; major companies are building their own data centers at the fastest pace in human history; and that earlier hint about space-based data centers, if realized in the 2030s, would undermine the very logic of scarcity in ground-based computing power.

The time gap will close. The sublandlord does not hold the property title, only a lease agreement with an unknown expiration date.

Moreover, this business has a structural weakness: extreme concentration of customers and critical supply chains. Their cards come from the same chip giant, major clients are often just two or three AI companies, and in some cases, the largest shareholder and largest supplier share the same name. The upstream controls your supply, the downstream controls your revenue, and you, in the middle, earn money from arbitraging time differences—this kind of business can be highly profitable, but it doesn’t warrant a “platform” valuation.

If you're profiting from someone else's bottleneck, be prepared for the day that bottleneck disappears.

This layer is not a scam; today's cash flow is real. But the market's current high valuation prices a temporary state as if it were permanent. That is a mispricing, and it is heading toward a bubble.

Layer 4 application layer long tail + VC ecosystem: The area with the strongest bubble signals

Finally, we reach the top of the pyramid. This layer should be viewed in two parts.

The top half—the few large model companies with genuine revenue—was mentioned earlier; their revenue keeps pace with their valuation, so we won’t elaborate further.

The real issue lies in the long tail and the VC ecosystem that fuels it. The most striking numbers are here:

In the first quarter of this year, AI companies captured the vast majority of global venture capital—more than $8 out of every $10 in VC funding went to AI.

In 1999, during the peak of the dot-com bubble, what was the ratio? Approximately one-third to two-fifths.

In other words, today’s concentration of venture capital in a single theme is roughly twice what it was at the peak of the largest bubble in human history.
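The "twice as high" comparison can be checked with a few lines of arithmetic. This is a minimal sketch that treats "more than $8 out of every $10" as a floor of 80% and uses the one-third to two-fifths range quoted for 1999; the function and variable names are ours, purely for illustration.

```python
def concentration_ratio(share_now: float, share_then: float) -> float:
    """How many times larger today's single-theme share is than the earlier one."""
    return share_now / share_then


# Shares taken from the text; 0.80 is a floor ("more than $8 out of every $10").
ai_share_now = 0.80
dotcom_share_low = 1 / 3   # lower end of the 1999 estimate
dotcom_share_high = 2 / 5  # upper end of the 1999 estimate

ratio_low = concentration_ratio(ai_share_now, dotcom_share_high)
ratio_high = concentration_ratio(ai_share_now, dotcom_share_low)

print(f"Today's concentration is {ratio_low:.1f}x to {ratio_high:.1f}x the 1999 peak")
```

Depending on which end of the 1999 range is used, the multiple lands between about 2.0 and 2.4, which is where "twice" comes from.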

Moreover, the structure is extremely lopsided: just four top-tier deals accounted for 65% of global venture capital funding for the quarter. Roughly two-thirds of the world’s quarterly venture capital went into the accounts of just four companies.

This has created a chain reaction: leading flagship companies are using real revenue to justify sky-high valuations—this is fine; but thousands of tail-end startups with no revenue are borrowing the valuation logic of the leaders to price themselves: “That company grew 80-fold in 18 months—why can’t I?”—and that’s the big problem. The 1999 version of “add a .com and your value spikes” has now become “add an AI agent and your valuation doubles.”

Worse still, the fate of these long-tail companies can already be foreseen. They won’t die from product failure—the products may even be solid. Instead, they’ll die from valuation inversion: the funds raised at bubble prices in their last round have been burned through, and new investors are only willing to pay realistic valuations. But financing at a realistic valuation means the previous investors suffer massive losses and the founding team’s equity is wiped out—leading to a breakdown in negotiations, as the company is stuck between preserving “valuation dignity” and simply surviving, until its cash runs out. Most of the companies from 1999 died this way: not killed by the market, but choked by their own previous valuations.

Another amplifier: The cost structure of today’s long-tail companies is more fragile than it was in 1999. Back then, internet startups burned cash on marketing expenses—cutting ads could still keep them afloat. Today’s AI startups burn cash on compute bills—without model inference, their products grind to a halt, and this cost cannot be cut. Revenue is a story; costs are rigid. This combination will lead to a faster demise during a capital retreat than the previous cycle.
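To make the "rigid cost" point concrete, here is a rough runway sketch. All figures are hypothetical (in thousands of dollars) and chosen only to illustrate the mechanism; revenue is ignored, and the function name and parameters are our own assumptions, not data from any company.

```python
def runway_months(cash: float, rigid_monthly_cost: float,
                  cuttable_monthly_cost: float, cut_fraction: float) -> float:
    """Months of cash left after cutting `cut_fraction` of the cuttable cost."""
    if not 0.0 <= cut_fraction <= 1.0:
        raise ValueError("cut_fraction must be between 0 and 1")
    monthly_burn = rigid_monthly_cost + cuttable_monthly_cost * (1.0 - cut_fraction)
    return cash / monthly_burn


# Hypothetical: both startups hold the same cash and burn the same amount today.
cash = 12_000

# 1999-style startup: most of the burn is advertising, which can be cut to zero.
dotcom_runway = runway_months(cash, rigid_monthly_cost=400,
                              cuttable_monthly_cost=600, cut_fraction=1.0)

# AI startup: the burn is inference compute, which cannot be cut while the product runs.
ai_runway = runway_months(cash, rigid_monthly_cost=1_000,
                          cuttable_monthly_cost=0, cut_fraction=1.0)

print(f"1999-style startup: {dotcom_runway:.0f} months; AI startup: {ai_runway:.0f} months")
```

Under these made-up numbers, the startup that can switch off its ads survives 30 months, while the one tied to its compute bill survives 12.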

Note that this does not contradict the idea that "large caps have no bubble"—

The top players have real revenue backing them; the long tail has only stories to rely on. Bubbles never exist in the largest companies—they exist in smaller companies that price themselves using the valuation logic of the largest companies.

Do you remember what the real lesson of 1999 was? It wasn’t “the internet is fake”—the internet is real, e-commerce is real, and the biggest e-commerce company survived and went on to dominate the world. The lesson was:

In a real technological revolution, you can still lose all your money—if you invest in the wrong layer.

Shorts aren't entirely wrong: two attack lines worth reflecting on before bed

If you’ve gotten this far and think we’re blind bulls, keep reading. There’s real substance on the bear side—and this time, that substance is sharper than most bulls are willing to admit.

Short sellers have two main fronts. On the surface, they appear to be two separate issues, but dig deeper, and you'll find they are two sides of the same problem.

Attack Line One: The Depreciation War—How Many Years Can You Really Expect Your GPU to Last?

Let’s start with a simple, everyday example to explain "depreciation."

Suppose you drive for a ride-hailing service and spent 300,000 to buy a car. If you depreciate the vehicle over 3 years, your annual cost is 100,000; if you depreciate it over 6 years, your annual cost drops to 50,000. Note: You haven’t earned a single additional dollar, and the car is still the same—only the accounting assumption has changed, yet your reported profit increases by 50,000 per year out of thin air.
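The arithmetic can be written out in a few lines. This sketch assumes straight-line depreciation with zero salvage value, which is what the figures above imply; the function name is ours.

```python
def annual_depreciation(cost: float, useful_life_years: int) -> float:
    """Straight-line depreciation per year, assuming zero salvage value."""
    if useful_life_years <= 0:
        raise ValueError("useful_life_years must be positive")
    return cost / useful_life_years


car_cost = 300_000

expense_3y = annual_depreciation(car_cost, 3)  # depreciate over 3 years
expense_6y = annual_depreciation(car_cost, 6)  # depreciate over 6 years

# Same car, same fares: the only change is the assumed useful life.
reported_profit_uplift = expense_3y - expense_6y

print(f"3-year schedule: {expense_3y:,.0f} per year")
print(f"6-year schedule: {expense_6y:,.0f} per year")
print(f"Extra reported profit per year: {reported_profit_uplift:,.0f}")
```

The uplift is not free: the total written off over the car's life is still 300,000, it is just pushed into later years.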

Now replace the car with a GPU, and replace 300,000 with hundreds of billions of dollars.

Tech giants are collectively doing the same thing: extending the depreciation period for GPUs. Previously standardized at 3–4 years, they are now extending it to 5 or even 6 years. Extending depreciation by just one year significantly improves reported profits. Short-sellers estimate that, under this change, the entire industry could avoid recording over $100 billion in depreciation over the next three years, potentially inflating the current profits of major players by more than 20%.

What does twenty percent mean? It means that one-fifth of the profit you see in the financial statements may be a gift from accounting assumptions, not earned by the business itself.

The bulls' counterargument also makes sense: the depreciation period wasn't changed arbitrarily. Training cutting-edge models requires the latest cards, but in inference scenarios older GPUs are still fully capable: three-year-old GPUs used for routine inference still run at full capacity and remain profitable. By this logic, it’s not unreasonable to expect GPUs to last 5 or even 6 years; depreciating them over just three years may have understated their useful life.

Who is right? The honest answer is: it depends on NVIDIA itself. The more dramatic the performance leap in the next two generations, the faster older cards depreciate, and the more correct the shorts become; the more gradual the leap, the longer older cards remain viable, and the more correct the longs become. With every new product release, NVIDIA is effectively casting a vote on its customers’ balance sheets.

This is the most ironic moment in AI finance: the more successful NVIDIA’s products are, the more suspicious its customers’ financial statements become.

Attack Line Two: GPU Credit—Moving Debt Out of Sight

The second attack line is newer and stealthier. Few people discuss it publicly, but we believe it is an order of magnitude more severe than the depreciation issue.

GPUs have already begun circulating through complex off-balance-sheet structures. Breaking it down, this structure operates as follows:

Set up a shell: Establish a special purpose vehicle (SPV)—a shell company whose sole purpose is to hold GPUs.
Shell borrows money: The shell company takes out a loan from a private credit fund to purchase tens of thousands of GPUs.
Shell rents out the cards: The shell company leases the GPUs to AI companies on long-term contracts, collects rent, and uses the rent to repay the loan.
The card sellers join in: The best part is this step—the chip manufacturers themselves invest money into the shell company, becoming anchor investors.

Each party gets what it wants: the AI company obtains the chips without taking on debt; neither the tech giant nor the AI company records this liability on its balance sheet; the chip manufacturer secures sales volume and earns additional investment returns; and the private credit fund acquires a high-yield asset.

Four-way win. There's just one small issue: the debt hasn't disappeared—it's just invisible.

This structure should remind you of something. In fact, it resonates with two historical periods at once.

The first period is 2000. Few remember that a key enabler of the telecom bubble was "vendor financing": equipment giants lent money directly to their customers so those customers could buy their equipment. On paper, sales soared and growth curves looked perfect, but in reality money was just moving from one hand to the other: customers used the vendor’s own funds to purchase its products. When the bubble burst, these equipment vendors held not profits but a pile of unrecoverable receivables, and they fell harder than anyone else. Today’s structure, where chip manufacturers invest money into shell companies that then use that money to buy chips, is a direct cousin of that same vendor financing model.

The second period is 2008. The last time the entire financial system was obsessed with "packaging, layering, and moving risk to places where regulators and investors couldn’t see it" was during the pre-crisis securitization of mortgages. Back then, houses were packaged; now, it’s GPUs.

When an industry starts paying its own customers to buy its products, every growth figure you see should be questioned.

Depreciation is an accounting issue, and accounting issues have never burst bubbles; leverage is a financial issue, and historically, every bubble has been burst by financial issues.

The two lines are actually one line.

Now, connect the two attack lines, and you'll see the true killing power of the bearish logic.

The essence of the depreciation dispute is: How many years can a GPU be used, and what is its residual value?

What is the collateral for GPU credit? It is the residual value of the GPU.

In other words, the basis on which shell companies borrow tens of billions of dollars is the assumption that "this batch of GPUs will remain valuable and continue generating rental income for many years to come." If NVIDIA’s next-generation products make another leap forward, the rental value of older cards could plummet. The first to collapse would not be the giants (who can withstand the shock), but these shell companies and the private credit funds that lent to them.
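A toy model shows how sensitive this collateral is to the pace of obsolescence. It is only a sketch under loud assumptions: the loan is repaid in equal annual installments of principal, the market value of the GPUs falls by a fixed percentage each year, and every number (100 of GPU cost, 80% financed, 5-year term, 20% versus 50% annual decay) is hypothetical rather than drawn from any real deal.

```python
def loan_to_value_path(gpu_cost: float, loan_fraction: float,
                       loan_term_years: int, annual_value_decay: float):
    """Return (year, loan_balance, collateral_value, ltv) for each year of the loan."""
    loan = gpu_cost * loan_fraction
    path = []
    for year in range(loan_term_years + 1):
        # Assumption: principal is repaid in equal annual installments.
        balance = loan * (1 - year / loan_term_years)
        # Assumption: GPU market value decays by a fixed fraction per year.
        collateral = gpu_cost * (1 - annual_value_decay) ** year
        path.append((year, balance, collateral, balance / collateral))
    return path


def peak_ltv(path) -> float:
    """Highest loan-to-value ratio reached over the life of the loan."""
    return max(ltv for _, _, _, ltv in path)


# Hypothetical scenarios: gradual generational progress vs. a sharp performance leap.
gentle = loan_to_value_path(100.0, 0.8, 5, annual_value_decay=0.20)
steep = loan_to_value_path(100.0, 0.8, 5, annual_value_decay=0.50)

for name, path in (("gentle", gentle), ("steep", steep)):
    print(f"{name}: peak loan-to-value = {peak_ltv(path):.2f}")
```

With gradual decay, the loan never exceeds the value of the cards. With steep decay, the debt outstanding rises to more than twice the collateral value within a few years, and the lender is the one left holding the gap.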

Then the question you’d ask becomes: How much has private credit expanded over the past few years, and how much else has been stuffed into it? That’s a topic for another article.

The current scale of this structure is still small and far from being systemically concerning—that’s the truth. But even the most ardent bulls have listed “large-scale leverage via GPU collateral financing” as the top risk signal of this cycle. When both bulls and bears unusually point to the same place and say, “Look there,” it’s a sign you should pay close attention.

The moment GPUs were stuffed into off-balance-sheet shell companies, 2026 took on its first faint whiff of 2008. It’s still just a hint; watch how quickly it intensifies.

Conclusion: Expensive, but the door is still locked.

Condense the entire text into one image—the same pyramid:

No bubble (L0 + L4 leaders): TSMC, NVIDIA, the four major cloud providers, and leading large model companies. Real contracts, real revenue, full utilization rates, plus the two physical locks of TSMC and the power grid. Expensive, but expensive does not equal a bubble.

Long-Short Squeeze (L1): Memory. A 70% profit margin is either the beginning of a new structural cycle or the climax of the old script—the table is set.

Bubble-like (L2, L3, L4 long tail): Optical modules—the only link in the entire hardware chain not protected by TSMC’s capacity discipline, pricing 2026 based on 2028 revenue; GPU sublandlords—mistaking temporary bottlenecks for permanent moats; VC ecosystem—concentration in a single theme reaching twice the peak level of 1999, with long-tail startups using top-tier valuation logic to price their narratives.

Three potential red flags to watch for:

The algorithmic efficiency revolution. If one day, smarter algorithms achieve the same results with just one-tenth of the computing power, the entire capital expenditure logic of "throwing compute at the problem" could collapse overnight. This is the least likely scenario—but the most devastating.
GPU credit leverage. Once off-balance-sheet structures, collateralized financing, and securitization are rolled out, cash flow buyers become leveraged buyers, and the 2000 script reenacts itself with the 2008 engine. This is currently the most realistic emerging trend.
TSMC abandons conservatism. Whether because competitors break its monopoly or because it decides on its own to expand production aggressively, the moment supply spirals out of control, the essential condition for a bubble is truly met. This is the most critical factor to monitor over the long term.

Until one of these three things happens, AI remains a technological revolution whose pace is forcibly constrained by physical limits: expensive, crowded, locally overheated, but with a solid foundation.

Finally, turn this map into three portable questions. Next time you see any AI-related asset—whether a stock or a startup—ask:

First question: Which level of the pyramid is it on? The closer to the physical, the more solid; the closer to the story, the more dangerous. If you can’t clearly define its level, assume it’s on the most dangerous one.

Second question: Is its valuation backed by revenue it genuinely generates, or is it borrowed from the valuations of top companies? The frequency of the phrase “comparable to [X company]” is directly proportional to the level of speculation.

Third question: Is it making money from structure or from a bottleneck? Money from structure can be earned for many years; money from a bottleneck has an expiration date—and that expiration date is usually much shorter than the time implied by its valuation.

Answer all three questions first, then we can discuss the price.

Bubbles never notify you at which level they burst. But you can at least choose not to stand on the level where your value is determined by others' stories.

Next time someone asks you, "Is AI a bubble?" you can ask them back: "Which layer are you referring to?"

The group of seventy-something engineers at TSMC may be the only ones on this planet capable of stopping the AI bubble. So far, they are still on the job.