Vertical AI startups navigate survival amid dominance of general-purpose models

Intelligence begins to grow nonlinearly, and the fundamental logic of AI companies is being rewritten.

Author and source: GeekPark

Investors put the probability that an AI startup will fail by 2026 at 90%.

In April, Yupp, an AI model evaluation platform backed by a16z with a $33 million seed round, suddenly announced its shutdown. Despite endorsements from prominent Silicon Valley figures including Google’s Chief Scientist Jeff Dean and Twitter co-founder Biz Stone, and having attracted 1.3 million users in less than a year since launch, the founders abruptly decided to close the platform. Although substantial funds remained on the books, the founders saw no path forward. “Over the past year alone, the landscape of AI model capabilities has changed dramatically; the future is not just about models, but about agent systems,” wrote Yupp’s founder, Pankaj Gupta, in his farewell blog post.

During the same period, the AI image company NeuroPixel shut down due to the sudden leap in capabilities of large models like Google NanoBanana Pro. The founder of NeuroPixel used one word to describe this defeat: outgunned—“overwhelmed overnight with no ability to fight back.”

As foundation models improve in step changes, the boundaries of AI capability keep expanding. First, chatbots replaced search engines, sparing users from scrolling through pages of results. Then agents began replacing software: an agent that can invoke tools and break down tasks can now accomplish what previously required entire menus and multiple apps. When AI can directly write code, call APIs, and execute tasks in a terminal, the boundaries of traditional software systems are being redefined.

For product managers, the focus is on redefining the product’s form and interaction methods. For founders, the question of survival is now at hand:

As foundational models grow increasingly intelligent, how should I start a business? How can I ensure that what I’m doing now won’t be instantly overtaken by the next model update?

Shi Yi, founder of FlashLabs, has spent the past year grappling with this very question. He made a series of decisions that seemed counterintuitive to outsiders: overhauling the product roadmap, voluntarily downsizing the team, abandoning short-term commercial metrics, and even changing the company’s name. We spoke with him about how vertical AI startups can survive in an era where general-purpose models are rapidly evolving.

01 Renamed, Streamlined, Shifted to AI-Native: A Survival Transformation Forced by Large Models

For Shi Yi, the sense of crisis is not new. Back at the end of 2024, he realized that general-purpose models were evolving too rapidly to ignore.

What first made him sense something was off was the downfall of Jasper, an AI unicorn. This once-celebrated company, regarded as a benchmark in AI applications, had surged to a $1.5 billion valuation in just 18 months—only to see its revenue halved after GPT’s native capabilities became widely available. “Jasper’s ARR dropped by half,” Shi Yi recalled. “Companies that were focused on NLP are being absorbed by large models as their capabilities continue to improve.”

This judgment was like a thorn stuck in his heart, causing a quiet unease. At the time, his company was still called FlashIntel and operated in a relatively traditional B2B SaaS business. According to conventional B2B SaaS logic, as long as you accumulated sufficient industry data in a sufficiently niche area and built technological barriers that were compliant and secure, there would inevitably be room to survive in the market—but now, all of that no longer held true.

“Could what I’m doing also encounter the same problem?” This question began to repeatedly surface in his thoughts. Soon, he realized that what he was doing was essentially no different from Jasper’s—his past product architecture had been built on the assumption that general models would never surpass specialized models. Once the foundational model’s intelligence crosses a certain threshold, all the engineering and scenario optimizations built on top of specialized products could lose their advantages overnight.

With this conclusion, he elevated this critical issue to the top priority of the company’s strategy, forcing the team to make a decision: the company must fully transition from SaaS to AI-native.

This adjustment did not happen overnight. His first follow-up question was: what kind of organizational structure does a next-generation AI company really need?

He believes that running a company today can no longer focus on team size or fine-grained specialization. “In the AI era, the more people you have, the worse you tend to use AI—because the more specialized the roles, the more each person relies solely on their own narrow slice.” He began proactively reducing team size, shifting hiring criteria entirely from “experience and past projects” to “thinking mindset and full-stack capability.” His candidate evaluation method also changed: instead of reviewing past resumes or experience, he now gives candidates real tasks to see if they can use AI to handle both frontend and backend work independently. “If someone can get it done, they won’t be bad at using AI tools.”

Immediately afterward, he reallocated internal company priorities. While most startups were focused on speeding up product launches and validating commercialization, he chose to direct the majority of resources toward cutting-edge research, even renaming the company FlashLabs.

“In the past, the internet followed a product- or operations-first approach, but now with AI, it must be research-first.” He urged himself and his team to read papers and understand first principles: “Only by getting closer to first principles can you understand what AI will still be capable of and what it can still replace.”

This transformation also brought about an internal "period of adjustment," as not everyone on the team fully understood this major structural shift. When he told the team, "Don’t think about monetization yet—just build something cool," some employees were excited, while others chose to leave. But he remained firm that in the AI era, doing less is more important: "If you don’t believe in it, you’re out."

But more importantly, what kind of founders will survive in the AI era?

Shi Yi’s answer comes in two halves. The first addresses reality: “At least you can raise funds; as long as you’re not dead or your pockets are deep enough to keep injecting capital.” The second is what he truly wants to say: “Can you think more deeply than AI?”

"Why can large models do more and more things? Because the essence of all natural sciences is mathematics, and these models can write code and understand math. If you peel back this chain layer by layer, the only truly scarce human ability left is thinking deeper than AI in a specific domain," Shi Yi analyzed. "Many people simply don’t understand AI well enough—how many founders actually write code themselves or use AI tools daily? Coding ability will soon become a commodity that everyone will have. But can you be smarter than AI? That’s the real moat."

From recognizing the crisis, to making decisions, to paying the price for organizational restructuring, Shi Yi spent a year undergoing a process of “self-iteration.” He didn’t wait for model updates to reveal the final outcome; instead, he chose to proactively seek out where the right answer might emerge. Whether he’s standing in the right spot is another question—but for now, he has no intention of stepping away from the AI table.

02 Enterprise-Grade Agents Must Play the "Harness" Card

The reorganization of the structure is only the first step on the path to corporate survival. What truly required Shi Yi to make a determined change was the product roadmap.

He initially wanted to build a multi-agent collaborative system on the logic that "many hands make light work," mimicking the organizational structure of a human company: some agents responsible for searching, others for logical reasoning, and still others for consolidating results.

But the actual test results made Shi Yi shake his head repeatedly: “Too slow, too laggy—the output is even worse than a single agent.” In his view, the instruction transfer between agents is like a poor-quality game of telephone, where each additional layer of relay causes more information loss. “I’d rather have one genius with an IQ of 150 fully equipped with top-tier tools than a bunch of average people with an IQ of 110, holding incomplete tools and constantly needing to consult each other,” Shi Yi frankly stated in the interview.

In the end, he eliminated all predefined sub-agents and decided to build a single, powerful agent that uses multi-threaded parallel execution to replace cluster collaboration.

This became the prototype of FlashLabs' latest product, Super Agent, which pushes a single model’s intelligence to its limit and equips it with the most advanced tools. Super Agent uses intelligent automation to unify users’ revenue systems, with AI agents involved in every stage from lead generation to conversion.

During the interview with GeekPark, Shi Yi assigned Super Agent an information retrieval task: “Retrieve the backgrounds of founders of all AI companies in China that received funding over the past six months and output a table.” Super Agent then launched dozens of task threads in parallel to run searches, scrape data, write code, and clean the data, delivering the results within 2–3 minutes. The table included the founders’ names, funding amounts, and publicly available contact information.
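The "one agent, many threads" pattern described here can be illustrated with a minimal sketch. This is not FlashLabs' implementation: `run_subtask`, `run_agent`, and the placeholder company names are hypothetical, and the real search and scraping tools are replaced by a stub.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for one unit of agent work (search, scrape, clean).
def run_subtask(company: str) -> dict:
    # A real agent would call search and scraping tools here.
    return {"company": company, "founder": f"founder-of-{company}"}

def run_agent(companies: list[str], max_threads: int = 32) -> list[dict]:
    # One agent owns the whole task and fans the independent lookups out
    # to worker threads, so no instructions are relayed between agents.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(run_subtask, companies))

# Placeholder inputs, not real companies.
rows = run_agent(["alpha", "beta", "gamma"])
```

The point of the design is that every thread reads the same task context directly, rather than receiving a paraphrased instruction from another agent.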

If abandoning multi-agent is subtraction at the architectural level, then abandoning local deployment is a contrarian choice at the deployment level.

While OpenClaw sparked a "local Agent" trend in the developer community, Shi Yi firmly placed the Super Agent in the cloud. “A system like OpenClaw running inside a corporate network is essentially a Trojan horse—you can easily infiltrate it through this channel.” He believes that any company daring to deploy OpenClaw at scale internally at this stage is effectively opening its doors to hackers worldwide.

In his view, OpenClaw’s advantage lies in demonstrating the potential for proactivity on the individual level. For example, with OpenClaw, if an AI asks a user for $2,000 to buy a graphics card, and the user replies, “Go earn it yourself,” the AI will go analyze market trends and develop quantitative strategies. “Which boss doesn’t like proactive employees?” Shi Yi retorted. When this proactivity becomes part of an enterprise-grade product, the pace at which it replaces human workers will far exceed expectations. “Back in the Industrial Revolution, when horse-drawn carriages turned into cars, you had to buy a car, learn to drive, and modify the roads—it took a long time. This time is different: with managed deployment, in an instant, the work of dozens of employees disappears.” He also predicts that white-collar jobs will be significantly replaced by AI this year.

For the core challenge of automated execution, namely how to ensure enterprise-grade security, FlashLabs' answer is a macOS-style sandboxed permission system, deployed in the cloud with progressive authorization. The agent initially holds only the minimum permissions necessary to complete its task, and its operational boundaries are expanded gradually, only after its stability and security have been verified multiple times.
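Progressive authorization of this kind can be sketched roughly as follows. Everything here is an assumption made for illustration: the tier names, the threshold of three clean runs, and the rule that a failed check drops the agent back to the lowest tier are not details given by FlashLabs.

```python
from dataclasses import dataclass, field

# Hypothetical permission tiers, ordered from least to most privileged.
TIERS = ["read_only", "draft_actions", "send_external"]

@dataclass
class AgentSandbox:
    required_clean_runs: int = 3   # assumed threshold before each promotion
    tier_index: int = 0            # the agent starts at the lowest tier
    clean_runs: int = 0
    audit_log: list = field(default_factory=list)

    def allowed(self, action_tier: str) -> bool:
        # An action is allowed only if its tier is at or below the current one.
        return TIERS.index(action_tier) <= self.tier_index

    def record_run(self, passed_checks: bool) -> None:
        # Every run is logged so the agent's history stays auditable.
        self.audit_log.append((TIERS[self.tier_index], passed_checks))
        if not passed_checks:
            # Assumed policy: a failed check resets the agent to the minimum tier.
            self.tier_index, self.clean_runs = 0, 0
            return
        self.clean_runs += 1
        if self.clean_runs >= self.required_clean_runs and self.tier_index < len(TIERS) - 1:
            self.tier_index += 1
            self.clean_runs = 0

sandbox = AgentSandbox()
before = sandbox.allowed("draft_actions")   # not yet permitted
for _ in range(3):
    sandbox.record_run(passed_checks=True)
after = sandbox.allowed("draft_actions")    # permitted after three verified runs
```

The agent earns each wider permission through verified behavior, and the audit log records under which tier every run took place.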

He used Windows and Mac as examples: “On Windows, installing software can grant extremely high privileges—silent installations, browser bundling, making it nearly impossible to remove. On Mac, all programs are sandboxed, so you never need antivirus software.” Shi Yi believes that competition among enterprise-grade agents will ultimately extend from model invocation capabilities to environmental design—those who can provide agents with a secure, controllable, and auditable runtime environment will be the ones customers truly dare to use.

But if the model leaps forward again, will today’s adjustments still matter? If GPT-6 or Claude incorporates significantly stronger task decomposition and tool-calling capabilities, won’t everything FlashLabs is doing today be rendered obsolete once more?

Shi Yi did not dodge this follow-up question. His answer has two parts.

He first categorized the enterprise barriers of vertical industry companies into four levels: Perception, Planning, Recursive Learning, and Governance.

There are five major companies in the large model market, and the SOTA rankings change every three months. Through the orchestration layer, you can integrate all models and invoke the one best suited for each specific scenario. However, a single-model company can only use its own model—if your foundational model isn’t the most capable, your product’s competitiveness immediately suffers. As general-purpose large models rapidly dominate the first two layers, Shi Yi believes the only true barriers now lie in the last two layers, with the ultimate moat residing in the orchestration layer.
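The routing idea behind an orchestration layer can be shown with a small sketch. The model names and scores below are placeholders, not real benchmark results, and `pick_model` is a hypothetical helper rather than any vendor's API.

```python
# Hypothetical per-scenario quality scores; model names are placeholders,
# and in practice the scores would come from internal evaluations.
SCORES = {
    "model_a": {"coding": 0.91, "search": 0.78},
    "model_b": {"coding": 0.85, "search": 0.88},
}

def pick_model(scenario: str, scores: dict = SCORES) -> str:
    # Route each request to whichever model currently scores highest
    # for that scenario; updating the table re-routes without code changes.
    candidates = {name: s[scenario] for name, s in scores.items() if scenario in s}
    if not candidates:
        raise ValueError(f"no model evaluated for scenario {scenario!r}")
    return max(candidates, key=candidates.get)

coding_model = pick_model("coding")
search_model = pick_model("search")
```

When the rankings shift, only the score table needs to change, which is the flexibility a single-model company lacks.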

He believes that when multiple agents collaborate within enterprise systems, they may privately negotiate out of human sight, bypassing predefined permission rules. The true barrier for vertical companies lies in their ability to design an operational environment that is both open and controllable for specific scenarios.

Regarding the accuracy of this judgment, he admitted he isn’t 100% certain. “AI is changing too rapidly—you truly can’t predict what the future holds.” But he is confident of one thing: as long as vertical enterprises effectively leverage AI orchestration and AI governance, and properly address environmental design challenges, they won’t be immediately ousted from the table during the next wave of model advancements.

03 Voice Models Will Be Restructured, and Proactive Agents May Give Rise to a New Pay-for-Performance Paradigm

With a competitive product in hand, the next step is winning customer recognition.

At this stage, FlashLabs has two primary commercial offerings. The first is Super Agent, priced by token usage, with details available on the official website. The second is the Chroma voice model, which the company has open-sourced while charging for the platforms and services built on top of it. These two approaches are among the most common commercialization strategies today: using open source to build technical trust, and generating commercial value through platforms and services.

Currently, Japanese tax and accounting firms are replacing human customer service agents with FlashLabs' Chroma voice model, testing with just one-tenth of their staff. AI and human agents are operating simultaneously, with ongoing performance comparisons. The validation process is straightforward: whichever achieves higher accuracy and better efficiency wins, based purely on data.

“The usage boundary of voice is on the same scale as vision.” While the entire industry focuses on multimodal and video understanding, Shi Yi and his team have doggedly pursued the real-time speech model Chroma, achieving an end-to-end latency of just 135 milliseconds.

Before large text models emerged, there were OCR, NLP, and various small models stitched together. Speech is now in the same state as text was before large models arrived—ASR, TTS, and numerous modules are pieced together, with each component undergoing localized optimization. This legacy architecture will eventually be entirely replaced by an end-to-end speech large model. His judgment is that rather than waiting for others to do it, he should be the one to build that replacement first.

Shi Yi believes that speech is the most natural mode of communication between humans, and it will inevitably become the core interface for human-AI interaction. “Speech carries a much wider bandwidth of information than text—I say one sentence, and you immediately understand.”

He also believes that voice models play a crucial role in advancing the embodied intelligence industry, where he envisions a three-layer stack. The first layer is a real-time voice model responsible for low-latency, emotionally intelligent responses, directly handling questions such as what the weather is like or whether to wear more clothes. The second layer is a deep-reasoning large model that handles complex inference. The third layer is a world model that understands physical laws. “The usage boundary of voice is on the same scale as vision” is one of his most confident long-term convictions.

Shi Yi also believes that the current AI commercialization model is merely a transitional phase. This is because all current agents are essentially passive responders—they do only what you tell them to do, functioning like tools waiting for instructions, much like chatbots. As a result, the business model still relies on pay-per-token usage: you pay for what you use.

But when an agent begins providing proactive services—meaning you tell it what your KPIs and OKRs are, and it independently takes initiative, plans its own path, and delivers measurable outcomes—it is no longer being evaluated as a tool, but as an employee. Clearly, companies don’t pay employees based on how many words they type or how many emails they send; instead, they assess what goals they’ve accomplished.

Therefore, he believes that as we enter the agentic era, the business payment model should shift to pay-for-performance and pay-for-KPIs. When this shift truly occurs, the entire pricing structure, sales approach, and customer relationships for agent products will be rewritten.
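The difference between the two billing models can be made concrete with a toy calculation. All prices and volumes below are invented for illustration and do not reflect FlashLabs' actual pricing.

```python
def token_bill(tokens_used: int, price_per_million: float) -> float:
    # Usage pricing: the customer pays for consumption, whatever the outcome.
    return tokens_used / 1_000_000 * price_per_million

def outcome_bill(results_delivered: int, fee_per_result: float) -> float:
    # Outcome pricing: the customer pays only for results that count toward the KPI.
    return results_delivered * fee_per_result

# Hypothetical month: 40 million tokens consumed, 12 qualifying results delivered.
usage_cost = token_bill(tokens_used=40_000_000, price_per_million=5.0)
outcome_cost = outcome_bill(results_delivered=12, fee_per_result=50.0)
```

Under usage pricing the bill tracks how much the agent ran; under outcome pricing it tracks what the agent achieved, so the vendor carries the risk when results fall short.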

New business models are already emerging at the heart of the industry. Crosby, an AI-powered law firm that recently secured $60 million in Series B funding, assigns different tasks to specialized agents—such as extracting background information, suggesting revisions, and generating annotations—while human lawyers review the AI’s outputs, address any oversights, and ensure accuracy. Its business model charges clients based on the number of audited contracts, ranging from $250 to $1,000 per contract, roughly $10 to $50 per page depending on length.

But the real prerequisite for evolving to the next commercial model is that proactive agents can consistently deliver measurable results. “We’re not there yet.”

From FlashIntel to FlashLabs, Shi Yi completed a clearly defined organizational and strategic shift within a year—laying off staff, dismantling the existing product architecture, and temporarily slowing down commercialization efforts—all actions that, to outsiders, appeared to be a continuous process of reduction.

But in the context of the AI industry’s rapid evolution, this is more like a startup recalibrating itself amid intense change. Model capabilities may leap forward every few months, and no one can fully predict the future. For Shi Yi and FlashLabs, the current priority isn’t capturing market share, but ensuring their technological choices and business logic remain resilient against the next wave of disruption.

The industry is still exploring the true form of agents, with the final models for payment structures, security boundaries, and interaction modalities yet to be determined. FlashLabs’ approach may not be the optimal solution, but it represents a realistic survival path for vertical AI companies: under mounting pressure from large models continuously penetrating deeper into the market, first secure a foothold, then wait for the industry to mature.