ByteDance's 2026 AI strategy focuses on four key areas: world models, the video model Seedance, coding, and the commercialization of Doubao.
Author: Zhou Xinyu
Source: 36Kr
Intelligent Emergence has learned exclusively from multiple sources that in 2026, ByteDance AI will have four key initiatives:
Increase investment in training world models and, by the end of the year, achieve model performance on par with the current global SOTA, Google Genie 3.
Keep the video model in the lead while exploring new directions such as "dynamic generation."
Strengthen the foundation of coding, implement effective coding dogfooding (data feedback and evaluation to create a flywheel), and enhance agent capabilities.
Enhance Doubao's commercialization capabilities, with a focus on the "office" use case.
ByteDance’s Uncharted Territory: World Models
Today, ByteDance’s AI portfolio includes Seed 2.0, which has finally propelled ByteDance into China’s top tier of large models, and Seedance 2.0, which has achieved state-of-the-art (SOTA) performance globally. On the application side, Doubao has established a decisive lead: we’ve learned from multiple sources that Doubao’s daily active users reached 200 million shortly after the 2026 Spring Festival.
“There are no obvious weaknesses,” said an AI strategist from a major tech company, evaluating ByteDance’s AI business portfolio.
However, among all these models, the key to the next stage of large model research—world models—is missing.
Several individuals close to the Seed team told us that ByteDance entered the world model race relatively late. In 2024, Zhou Chang, who had just joined ByteDance from Alibaba, took the lead in world model research.
At the time, the internal assessment was that the path for world models and commercial applications was still unclear; more importantly, the priority was to win the battle for video models.
It was not until 2025 that ByteDance established a small research team to explore the VLA (Vision-Language-Action) approach within world models. The team was led by two individuals:
The first is Li Hang, head of ByteDance AI Lab, who primarily trains world models on simulated data. (In April 2025, the entire AI Lab, including the Robotics team, was integrated into Seed, with one goal being to improve communication efficiency between models and applications such as embodied intelligence.)
The other is Wang Wenqian, a Seed multimodal researcher who primarily trains on natural data.
In 2026, Wu Yonghui finally set a clear goal for the world model at Seed’s all-hands meeting: release at least one version of the world model by the end of 2026, with performance comparable to the current global SOTA, Google’s Genie 3, released in August 2025.
However, based on the current progress, the pace of catching up is insufficient. A person close to Seed told us that Wu Yonghui has repeatedly stated internally at Seed that ByteDance’s world model and embodied intelligence have underperformed expectations.
Another Seed member revealed that, according to internal evaluations, as of early 2026, the ByteDance world model's overall performance is still 10% behind the global SOTA.
But this battle represents the future.
On one hand, the downstream applications of world models include the embodied intelligence market, valued at least in the hundreds of billions of dollars, as well as highly imaginative scenarios in gaming and entertainment.
A former Seed researcher told us that ByteDance’s robots were previously primarily deployed in item transportation and industrial handling, but internally, the company viewed these applications as having limited potential. “Humanoid robots represent a much broader market opportunity, and this is definitely a direction ByteDance will enter.”
On the other hand, there are still many non-consensus directions in world models, including the video generation camp, the VLA (Vision-Language-Action) camp, the JEPA (Joint Embedding Predictive Architecture) camp, and others.
“Given ByteDance’s talent density and capital investment, if you bet, you’re likely to win,” an AI investor told us. “If you don’t bet, you’ll definitely lose.”
In pursuit of the goal of entering the world's top tier, ByteDance has made numerous adjustments to world model training since 2026.
According to "Intelligent Emergence," after the 2026 Spring Festival, Seed established a new research group focused on world models, led by Fan Haoqi, a former researcher at Meta FAIR Lab, reporting to Zhou Chang, Head of Multimodal and World Models at Seed.
Meanwhile, the two VLA research teams led by Li Hang and Wang Wenqian have been merged and now report uniformly to Zhou Chang.
Several informed sources told Intelligent Emergence that Li Hang and Wang Wenqian’s research group primarily pursued the VLA approach, emphasizing “improvisation” and “realism,” with target applications in embodied intelligence; in contrast, Fan Haoqi’s new team is focusing on a 3D simulation route, targeting entertainment and gaming applications.
Beyond added headcount and a wider set of research paths, world models also receive the largest funding of any model direction, ahead of text, coding, and video.
The data budget is notably significant. An employee from ByteDance’s data platform told us that the “volume-driven” strategy for training data, which previously yielded significant gains for LLMs and Seedance 2.0, will now be applied to training world models using the same “data ocean” approach.
This also corresponds to higher data investment—we have learned from multiple sources that in 2026, ByteDance allocated the highest budget among all modalities for training data for world models, including VLA, long-form video, and 3D, amounting to tens of millions of yuan.
A data supplier mentioned that ByteDance's data investment in world models can reach three to four times that of other manufacturers.
Coding: Pursuing superior data engineering
Coding skills are fundamental and the key determinant of an agent’s performance ceiling, which is a consensus in the industry.
Several insiders have mentioned to us ByteDance’s emphasis on Coding. “ByteDance has consistently invested heavily in Coding, second only to its world model this year,” said someone close to Seed.
For example, internally the company may selectively procure data or study training-data demos from top overseas coding models such as Claude Code and Codex.
At the 2025 Volcano Engine Force Conference, Hong Dingkun, Vice President of Technology at ByteDance, also stated that coding, as a highly structured and logically rigorous task, places significant demands on a model’s ability to understand complex semantic structures, perform logical reasoning, design algorithms, and express ideas precisely, thereby helping to explore the upper limits of model intelligence.
However, outside the company, ByteDance’s Coding business has had limited visibility. Both the Doubao-Seed-Code model released in November 2025 and the AI programming tool Trae launched earlier in 2025 have underperformed in terms of impact and recognition compared to Zhipu’s GLM 5 and Moonshot’s K2.
A knowledgeable insider commented, "The reason ByteDance has struggled to make breakthroughs in coding efficiency is the lack of data feedback." Due to limited model capabilities, ByteDance's related businesses are unwilling to use Seed-Code.
Even the AI coding app Trae initially integrated DeepSeek and Claude Code, along with its own internally trained coding model.
As a result, ByteDance's coding model lacks feedback from real-world use cases.
Since 2026, many ByteDance employees have sensed that various business units are increasing their support for the Seed model. A Seed employee told *Intelligent Emergence* that previously, ByteDance did not restrict business teams from using third-party coding models for development, but since 2026, multiple application departments have been required to use the Seed model exclusively.
However, even as data investment grows, Seed has slightly slowed its pace of talent acquisition.
An AI headhunter told "Intelligent Emergence" that ByteDance's HR is currently signaling that the era of broad, high-salary hiring has ended; the new focus is on internal talent development and promotion of young talent, along with improving algorithm compensation.
Currently, Seed’s few hiring openings primarily target AI talent from leading players such as DeepSeek, OpenAI, DeepMind, and Meta, including former DeepSeek core member Guo Daya and former NVIDIA researcher Dong Xin.
In 2026, ByteDance's other key focus on AI models will be maintaining Seedance's SOTA position in global video generation.
"The victory of Seedance 2.0 is a victory of data," said the founder of a video generation startup to *Intelligent Emergence*. We learned that Seedance 2.0’s outstanding performance was achieved through massive training data and a review team of over 2,000 people.
However, relying continuously on a "quantity-driven" training approach also carries risks. Some studies have identified an "Anti-Scaling Law" phenomenon in video generation: simply put, the more training data used, the more likely the model is to "take shortcuts," learning only key frames while neglecting the full narrative—thus, the returns from increasing data volume tend to diminish significantly in later stages of training.
Two insiders on the data side told us that Seedance has reached the upper limit in pre-training; to further improve performance, it must clean its training data and conduct more refined post-training.
Meanwhile, the "dynamic generation" capability is a new focus area for the Seedance team in 2026.
So-called “dynamic generation,” also known as interactive video, refers to users being able to input commands to adjust the content and plot of a video in real time. In this space, Vivix AI, founded by Liu Yu, former senior research director at SenseTime, has already achieved a valuation of $1.32 billion.
Multiple informed sources told Intelligent Emergence that Zhou Chang has consistently been optimistic about the practical applications of dynamic generation.
“Interactive videos can be turned into mini-games or interactive series, and they can also connect with explorations in world models (video generation is one pathway in the exploration of world models),” said a person close to Seed.
Accelerating Doubao's commercialization and international expansion
36Kr exclusively reported that Doubao is expected to officially launch paid content in late June; meanwhile, Doubao is also planning to integrate with TikTok Shop to enhance its paid scenarios.
In early May 2026, Doubao updated its paid subscription options on the App Store, with monthly subscription prices ranging from free to 500 yuan.
On June 3, Doubao officially announced that it will soon launch "Doubao Pro," tailored to the productivity needs of professionals, offering services in software development, data analysis, professional design, process automation, financial analysis, and scientific research.
Multiple insiders revealed that after the Spring Festival, Doubao's DAU has surpassed 200 million. "This year, Doubao's advertising budget is very low," according to one insider. The high DAU brings substantial inference costs and operational pressure; thus, Doubao's push toward commercialization at this stage aims to moderate its growth rate and achieve self-sustaining revenue generation.
PPT generation is Doubao's core entry point for building users' willingness to pay. “Doubao aims to enhance its PPT generation feature to target white-collar professionals in high-value industries such as finance and law,” a person close to Doubao told Intelligent Emergence. In the next phase, Doubao plans to launch an enterprise version integrated with internal corporate systems, though the specific integration methods are still under internal discussion.
He said this idea was inspired by overseas business models. The commercial path of charging for office scenarios has already been validated abroad. According to data disclosed by Anthropic, Claude Code reached an annual recurring revenue (ARR) of $1 billion within just six months of launch, and by February 2026, one year after launch, its ARR had grown to $2.5 billion.
The significant cash flow generated by Claude Code for enterprise development scenarios enabled Anthropic, founded six years after OpenAI, to surpass OpenAI's ARR at the beginning of this year.
Now, Doubao's goal is to shift users' perception of it from a free, general-purpose entry point for asking anything to a paid office assistant that helps improve efficiency.
However, the market that Doubao aims to enter is already crowded. A Doubao insider told Intelligent Emergence that during research with enterprise clients, ByteDance found that the enterprise AI tools market has already been captured by numerous industry-specific AI solution providers, meaning Doubao’s late entry will inevitably face higher customer acquisition costs.
Intelligent Emergence learned that going global is also one of Doubao's key priorities this year.
Previously, the overseas version of Doubao, the Dola app, surpassed 10 million DAUs by the end of 2025. According to Intelligent Emergence, Dola’s growth target for 2026 is to reach 30 million DAUs by year-end.
A source said that non-English-speaking countries are Dola’s primary target markets. Currently, the overseas AI chatbot market is largely dominated by ChatGPT, Claude, and Gemini. Dola’s growth strategy is to avoid direct competition with the “Big Three” AI players in European and American markets and instead differentiate itself by entering non-English-speaking markets.
Third-party data shows that since the second half of 2025, Dola has frequently appeared on the download charts of app stores in Indonesia, Malaysia, Mexico, and other countries.
——
Since joining ByteDance a year ago, Wu Yonghui’s mission has been to lead Seed in fixing bugs while building state-of-the-art models. In 2026, across every front in AI, ByteDance aims to be the winner.
Today, Seed 2.0 and Seedance 2.0 are showing early results, and the engineering, data, and talent expertise accumulated by Seed will be reused more efficiently in the next round of competition.
(Yingyi Deng, author of "Intelligent Emergence," also contributed to this article.)
