China's First Open-Source Embodied Data Collection System, XRZero-G0, Is Released

MetaEra
Summary
X-Square Robot has open-sourced XRZero-G0, billed as China's first embodied data collection system to open up the "black box" of data collection. The project integrates a complete pipeline for robot-free data collection, quality inspection, training, and real-robot evaluation, accompanied by a multimodal dataset of over 2,000 hours covering 3,000 tasks. In the core approach, operators wear a VR headset and multiple cameras to capture their movements, so no physical robot is needed on-site. The system ensures data quality through three checks (three-camera viewpoint verification, virtual IK limit validation, and real-robot playback), achieving a data validity rate above 85%. Experiments show that training with a 10:1 ratio of robot-free to real-robot data matches the performance of 500 purely real-robot samples, reducing collection costs to one-twentieth of the original. The system also supports zero-shot cross-embodiment transfer, addressing the challenge of differences between robot bodies in deployment.

Article author and source: Leiphone

The embodied AI industry has recently been flooded with news about an open-source project.

It started as a whisper in a small circle: "Someone has open-sourced an entire embodied dataset." I took a look out of curiosity, and the more I examined it, the clearer it became that this wasn't just a dataset. It was an entire robot-free data collection system.

In other words, while others open-source snippets of code, this release provides a complete end-to-end pipeline for robot-free data collection, quality inspection, training, and real-robot evaluation, along with a multimodal robot-free dataset spanning 2,000+ hours and covering 3,000 tasks, all packaged and made available.

First in China! The Embodied Data Collection "Black Box" is now officially open source—ending the era of expensive embodied data.

Paper URL: https://arxiv.org/abs/2604.13001

This is a first in China, so I read the accompanying paper closely.

In simple terms, the XRZero-G0 paper does two things: first, it opens the "black box" of robotic data collection and demonstrates step-by-step how to acquire a high-quality dataset at an ultra-low cost; second, it provides a hands-on guide on how to train models using the data.

First, let’s talk about data collection. Many of you may have heard the saying that “collecting data for embodied AI is difficult and expensive,” and some have even made extreme claims that the slow progress in embodied AI is entirely due to data collection challenges.

Large models consume text, which is abundant across the internet. Robots, however, rely on physical data, and every piece of it must be acquired at real cost. In the past, collecting such data presented three major challenges: high cost, poor quality, and non-reusability. Together they form the "impossible triangle" of the embodied data layer.

In the XRZero-G0 paper, a clever solution is presented, summarized in one sentence: People wear devices to perform tasks, eliminating the need for robots on-site.

This approach has been tried before (e.g., the UMI paradigm), but it had a critical flaw: the collected data was a "black box," leaving you uncertain whether a real robot could actually execute the recorded motions. With XRZero-G0, three layers of checks turn this black box into a transparent white box.

First check: three cameras.

In the past, handheld data collection devices had only single or dual perspectives, which had a drawback: when hands crossed or objects were blocked by arms, the data would be instantly ruined. XRZero-G0 takes a straightforward approach: operators wear a PICO VR headset, with a global camera mounted above their head and a camera attached to each wrist.

With these three camera views, six-degree-of-freedom pose information, and an edge computer carried in a backpack for spatiotemporal alignment, accuracy stays at ≤4 mm however the operator turns, bends, or moves, without occlusion or drift issues.
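
As a rough illustration of the temporal half of that alignment, the sketch below pairs each head-camera frame with the nearest-in-time frame from each wrist camera and drops head frames whose partners are too far away. The stream names, the 10 ms tolerance, and both function names are assumptions made for illustration; the article does not describe how XRZero-G0 actually synchronizes its streams.

```python
from bisect import bisect_left

def nearest_index(timestamps, t):
    """Index of the value in sorted, non-empty `timestamps` closest to `t`."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick whichever neighbour is closer in time.
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1

def align_streams(head_ts, left_ts, right_ts, tolerance_s=0.010):
    """Pair each head frame with the nearest left/right wrist frames.

    Returns (head_idx, left_idx, right_idx) triples. A head frame is dropped
    when either wrist frame is further away than `tolerance_s` (hypothetical).
    """
    triples = []
    for h, t in enumerate(head_ts):
        l = nearest_index(left_ts, t)
        r = nearest_index(right_ts, t)
        if abs(left_ts[l] - t) <= tolerance_s and abs(right_ts[r] - t) <= tolerance_s:
            triples.append((h, l, r))
    return triples
```

A real system would also need spatial calibration between the cameras and the pose tracker, which this sketch leaves out.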

Second check: a virtual limit switch.

Human joints are flexible enough for yoga; robot joints are not. In earlier teleoperation setups, an operator could perform a movement the robot couldn't execute, and a motor would burn out. XRZero-G0 addresses this with automatic inverse kinematics (IK) verification, which filters out movements that exceed the robot's joint limits.
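
A minimal sketch of this kind of joint-limit filter is shown below. It assumes an IK solver (not shown) has already converted each recorded pose into joint angles, and the limit values and function names are hypothetical, not taken from the XRZero-G0 code.

```python
def within_limits(joint_angles, limits):
    """True if every joint angle lies inside its (lower, upper) limit."""
    return all(lo <= q <= hi for q, (lo, hi) in zip(joint_angles, limits))

def filter_trajectories(trajectories, limits):
    """Split trajectories into those the robot can execute and those it cannot.

    Each trajectory is a list of joint-angle vectors, one per time step.
    A single out-of-range step rejects the whole trajectory.
    """
    kept, rejected = [], []
    for traj in trajectories:
        if all(within_limits(q, limits) for q in traj):
            kept.append(traj)
        else:
            rejected.append(traj)
    return kept, rejected

# Hypothetical limits (radians) for a two-joint arm.
limits = [(-1.5, 1.5), (0.0, 2.5)]
ok = [[0.0, 1.0], [0.5, 1.2]]
bad = [[0.0, 1.0], [1.8, 1.2]]  # first joint exceeds 1.5 at step 2
kept, rejected = filter_trajectories([ok, bad], limits)
```

Rejecting the whole trajectory on one violation is a design choice made here for simplicity; a production filter might instead trim or re-plan the offending segment.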

Third check: real-robot replay.

After the first two filters, the system randomly selects a portion of the data and sends it directly to a real dual-arm robot for "open-loop playback." A batch is approved for storage only if the robot completes the task.

After this three-layer funnel, the validity rate of the stored data rises to over 85%, matching the usability of real-robot data while being faster to collect.
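
The sampled-replay gate in the third check can be sketched as follows. The 10% sampling fraction, the `replay_on_robot` callback, and the function name are assumptions for illustration; the article only says that "a portion" of the data is replayed and that the batch is stored if the robot succeeds.

```python
import random

def replay_gate(batch, replay_on_robot, sample_fraction=0.1, seed=0):
    """Approve a batch only if every sampled trajectory replays successfully.

    `replay_on_robot` is a hypothetical callable that runs one trajectory
    open-loop on the real robot and returns True on task success.
    """
    if not batch:
        return False
    rng = random.Random(seed)
    # Always replay at least one trajectory, even for small batches.
    k = max(1, round(len(batch) * sample_fraction))
    sample = rng.sample(batch, k)
    return all(replay_on_robot(traj) for traj in sample)

# Stand-in for the real robot: count how many replays were requested.
calls = []
def fake_replay(traj):
    calls.append(traj)
    return True

approved = replay_gate([f"traj_{i}" for i in range(20)], fake_replay)
```

With 20 trajectories and a 10% fraction, only two replays are run on hardware, which is where the time saving over replaying everything would come from.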

According to the paper’s data, simple tasks are compressed from 35 seconds to 15 seconds, achieving a 2.33x speedup; complex tasks also run 1.71x faster. The peak collection rate reaches 93.2 trajectories per hour. Isn’t this better than real hardware?
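
The arithmetic behind these figures is easy to check. The helper names below are ours; the inputs (35 s, 15 s, and 93.2 trajectories per hour) are the numbers quoted above.

```python
def speedup(before_s, after_s):
    """Speedup factor when a task drops from `before_s` to `after_s` seconds."""
    return before_s / after_s

def seconds_per_trajectory(trajectories_per_hour):
    """Average wall-clock seconds per trajectory at a given hourly rate."""
    return 3600 / trajectories_per_hour

print(f"{speedup(35, 15):.2f}x")                # simple tasks: 35 s -> 15 s
print(f"{seconds_per_trajectory(93.2):.1f} s")  # at the peak rate of 93.2 per hour
```

This prints `2.33x` and `38.6 s`, so the peak rate corresponds to roughly 39 seconds of wall-clock time per stored trajectory.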

However, all of the above only covers how to collect data better. The more important contribution of the XRZero-G0 paper is showing how to train on that data.

In embodied training, everyone knows to combine "cheap, synthetic data" with "expensive, real-world data," but how should the ratio be set? In the past, it was all guesswork.

The XRZero-G0 team did something particularly thorough: they systematically conducted exhaustive experiments and ultimately discovered a "golden ratio."

They compared three configurations:

▪ 500 real-robot samples (baseline)

▪ 500 real-robot + 500 robot-free samples (1:1)

▪ 50 real-robot + 500 robot-free samples (10:1, robot-free to real)

The results were surprising: the 10:1 configuration matched the success rate of the 500-sample real-robot baseline and even exceeded it. In simple terms, real-robot data usage is cut by 90% (from 500 samples to 50), total cost falls to one-twentieth of the traditional approach, and the trained model is just as capable. That is a 20-fold improvement in cost efficiency.

The paper explains the underlying reason as the "few-shot physical anchoring effect."
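
To make the configurations concrete, here is a minimal sketch of assembling a mixed training set from a pool of real-robot samples and a pool of samples collected without a robot. The function names, the source tags, and the shuffling step are assumptions for illustration; the article does not describe how the paper's training pipeline actually samples or weights the two sources.

```python
import random

def mix_datasets(real, robot_free, n_real, n_robot_free, seed=0):
    """Draw n_real real-robot and n_robot_free robot-free samples, then shuffle.

    Each returned item is tagged with its source so later stages can tell
    the two kinds of data apart.
    """
    rng = random.Random(seed)
    mixed = [("real", x) for x in rng.sample(real, n_real)]
    mixed += [("robot_free", x) for x in rng.sample(robot_free, n_robot_free)]
    rng.shuffle(mixed)
    return mixed

def real_data_reduction(baseline_real, mixed_real):
    """Fraction of real-robot samples saved relative to the baseline."""
    return 1 - mixed_real / baseline_real

# Placeholder sample IDs standing in for recorded trajectories.
real_pool = list(range(500))
robot_free_pool = list(range(1000, 1500))

mixed = mix_datasets(real_pool, robot_free_pool, n_real=50, n_robot_free=500)
saved = real_data_reduction(baseline_real=500, mixed_real=50)
```

The 10:1 configuration yields 550 training samples in total, of which only 50 required a physical robot, a 90% reduction in real-robot data.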

That's not all: a model trained on this dataset can also perform zero-shot cross-embodiment transfer.

As mentioned earlier, traditional physical teleoperation is most vulnerable to embodiment shifts—raise the table by ten centimeters or switch to a different robot, and the system immediately fails. But XRZero-G0 is backpack-style, allowing operators to move around freely, naturally introducing dynamic variations in viewpoint, height, and lighting during data collection. This rich "noise" actually trains the model to be highly robust.

The paper reports a striking detail: the model trained on this hybrid dataset was deployed directly on EX001 and CX001 without ever having seen real-robot data, and it successfully performed tasks such as arranging flowers, folding towels, and packing sausages.

A brief reflection on XRZero-G0: at its core, the paper works like a practitioner's manual, breaking down step by step how to collect data cost-effectively and how to use it efficiently.

Everyone can sense that the embodied AI industry is shifting from "competing on demos" to "competing on data." However, the industry lacks consensus and direction on how to accumulate data at scale. XRZero-G0 lays out the entire pipeline: collecting data more easily, identifying a good data ratio, and ultimately achieving zero-shot cross-embodiment transfer.

This kind of engineering work cannot be accomplished by a single university lab or a renowned scholar alone; it requires a team that understands both academia and industry.

The company behind XRZero-G0 is X-Square Robot.

To understand why X-Square Robot was able to develop XRZero-G0, look at its strategic path. From day one, the company chose an end-to-end large-model approach while simultaneously exploring three routes: VLA, WM, and WUM. Those in the industry know that this path is impossible without solid infrastructure capabilities, so from early efforts like WALL-OSS to XRZero-G0, the company has consistently invested in building infrastructure.

This path may be difficult, but it's the right one. Just look at the capital: in less than two years, the company has closed nine funding rounds and reached a valuation of over ten billion. Four major tech giants (ByteDance, Meituan, Alibaba, and Xiaomi) are all on its shareholder list.

As for why XRZero-G0 is fully open source, it’s even simpler and more straightforward.

The "ChatGPT moment" for embodied AI cannot be achieved by a single company working in isolation. When universities, small and medium-sized teams, and individual developers can all use the standardized XRZero-G0 toolchain to generate data at scale, the industry-wide data flywheel will truly begin to turn, and that is when X-Square Robot's moat will be established.

The GitHub page for XRZero-G0 is provided below; we encourage everyone to give it a try:

https://github.com/X-Square-Robot/XRZero-G0
