Robot training faces a fundamental problem: the bottleneck is not that models lack power, nor simply that chips are in short supply, but that high-quality data for learning real-world physical tasks remains scarce. TechCrunch reports that XDOF, which has just emerged from stealth, is trying to fill this gap.
Funding announcement and public debut
XDOF was founded in October 2024 by Philippe Wu, Fred Shentu, and Nemo Jin. The company has announced a $70 million funding round, with investors including Thrive Capital, Spark Capital, a16z, Lux, and WndrCo.
Wu said the company currently has about 60 employees and works with 20 clients, including several leading AI labs, though it has not disclosed their names.
Robot training data is harder to obtain
Unlike large language models, which can train directly on public text, robots require data that reflects physical interactions such as grasping, moving, and assembling. Such data is rarely available publicly, and existing video footage is difficult to convert directly into trainable action data.
Wu encountered this issue while pursuing his PhD at Berkeley. According to him, the industry at the time did not even have sufficiently large datasets, which left model training stuck in a chicken-and-egg dilemma over whether data or models should come first.
First release: the ABC dataset
As its first project, XDOF is collaborating with the University of California, Berkeley’s AI Research Lab to release a robot training dataset named ABC. The company states that this will be one of the largest high-quality robot training datasets available to date.
- 130,000 robot manipulation trajectories
- 300 hours of simulated data
- 100 hours of evaluation data
The report mentions that this data has been used to train robots to perform benchmark tasks such as folding T-shirts, flattening cardboard boxes, and returning AirPods to their charging case.
Plans to expand data collection as an outsourced service
XDOF plans to build a three-layer data system covering teleoperation data from target robots, data collected with general-purpose teleoperation devices, and first-person data of humans performing tasks. To obtain the third type of data, the company also plans to develop its own wearable sensors.
In addition to data collection, XDOF also aims to provide data cleaning, toolchains, and annotation systems. The company believes that such services require large-scale facilities, robot maintenance, parameter calibration, and operator training, and that most AI labs would prefer to outsource these tasks. That is the market XDOF is betting on.
