As artificial intelligence moves toward embodied intelligence, robots are undergoing a "de-fragmentation" revolution. On June 24, RoboScience Machine Science officially released its general-purpose embodied large model, Visics, and disclosed the model's core technical architecture, VLOA (Vision-Language-Object-Action). With this release, robots are no longer limited to repetitive training on single tasks and can instead operate across different robot bodies, objects, and tasks.

In the past, the embodied intelligence industry commonly relied on "motion replication," in which robots memorize specific joint trajectories. The main weakness of this approach is poor generality: when the hardware or the target object changes, what the model has learned no longer applies. According to Tianye Ye, founder and CEO of RoboScience Machine Science, robots can only truly enter the real world once they overcome poor generalization and the difficulty of executing long-horizon tasks.

To address this, Visics introduces the "3D point cloud trajectory of an object" as a unified intermediate representation. Internally, Visics uses a dual-engine architecture: the embodied world model learns the motion patterns and causal relationships of objects in the physical world through large-scale video pre-training, while the general operation model converts the predicted trajectories into hardware-specific control instructions. This layered, decoupled design lets a robot first reason about how an object should move, much as a human would, and then carry out the task on whichever body is available.
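
The decoupling described above can be illustrated with a toy sketch. It is not RoboScience's implementation: every name here (`ObjectTrajectory`, `LinearShiftWorldModel`, `GripperArm`, `MobileBase`, `run_task`) is hypothetical, and a fixed linear shift stands in for the learned world model. The point is only that the first stage outputs an object trajectory that knows nothing about the robot, and each body translates that same trajectory into its own commands.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple

Point = Tuple[float, float, float]
Cloud = List[Point]


@dataclass
class ObjectTrajectory:
    """Hypothetical intermediate representation: one object point cloud per time step."""
    frames: List[Cloud]


def centroid(cloud: Cloud) -> Point:
    """Mean position of a point cloud."""
    n = len(cloud)
    return (
        sum(p[0] for p in cloud) / n,
        sum(p[1] for p in cloud) / n,
        sum(p[2] for p in cloud) / n,
    )


class WorldModel(Protocol):
    """Stage 1: predicts how the object should move, independent of any robot body."""
    def predict(self, cloud: Cloud, instruction: str) -> ObjectTrajectory: ...


class Embodiment(Protocol):
    """Stage 2: turns an object trajectory into commands for one specific body."""
    def commands(self, trajectory: ObjectTrajectory) -> list: ...


class LinearShiftWorldModel:
    """Toy stand-in for a learned model: shifts every point by a fixed offset in equal steps."""

    def __init__(self, offset: Point, steps: int) -> None:
        if steps < 1:
            raise ValueError("steps must be at least 1")
        self.offset = offset
        self.steps = steps

    def predict(self, cloud: Cloud, instruction: str) -> ObjectTrajectory:
        # The instruction is ignored here; a learned model would condition on it.
        dx, dy, dz = self.offset
        frames = []
        for k in range(self.steps + 1):
            s = k / self.steps
            frames.append([(x + s * dx, y + s * dy, z + s * dz) for x, y, z in cloud])
        return ObjectTrajectory(frames)


class GripperArm:
    """Toy body A: follows the object centroid with absolute end-effector targets."""

    def commands(self, trajectory: ObjectTrajectory) -> list:
        return [("move_to", centroid(frame)) for frame in trajectory.frames]


class MobileBase:
    """Toy body B: emits planar (x, y) displacements between consecutive frames."""

    def commands(self, trajectory: ObjectTrajectory) -> list:
        centers = [centroid(frame) for frame in trajectory.frames]
        return [("drive", (b[0] - a[0], b[1] - a[1])) for a, b in zip(centers, centers[1:])]


def run_task(model: WorldModel, body: Embodiment, cloud: Cloud, instruction: str) -> list:
    """Predict the object trajectory once, then let the chosen body interpret it."""
    return body.commands(model.predict(cloud, instruction))
```

In this sketch, swapping `GripperArm` for `MobileBase` changes only the second stage; the predicted object trajectory is reused unchanged, which is the property the layered design is meant to provide.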

To tackle the industry-wide problem of costly, inefficient data acquisition, RoboScience has built a "simulation + video" dual data flywheel. Using its self-developed high-precision simulation engine, RoboMirage, together with an automated data annotation pipeline, the company has cut the cost of acquiring a single data sample to less than one percent of that of traditional methods. Its dataset is currently growing by tens of thousands of hours of data per week, toward the goal of a 1T-scale high-quality dataset by 2026.

In terms of commercialization, RoboScience has chosen to start from the "object dimension." Co-founder Wang Tao said the company is focusing on scenarios such as supermarkets, logistics, and healthcare, which involve large numbers of SKUs and frequent multi-category handling, rather than competing directly with existing automation solutions in industrial settings. The technology is currently in trials in several fields, including retail and logistics, and the company plans to begin mass production of standardized robot bodies within the year.

From single-task executors to "intelligent agents" that generalize across scenarios, RoboScience's work reflects a broader trend: embodied intelligence is moving out of the laboratory and into demanding industrial deployments. As this integrated software-hardware solution matures, robots may finally be able to handle complex, dynamic environments and create value on more production and service front lines.