Alibaba's AutoNavi has officially launched its self-developed world model, "FantasyWorld." Drawing on its large store of real-world navigation data, the model quickly took first place in overall score on the WorldScore Leaderboard, an authoritative international benchmark, further expanding Alibaba's footprint in AI foundation models. FantasyWorld focuses on high-quality 3D world construction and has become a new point of interest in embodied intelligence and autonomous driving.
Core Technical Breakthroughs of FantasyWorld
FantasyWorld aims to provide high-quality 3D world models for embodied intelligence and artificial general intelligence (AGI). Its key innovation is a trainable geometric branch added on top of a frozen video foundation model backbone. The two together jointly model "video latent variables" and "implicit 3D fields" in a single forward pass.
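The idea of pairing a frozen backbone with a trainable branch can be illustrated with a toy sketch in plain Python. This is not AutoNavi's architecture: the classes `FrozenBackbone` and `GeometryBranch`, the three-element "frame," and the scalar depth target are all hypothetical stand-ins chosen to keep the example small. It shows only that one forward pass can return both outputs and that training updates the branch alone.

```python
from typing import List, Tuple

Vector = List[float]


class FrozenBackbone:
    """Hypothetical stand-in for a pretrained video model; never updated."""

    def __init__(self) -> None:
        self.weights: Vector = [0.5, -1.0, 2.0]

    def forward(self, frame: Vector) -> Vector:
        # Elementwise scaling plays the role of producing video latents.
        return [w * x for w, x in zip(self.weights, frame)]


class GeometryBranch:
    """Hypothetical trainable head: maps latents to one depth value."""

    def __init__(self, size: int) -> None:
        self.weights: Vector = [0.0] * size

    def forward(self, latents: Vector) -> float:
        return sum(w * h for w, h in zip(self.weights, latents))


def forward_once(backbone: FrozenBackbone, branch: GeometryBranch,
                 frame: Vector) -> Tuple[Vector, float]:
    """A single forward pass yields both the latents and the geometry output."""
    latents = backbone.forward(frame)
    depth = branch.forward(latents)
    return latents, depth


def train_step(backbone: FrozenBackbone, branch: GeometryBranch,
               frame: Vector, target_depth: float, lr: float = 0.05) -> float:
    """One gradient step on the branch only; backbone weights are untouched."""
    latents, depth = forward_once(backbone, branch, frame)
    error = depth - target_depth
    # Gradient of 0.5 * error**2 with respect to branch weights is error * latents.
    branch.weights = [w - lr * error * h
                      for w, h in zip(branch.weights, latents)]
    return 0.5 * error ** 2


backbone = FrozenBackbone()
branch = GeometryBranch(size=3)
frozen_before = list(backbone.weights)
frame = [1.0, 2.0, 0.5]
losses = [train_step(backbone, branch, frame, target_depth=3.0)
          for _ in range(50)]
```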

This design significantly improves the visual realism of generated videos as well as their multi-view consistency and geometric fidelity. Compared with recent methods that target geometric consistency, FantasyWorld performs well on multi-view coherence, style consistency, and preservation of object shape and texture under extreme viewpoint changes (such as a 180° rotation). The 3D latent variables generated by the model can be decoded directly into depth maps or point clouds, supporting downstream tasks without additional optimization.
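Depth maps and point clouds are closely related representations. As a minimal sketch, the standard pinhole camera model back-projects a depth map into camera-space 3D points. The article does not describe how FantasyWorld's own decoder works, so the function name `depth_to_point_cloud` and the intrinsics `fx`, `fy`, `cx`, `cy` below are illustrative assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def depth_to_point_cloud(depth: List[List[float]], fx: float, fy: float,
                         cx: float, cy: float) -> List[Point]:
    """Back-project a depth map into camera-space points (pinhole model).

    depth[v][u] is the depth along the optical axis at pixel (u, v).
    fx, fy are focal lengths in pixels; cx, cy is the principal point.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Tiny 2x2 depth map with one invalid pixel (assumed toy values).
depth = [[2.0, 2.0],
         [0.0, 4.0]]
cloud = depth_to_point_cloud(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Each valid pixel yields one 3D point, so a dense depth map gives a dense point cloud for downstream tasks such as robot navigation.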
Top of WorldScore: Proof of International Recognition
WorldScore is a unified world-generation benchmark led by Professor Fei-Fei Li's team at Stanford University. It evaluates models along multiple dimensions, including static and dynamic scenes, controllability, and consistency. FantasyWorld currently ranks first in both the overall score and key metrics (such as a static world score of 78.55 and a dynamic world score of 66.89), ahead of multiple competing models from China and abroad.
Related papers have been accepted by top conferences such as ICLR 2025 and NeurIPS 2025. AutoNavi stated that the model will be open-sourced soon to further promote collaboration between academia and industry.
Practical Application: Flying Street View Brings New Spatial Intelligence Experience
FantasyWorld was first deployed in AutoNavi's "Flying Street View" feature. Merchants need only upload a few short mobile phone videos to generate a high-fidelity 3D virtual street view tour for free. The tour lets users preview a restaurant's layout, seating areas, and other details, and it helps offline merchants attract more traffic.
This feature is seen as an embodiment of "technological equity" because it lowers the barrier to professional 3D modeling. AutoNavi has also set up an internal embodied-intelligence business unit that is exploring directions such as robots and robotic dogs, as the company shifts broadly toward physical AI combined with spatial intelligence.
Industry Impact: The Era of World Models is Accelerating
As autonomous driving shifts to end-to-end vision-language-action (VLA) solutions and embodied intelligence develops rapidly, world models that pursue physical realism and 3D consistency have become increasingly important. The launch of FantasyWorld not only strengthens Alibaba's presence in the multimodal AI landscape but also highlights the advantages of Chinese enterprises in spatial intelligence driven by real-world data.
AIbase Perspective: FantasyWorld marks a leap for world models from video generation to interactive 3D simulation, which will profoundly affect the future of AR/VR, robot navigation, and digital twins. With data accumulated from hundreds of millions of users, AutoNavi may gain a competitive edge in the physical AI race. AIbase will continue to monitor the model's open-source progress and further deployments, providing in-depth analysis for readers.
