From: Internet Info Agency, 2026-05-13 17:34:00
On May 13, Xiaomi officially released and open-sourced XiaomiOneVL, a one-step latent-space vision-language reasoning framework. According to Xiaomi, it is the first framework to unify several technical approaches (Vision-Language-Action (VLA), world models, and latent-space reasoning) within a single architecture, improving performance on perception, reasoning, and planning tasks for autonomous driving. XiaomiOneVL achieves state-of-the-art (SOTA) results on three major benchmarks, ROADWork, Impromptu, and Alpamayo-R1, and also performs strongly on the NAVSIM benchmark. Its reasoning accuracy surpasses that of explicit Chain-of-Thought (CoT) methods, while its inference speed is on par with latent-space CoT approaches that predict answers directly, without generating intermediate reasoning steps. The framework is interpretable in both language and vision: it can explain the rationale for a decision in text and visualize future scenarios through predicted images. Xiaomi has made the model weights, training and inference code, technical report, and project homepage available to the research and industry community.
