XPeng Unveils X-Mind Framework to Boost Autonomous Driving Prediction and Planning

From: Internet Info Agency, 2026-06-29 17:09:10

XPeng Group recently unveiled its X-Mind technical framework, which embeds a predictive world model to give in-vehicle AI agents Visual Chain-of-Thought (Visual CoT) capabilities. The aim is to improve the foresight and reasoning efficiency of autonomous driving systems.

X-Mind integrates the predictive world model into a large-scale driving model and employs a recurrent block diffusion mechanism. Within a single forward pass, it performs progressive denoising across different internal layers to generate a compact, abstract sketch. This sketch serves as a "cognitive canvas" that fuses a bird's-eye-view layout with abstract driving priors and encodes core semantic information such as lane markings, obstacles, dynamic traffic light states, navigation intent, and compliant speed profiles. The system uses a Deeply Compressed Autoencoder (DC-AE) to compress the rollout of 12 future frames into just 96 tokens, filtering out irrelevant textural noise while preserving critical road topology and traffic state information. This reduces the computational burden of long-context processing.

Before generating an action, X-Mind executes explicit spatiotemporal rollouts via Visual CoT, which lets the vehicle anticipate changes in traffic flow much as an experienced human driver would and strengthens its defensive driving. Trained on a dataset of hundreds of millions of real-world frames, the technology handles challenging scenarios such as sudden braking by lead vehicles, merging from ramps, and complex intersection negotiations by proactively reasoning about obstacle occupancy and causal chains. Comparative experiments show that X-Mind outperforms conventional Vision-Language-Action (VLA) models on both lateral and longitudinal trajectory prediction error, measured as average displacement error (ADE), with the largest gains in complex, long-tail scenarios, where it substantially improves safety and regulatory compliance.
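
For readers unfamiliar with the metric, average displacement error (ADE) is commonly computed as the mean Euclidean distance between predicted and ground-truth waypoints over the prediction horizon. The short Python sketch below illustrates that general definition along with a simple lateral and longitudinal split. It is not XPeng's evaluation code: the function name `displacement_errors`, the ego-frame axis convention (x forward, y left), and the sample waypoints are all assumptions made for illustration.

```python
import math


def displacement_errors(pred, truth):
    """Return (ade, mean_longitudinal_error, mean_lateral_error).

    pred and truth are equal-length lists of (x, y) waypoints in an
    ego-centric frame. Assumption: x points forward (longitudinal)
    and y points left (lateral).
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and equal length")
    n = len(pred)
    pairs = list(zip(pred, truth))
    # ADE: mean Euclidean distance between matching waypoints.
    ade = sum(math.hypot(px - tx, py - ty) for (px, py), (tx, ty) in pairs) / n
    # Per-axis mean absolute errors.
    lon = sum(abs(px - tx) for (px, _), (tx, _) in pairs) / n
    lat = sum(abs(py - ty) for (_, py), (_, ty) in pairs) / n
    return ade, lon, lat


# Made-up waypoints, for illustration only.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred = [(0.0, 0.0), (1.0, 0.0), (5.0, 4.0)]
ade, lon, lat = displacement_errors(pred, truth)
print(f"ADE={ade:.3f} m, longitudinal={lon:.3f} m, lateral={lat:.3f} m")
```

Lower values are better on all three numbers. Reporting the lateral and longitudinal components separately shows whether a model's errors come mainly from lane positioning or from speed and spacing.
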
X-Mind's inference latency is also significantly lower than that of approaches that use raw images or 3D Gaussian Splatting (3DGS) as intermediate representations, which makes mass deployment on automotive-grade chips feasible.

Separately, XPeng CEO He Xiaopeng previously disclosed that the United Nations WP.29 Contracting Parties Meeting has approved DCAS UN Regulation No. 171 Series 02 (pertaining to urban NGP functionality) and a UN Regulation on ADS (covering L3–L5 autonomous driving). The former will become mandatory in the European Union six months after publication, paving the way for legal deployment of autonomous driving worldwide by the end of 2026.

Editor: NewsAssistant
