How Applied Intuition Brings World Foundation Models from Research to Reality
World foundation models unlock new ways to build and validate physical AI. Applied Intuition provides the data pipelines, simulation tooling, and evaluation needed to put them to work.
Alexandre El Assad, Gautham Sholingar • June 18, 2026 • 6 min read
World foundation models (WFMs) are redefining how developers build and validate physical AI systems. These models can generate photorealistic sensor data, predict novel trajectories and world states, and reason about physical interactions, all capabilities that every autonomy program needs in its development pipeline. However, using these models effectively in production requires substantial tooling and infrastructure, including prompting and experiment tracking, GPU orchestration, fine-tuning and inference pipelines, realism and quality metrics, and the ability to condition models on formal representations grounded in real-world scenarios. Applied Intuition draws on a decade of physical AI domain expertise and experience building state-of-the-art simulation tools to deliver a production-grade solution for WFMs as part of a modern data and simulation flywheel.
Why World Foundation Models?
Modern end-to-end autonomy stacks need a data flywheel to translate fleet datasets into curated, labeled segments for training and validation. However, real-world data collection alone cannot cover the breadth of environmental conditions and safety-critical edge cases needed for rigorous evaluation and validation of an autonomy stack. This makes high-fidelity sensor simulation a must-have capability for any production autonomy program.
Existing approaches to sensor simulation each have limitations. Physics-based rendering (PBR) offers precise scene control but requires iterative tuning of 3D content and sensor models to reduce the domain gap, while neural reconstruction (e.g., 3D/4D Gaussian splatting) is limited to routes already driven by the fleet and does not generalize across lighting, weather, and geometry variations. WFMs address both gaps by combining large-scale internet pretraining with fine-tuning for each physical AI domain to generate diverse, domain-aware sensor data across varied Operational Design Domains (ODDs), weather, and lighting conditions. WFMs naturally complement PBR and neural simulation, augmenting Applied Intuition's existing simulation toolchain to accelerate the development of autonomous systems and physical AI.
In this blog post, we focus on a workflow that translates real-world drive logs, as well as variations of scenarios extracted from those drives, into novel sensor datasets using NVIDIA Cosmos Transfer 2.5/auto/multiview. The following sections walk through the core capabilities needed to leverage WFMs in a production autonomy pipeline.
How Applied Intuition Brings WFMs to Production
Curated, labeled fleet datasets
The first step in this pipeline is creating the labels needed to condition WFM inference. Applied Intuition has built a complete simulation and data engine that converts raw fleet data into curated, labeled segments that can be used for downstream tasks.
Curated, labeled sensor data from Applied Intuition’s vehicle fleet
Scenario, map and sensor configuration extraction
The next steps of this pipeline focus on extracting a formal scenario representation that describes all static and dynamic actors in the scene, grounded in an extracted map. Maps can be derived from OpenStreetMap (OSM) or proprietary map formats tied to geo-coordinates, or inferred from sensor data. An additional step at this stage is tuning the sensor configuration to match the required layout for a specific vehicle fleet.
Extracted object-level simulation scenario aligned to a map representation and sensor layout tuned to match a real-world vehicle
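To make the idea of a formal scenario representation more concrete, here is a minimal sketch of what such a structure might contain: actors with timed poses, a map source, and a sensor layout. This is not Applied Intuition's scenario format. Every class and field name below is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from math import hypot
from typing import List, Tuple


@dataclass
class Pose:
    t: float    # seconds since the start of the scenario
    x: float    # metres, map frame
    y: float    # metres, map frame
    yaw: float  # radians


@dataclass
class Actor:
    actor_id: str
    category: str                       # e.g. "vehicle", "pedestrian"
    size_m: Tuple[float, float, float]  # length, width, height
    poses: List[Pose] = field(default_factory=list)

    def is_dynamic(self, min_displacement_m: float = 0.5) -> bool:
        """Simple heuristic: dynamic if net displacement exceeds a threshold."""
        if len(self.poses) < 2:
            return False
        first, last = self.poses[0], self.poses[-1]
        return hypot(last.x - first.x, last.y - first.y) >= min_displacement_m


@dataclass
class Camera:
    name: str
    resolution: Tuple[int, int]             # width, height in pixels
    position_m: Tuple[float, float, float]  # mount position, vehicle frame
    yaw_deg: float                          # heading relative to vehicle forward


@dataclass
class Scenario:
    map_source: str  # e.g. "osm", "proprietary", "inferred"
    actors: List[Actor]
    cameras: List[Camera]

    def dynamic_actors(self) -> List[Actor]:
        return [a for a in self.actors if a.is_dynamic()]


parked = Actor("car_1", "vehicle", (4.5, 1.9, 1.5), [Pose(0.0, 20.0, 3.5, 0.0)])
moving = Actor("car_2", "vehicle", (4.5, 1.9, 1.5),
               [Pose(0.0, 0.0, 0.0, 0.0), Pose(1.0, 12.0, 0.0, 0.0)])
front = Camera("front_wide", (1920, 1080), (1.8, 0.0, 1.4), 0.0)
scenario = Scenario("osm", [parked, moving], [front])
print([a.actor_id for a in scenario.dynamic_actors()])  # prints ['car_2']
```

Keeping the sensor layout inside the scenario means the same extracted scene can be re-targeted to a different vehicle by swapping only the camera list.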
Conditioning for world foundation model inference
To condition the model, we render an object-level representation in the image plane that adheres to the desired sensor configuration, based on the labels that our auto-labeling pipeline generates from data collected by Applied Intuition's vehicle fleet.
Lane and object-level representation of an extracted scenario needed to condition the world foundation model inference result
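Rendering object-level labels into the image plane comes down to projecting 3D geometry through a camera model. The sketch below shows the basic idea with an ideal pinhole camera and a single forward-facing view. It is an illustration under simplifying assumptions (no lens distortion, no camera rotation beyond the axis convention change), not the rendering code used in the pipeline. The function names and the intrinsics values are hypothetical.

```python
import numpy as np

# Rotation from vehicle axes (x forward, y left, z up)
# to camera axes (x right, y down, z forward).
R_CAM_FROM_VEHICLE = np.array([[0.0, -1.0, 0.0],
                               [0.0, 0.0, -1.0],
                               [1.0, 0.0, 0.0]])


def box_corners(center, size, yaw):
    """Return the 8 corners (8x3) of a yaw-rotated box in the vehicle frame."""
    length, width, height = size
    sx = np.array([1, 1, 1, 1, -1, -1, -1, -1]) * length / 2.0
    sy = np.array([1, 1, -1, -1, 1, 1, -1, -1]) * width / 2.0
    sz = np.array([1, -1, 1, -1, 1, -1, 1, -1]) * height / 2.0
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return (rot @ np.vstack([sx, sy, sz])).T + np.asarray(center, dtype=float)


def project_to_image(points_vehicle, cam_position, K):
    """Project Nx3 vehicle-frame points to pixels; points behind the camera become NaN."""
    pts = np.asarray(points_vehicle, dtype=float) - np.asarray(cam_position, dtype=float)
    pts_cam = pts @ R_CAM_FROM_VEHICLE.T
    uv = np.full((len(pts_cam), 2), np.nan)
    in_front = pts_cam[:, 2] > 1e-6
    uvw = pts_cam[in_front] @ K.T           # homogeneous pixel coordinates
    uv[in_front] = uvw[:, :2] / uvw[:, 2:3]  # perspective divide
    return uv


# Hypothetical intrinsics for a 1920x1080 camera.
K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])
corners = box_corners(center=(10.0, 0.0, 0.0), size=(4.0, 2.0, 2.0), yaw=0.0)
pixels = project_to_image(corners, cam_position=(0.0, 0.0, 0.0), K=K)
print(pixels.round(1))
```

Changing `K` or the camera position changes where the same labeled object lands in the image, which is why the conditioning has to be rendered separately for each target sensor configuration.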
Post-training and WFM inference
A key challenge with WFMs is that running model inference with conditioning for unseen sensor configurations produces geometric inconsistencies and misalignment with the target sensor setup. Every autonomy program has its own sensor configuration, so this mismatch must be addressed for production use.
We address this through Applied Intuition’s Data Engine, which ingests, filters, auto-labels, and curates fleet logs into post-training-ready datasets. To test this pipeline, we used a corpus of curated clips and captions from our internal AV fleet. We post-trained the NVIDIA Cosmos Transfer 2.5B multiview model for a custom sensor configuration, achieving geometrically consistent and well-aligned multi-view outputs verified both visually and through our evaluation harness. This step is essential to ensure that the model generalizes to new sensor configurations as well as new ODD distributions in the future, unlocking the use of foundation models in production for physical AI developers across domains.
Post-training the world foundation model improves how closely the generated sensor data adheres to conditioning labels for the desired sensor configuration layout
To simplify the user experience, we've built an agentic workflow that generates prompts, selects parameter values, spins up compute instances, and runs model inference on the object and lane-level conditioning to generate novel sensor datasets, all from a natural language prompt. Job scheduling, GPU allocation, and scaling are handled by the platform, freeing developers to submit fine-tuning runs and launch large-scale inference workloads without the overhead of managing compute.
Agentic prompting and inference pipeline to create novel datasets using a post-trained NVIDIA Cosmos Transfer model
Developers can prompt the model to create novel variations of weather, lighting, and scene content with natural language prompts and benefit from the internet-scale pretraining of the NVIDIA Cosmos models to generate a variety of conditions easily.
Novel variations (hazy conditions, nighttime, and snowy weather) generated with style transfer on input camera data using NVIDIA Cosmos Transfer 2.5B
Validation with purpose-built evaluation
Generating data is only half the problem. Before any synthetic data enters a training or validation pipeline, teams need confidence that the data will help improve downstream model performance. We've built an evaluation harness that computes perceptual and geometric consistency metrics on the rendered sensor data. However, quality metrics alone are not sufficient; we also need to measure how closely the generated data adheres to the conditioning input.
We integrated NVIDIA's open-source Cosmos Evaluator into our pipeline to automatically run checks that are purpose-built for autonomy, such as an obstacle-correspondence check and a hallucination-detection check, on every batch of generated data.
Metrics-based validation of generated sensor data using image quality metrics, novel-view metrics, and NVIDIA Cosmos Evaluator
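One simple way to think about adherence to conditioning is to compare the 2D boxes used as conditioning against boxes detected in the generated frame. The sketch below computes a correspondence rate and the fraction of detections with no conditioned counterpart, using greedy intersection-over-union (IoU) matching. It is only an illustration of the concept. It is not how NVIDIA Cosmos Evaluator or Applied Intuition's harness implements these checks, and the function names and the 0.5 threshold are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def adherence(conditioned, detected, threshold=0.5):
    """Greedily match each conditioned box to its best unused detection."""
    unmatched = list(range(len(detected)))
    matched = 0
    for box in conditioned:
        best_idx, best_score = None, threshold
        for idx in unmatched:
            score = iou(box, detected[idx])
            if score >= best_score:
                best_idx, best_score = idx, score
        if best_idx is not None:
            unmatched.remove(best_idx)
            matched += 1
    return {
        # Share of conditioned objects that show up in the generated frame.
        "correspondence": matched / len(conditioned) if conditioned else 1.0,
        # Share of detections with no conditioned counterpart.
        "unmatched_fraction": len(unmatched) / len(detected) if detected else 0.0,
    }


conditioned = [(0, 0, 10, 10), (20, 20, 30, 30)]
detected = [(1, 1, 11, 11), (50, 50, 60, 60)]
print(adherence(conditioned, detected))
# prints {'correspondence': 0.5, 'unmatched_fraction': 0.5}
```

A low correspondence score suggests the model dropped conditioned objects, while a high unmatched fraction hints at objects that were never in the conditioning input.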
Scenario augmentation in simulation
Another important kind of augmentation is behavioral augmentation. Applied Intuition’s tooling allows developers to augment extracted real-world scenarios and introduce new events such as vehicle cut-ins, jaywalking pedestrians, and unseen obstacles on the road. By modifying the object-level scenario extracted from the original drive log, we can generate new conditioning data to run WFM inference and create novel sensor datasets. This effectively turns a single real-world drive log into a family of related scenarios, greatly expanding the value of fleet datasets.
Augmentation of the original drive log scenario to create a new cut-in scenario. WFM inference conditioned on this novel scenario enables generation of counterfactual sensor data to simulate real-world edge cases.
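As a rough illustration of behavioral augmentation, the sketch below turns a vehicle that originally stays in the adjacent lane into a cut-in by smoothly blending its lateral offset toward the ego lane. It assumes trajectories are expressed in lane-relative coordinates (distance along the lane and lateral offset). The `Waypoint` type and `add_cut_in` function are hypothetical and do not represent Applied Intuition's tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Waypoint:
    t: float  # seconds
    s: float  # distance along the lane, metres
    d: float  # lateral offset from the ego-lane centre, metres


def smoothstep(u: float) -> float:
    """Ease from 0 to 1 with zero slope at both ends."""
    u = min(1.0, max(0.0, u))
    return u * u * (3.0 - 2.0 * u)


def add_cut_in(trajectory: List[Waypoint], start_t: float,
               duration: float, target_d: float = 0.0) -> List[Waypoint]:
    """Return a new trajectory whose lateral offset moves to target_d."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    augmented = []
    for wp in trajectory:
        blend = smoothstep((wp.t - start_t) / duration)
        new_d = (1.0 - blend) * wp.d + blend * target_d
        augmented.append(Waypoint(wp.t, wp.s, new_d))
    return augmented


# A vehicle logged in the adjacent lane (3.5 m to the side) at 12 m/s.
logged = [Waypoint(float(t), 10.0 + 12.0 * t, 3.5) for t in range(7)]
cut_in = add_cut_in(logged, start_t=2.0, duration=2.0)
print([round(wp.d, 2) for wp in cut_in])
# prints [3.5, 3.5, 3.5, 1.75, 0.0, 0.0, 0.0]
```

The logged trajectory is left untouched, so the original and augmented scenarios can both be rendered into conditioning inputs and compared side by side.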
A Comprehensive Pipeline for Physical AI Development with World Foundation Models
WFMs represent a new way of developing and validating physical AI. Applied Intuition brings the physical AI domain expertise and the complete toolchain needed to put world foundation models to work: data pipelines, simulation tooling, fine-tuning infrastructure, GPU orchestration, agentic interaction layers, and metrics-based validation, all in one production-ready platform.
The gap between a powerful model and a production-ready workflow is where most teams get stuck, and Applied Intuition is closing that gap. We are excited to partner with NVIDIA to build this next frontier in tooling for physical AI systems, bringing world foundation models from research to reality and helping developers safely accelerate their path to production autonomy.
Alexandre El Assad is a Software Engineer at Applied Intuition focused on simulation and 3D graphics for autonomous vehicle development. He holds an MS in Aeronautics and Astronautics from Stanford University, with a specialization in Autonomous Systems and Controls, and previously worked as a Senior Software Engineer at Acubed, Airbus's Silicon Valley innovation center.
Gautham Sholingar
Product Manager
Gautham Sholingar leads product management for autonomy tools at Applied Intuition. He brings a decade of experience in simulation engineering and product management, with prior roles at NVIDIA, Ford, and MathWorks. He holds an MBA from UC Berkeley's Haas School of Business, an MS in Electrical Engineering from Caltech, and a BS in Electrical Engineering from the University of Michigan, Ann Arbor.