Rosa Del Mar

Daily Brief

Issue 83 2026-03-24

MLOps Pipeline Simulation And Closed Loop Learning

Sources: 1 • Confidence: Medium • Updated: 2026-03-25 17:55

Key takeaways

  • Waymo's simulator runs off-board rather than on the vehicle.
  • Waymo's sixth-generation 'Ojai' platform is a custom passenger-oriented vehicle planned to begin rolling out publicly this year.
  • Waymo provides over 500,000 fully autonomous rides each week.
  • The Waymo Driver uses a 360-degree multi-sensor suite combining cameras, lidar, and radar.
  • Waymo plans to start operating in London and Tokyo this year and does not expect to deploy the San Francisco driver there without additional data collection, specialization, and validation.

Sections

MLOps Pipeline Simulation And Closed Loop Learning

  • Waymo's simulator runs off-board rather than on the vehicle.
  • Waymo uses an off-board foundation model that is specialized into high-capacity teacher models (Driver, Simulator, Critic) and distilled into smaller models for fast on-vehicle inference.
  • Waymo frames driving as a multi-agent social interaction problem where context and history matter, and says this explains why large pretrained models can appear to drive in nominal cases while remaining far from full-autonomy safety requirements.
  • Waymo argues that passive observation and imitation are insufficient for the long tail and that closed-loop training (including reinforcement-learning-based fine-tuning) supported by a realistic simulator and a critic-like reward signal is required.
  • Waymo says progress in foundational AI world models is enabling major simplification of the autonomy system, lowering cost and improving global scalability.
  • Waymo observed a case where the vehicle detected and responded to a pedestrian occluded by a bus using a very noisy signal from peripheral lidar reflections bouncing under the bus.
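
The teacher-to-student distillation step mentioned above can be illustrated with a minimal sketch. It assumes one common form of distillation, in which a small student is trained to match the teacher's temperature-softened output distribution. The source does not say which distillation objective Waymo uses, and the maneuver labels and logit values here are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives softer targets."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical logits over three candidate maneuvers: yield, proceed, nudge left.
teacher = [2.0, 0.5, -1.0]
matched_student = [2.0, 0.5, -1.0]  # student agrees with the teacher
poor_student = [-1.0, 0.5, 2.0]     # student ranks the maneuvers in reverse

loss_matched = distillation_loss(teacher, matched_student)
loss_poor = distillation_loss(teacher, poor_student)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is the signal a smaller on-vehicle model would be trained to minimize under this assumption.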

Hardware Generation Transition And Cost Down

  • Waymo's sixth-generation 'Ojai' platform is a custom passenger-oriented vehicle planned to begin rolling out publicly this year.
  • Waymo's first fully autonomous commercial service began in 2020 in Chandler, Arizona, on a fourth-generation system. The move to the fifth generation involved collecting data broadly across the U.S. and a major AI-centric software jump, which enabled harder deployments such as San Francisco.
  • Waymo's sixth-generation hardware keeps camera-radar-lidar modalities but is simplified and significantly lower cost, with sensor-plus-compute hardware claimed to be a fraction of prior cost and comparable to a high-end driver-assistance system.
  • Waymo states its driver software largely generalizes across vehicle platforms and sensor configurations, with sixth-generation hardware planned to be deployed on Ojai and on other vehicles such as the Hyundai Ioniq later this year.
  • Waymo says automotive radar costs have dropped drastically over time due to mass automotive supply chains, while imaging radar remains costlier but is trending downward.
  • Waymo says lidar costs are following a predictable cost-down trend and that it is simplifying and optimizing lidar designs using learnings from prior generations.

Commercial Scale And Deployment Cadence

  • Waymo provides over 500,000 fully autonomous rides each week.
  • Waymo operates about 3,000 cars.
  • Waymo drives over 4 million fully autonomous miles per week.
  • Waymo has fully autonomous operations in 11 U.S. cities and serves public riders in 10 of them; Nashville, newly started, is not yet open to public riders.
  • Waymo opened rider access in four new cities in a single day, whereas it earlier took about eight years to reach multi-city external rider operations.
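
The scale figures above imply some rough per-vehicle ratios. The sketch below treats the reported values as point estimates, although the source states them as "over" or "about", so the results are approximate. The variable names are illustrative, and the source does not say whether the miles figure includes non-passenger miles.

```python
# Reported figures, treated as point estimates (the source says "over" / "about").
rides_per_week = 500_000
fleet_size = 3_000
miles_per_week = 4_000_000

rides_per_car_per_week = rides_per_week / fleet_size   # about 166.7
rides_per_car_per_day = rides_per_car_per_week / 7     # about 23.8
miles_per_car_per_week = miles_per_week / fleet_size   # about 1,333
miles_per_ride = miles_per_week / rides_per_week       # 8.0

print(f"rides per car per day:  {rides_per_car_per_day:.1f}")
print(f"miles per car per week: {miles_per_car_per_week:.0f}")
print(f"miles per ride:         {miles_per_ride:.1f}")
```

These ratios depend on the metric definitions that the Unknowns section flags as unverified, such as whether every vehicle is in service every day.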

Runtime Architecture Edge Inference And Multimodal Sensor Fusion

  • The Waymo Driver uses a 360-degree multi-sensor suite combining cameras, lidar, and radar.
  • Waymo describes lidar as providing high-resolution 3D sampling, while radar provides lower-resolution sensing that degrades more gracefully in fog, snow, and heavy rain, making the two complementary rather than interchangeable.
  • Waymo runs real-time driving inference on the vehicle, using cloud connectivity only for non-real-time auxiliary tasks.
  • Waymo still uses cameras, radar, and lidar, and says it has significantly optimized and simplified all three across generations.
  • Waymo does not switch between sensors by environment; it encodes each modality and fuses them jointly to estimate the world.
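
The difference between fusing modalities and switching between them can be illustrated with a classical stand-in. The sketch below uses inverse-variance weighting of independent range estimates, which is not the learned encode-and-fuse approach the source describes. The sensor values and variances are hypothetical.

```python
def fuse(estimates):
    """Fuse (value, variance) pairs by inverse-variance weighting.

    Returns the fused value and its variance. Every modality contributes;
    noisier ones simply receive smaller weights.
    """
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    return value, 1.0 / total


# Hypothetical range-to-object estimates in meters: (value, variance).
clear = {"camera": (20.4, 1.0), "lidar": (20.0, 0.04), "radar": (19.5, 1.0)}
fog = {"camera": (23.0, 25.0), "lidar": (21.0, 4.0), "radar": (19.5, 1.0)}

clear_value, clear_var = fuse(list(clear.values()))
fog_value, fog_var = fuse(list(fog.values()))
```

In the hypothetical fog case the radar estimate dominates, yet the degraded camera and lidar readings still tighten the result: the fused variance is lower than that of any single sensor. This is the sense in which complementary sensors are combined rather than swapped.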

Deployment Model ODD Specialization And Environmental Constraints

  • Waymo plans to start operating in London and Tokyo this year and does not expect to deploy the San Francisco driver there without additional data collection, specialization, and validation.
  • Waymo considers pickup and drop-off behavior a nuanced autonomy challenge involving rider intent, curb context, and minimizing disruption such as blocking driveways or double-parking.
  • Waymo frames deployment readiness in terms of operating domains (such as freeways, weather, fog, and density) rather than city boundaries.
  • Waymo identifies cold winter weather as a weak point for generalization because it is not only an AI problem: it also drives hardware needs such as sensor cleaning and heating elements, and it requires slippery-surface control.
  • Waymo believes its core driving technology has moved from deep research into a phase of accelerated global scaling, with remaining work focused on specialization and validation rather than fundamental capability gaps.
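
Framing readiness by operating domain rather than by city can be sketched as a set-coverage check. Everything in the example is hypothetical: the domain labels, the validated set, and the two deployments are illustrative and make no claim about which domains Waymo has validated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    name: str
    required_domains: frozenset


def missing_domains(validated, deployment):
    """Return the required domains not yet validated, in sorted order."""
    return sorted(deployment.required_domains - validated)


# Hypothetical set of validated operating domains.
validated = frozenset({"dense_urban", "freeway", "fog", "heavy_rain"})

warm_city = Deployment("hypothetical warm city",
                       frozenset({"dense_urban", "freeway"}))
winter_city = Deployment("hypothetical winter city",
                         frozenset({"dense_urban", "snow", "ice"}))

warm_gaps = missing_domains(validated, warm_city)      # []
winter_gaps = missing_domains(validated, winter_city)  # ['ice', 'snow']
```

Under this framing a new city is ready once its required domains are covered, and the remaining gaps (here snow and ice) identify where further data collection, specialization, and validation would be needed.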

Watchlist

  • Waymo plans to start operating in London and Tokyo this year and does not expect to deploy the San Francisco driver there without additional data collection, specialization, and validation.
  • Waymo's sixth-generation 'Ojai' platform is a custom passenger-oriented vehicle planned to begin rolling out publicly this year.

Unknowns

  • What are the definitions behind the reported scale metrics (rides, miles, fleet): service hours, geofence sizes, paid vs free/promotional rides, and ride completion rates?
  • What safety performance evidence supports the claims of readiness to scale (e.g., incident rates, intervention rates, near-miss metrics, and how these vary by ODD)?
  • What are the unit economics at current scale (cost per mile/ride, depot labor per vehicle-day, and hardware depreciation assumptions), and how sensitive are they to cleaning and charging workflows?
  • What is the actual timeline and regulatory status for London and Tokyo operations, and what scope (ODD, service area, hours) will those operations initially support?
  • How large is the remaining gap for winter operations, and what specific hardware and control-system changes are required to close it (sensor cleaning, heating, traction control), including their cost and maintenance impact?

Investor overlay

Read-throughs

  • If sensor plus compute cost is materially reduced while maintaining multimodality, fleet economics could improve, enabling faster scaling and more markets. Read through to suppliers of lidar, radar, compute, and fleet operations tooling, contingent on disclosed cost and volume assumptions.
  • A repeatable operational playbook and higher rollout cadence could increase autonomous ride volumes and geographic footprint. Read through to partners in fleet services such as depots, cleaning, charging, mapping, and local operations, contingent on verified definitions for rides and service scope.
  • Closed-loop training with simulation and scalable evaluation suggests progress on long-tail safety and faster iteration. Read through to ML infrastructure, simulation, and data tooling needs, contingent on demonstrated safety and validation metrics by operating domain.

What would confirm

  • Clear definitions and audited reporting for rides, miles, service hours, geofence size, paid versus promotional mix, and completion rates, showing sustained growth across multiple cities.
  • Disclosed or independently validated unit economics at current scale, including cost per mile or ride, depot labor per vehicle-day, and hardware depreciation, plus sensitivity to cleaning and charging workflows.
  • Concrete London and Tokyo timeline and regulatory status, with initial operating domain scope and a plan for required data collection, specialization, and validation that matches actual launch execution.

What would kill

  • Evidence that cost reductions require unrealistic volumes or specific suppliers, or that multimodal hardware simplification degrades reliability, particularly in adverse weather and complex pickup and drop-off scenarios.
  • Safety performance metrics such as incident, intervention, and near-miss rates fail to improve with scale, or are materially worse when transferring to new operating domains, delaying expansion.
  • London and Tokyo deployments are delayed or restricted due to insufficient validation, or winter and cleaning requirements impose high maintenance and downtime that prevent scalable service hours.
