What Problem Was General Mills Actually Solving?
General Mills ships food (Cheerios, Häagen-Dazs, Betty Crocker, Old El Paso) from manufacturing plants to distribution centers across a network that never fully stops moving. For decades, decisions about how to build and route those shipments depended on planners working through order queues in batch cycles, often reviewing a snapshot of the network that was already hours old. Batch planning creates a structural lag between the real state of the network and the decisions being made about it. A carrier that suddenly has cheaper capacity, a warehouse constraint discovered at 3 a.m., a weather event rerouting freight in Kansas: none of that enters the planning cycle until the next batch run. General Mills estimated the full order-to-optimized-load cycle took up to 18 hours. That is up to 18 hours during which a better option can exist without being acted on, repeated across thousands of shipments.
What Is ELF and What Does It Actually Do?
ELF stands for End to End Logistics Flow. General Mills built the system in collaboration with Palantir, initially deploying it in the U.S. human foods business. By the time executives disclosed expanded performance figures at the February 18, 2025 CAGNY investor conference, ELF was assessing more than 5,000 daily shipments across the broader network.
The system functions as a continuous AI-driven execution layer sitting above transactional data. Rather than waiting for a planner to pull a report, ELF ingests real-time inputs (cost structures, carrier capacity, delivery windows, weather, and greenhouse gas targets) and continuously re-evaluates open orders. When it identifies an opportunity to consolidate loads, shift carriers, or resequence shipments, it surfaces a recommendation. Approximately 70% of those recommendations are accepted automatically, without human review. The remaining 30% are routed to a planner queue for evaluation and possible override. What previously consumed a full working day now takes less than 30 minutes.
ELF is not fully autonomous. It is a high-velocity recommendation engine that captures most of the optimization value while preserving human judgment for edge cases — a deliberate architectural choice, not a limitation.
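The split between auto-accepted and human-reviewed recommendations can be sketched as a simple threshold on a confidence score. This is a minimal illustration, not ELF's actual logic: the `Recommendation` fields, the `route` function, and the 0.85 threshold are all hypothetical, since General Mills has not published how the cutoff is set.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the real threshold and scoring method are not public.
AUTO_ACCEPT_THRESHOLD = 0.85


@dataclass(frozen=True)
class Recommendation:
    rec_id: str
    action: str             # e.g. "consolidate", "switch_carrier", "resequence"
    est_savings_usd: float  # estimated savings if the recommendation is applied
    confidence: float       # model score in [0, 1]


def route(recs, threshold=AUTO_ACCEPT_THRESHOLD):
    """Split recommendations into (auto_accepted, needs_review) lists."""
    auto, review = [], []
    for rec in recs:
        # Scores at or above the threshold execute without a planner.
        if rec.confidence >= threshold:
            auto.append(rec)
        else:
            review.append(rec)
    return auto, review


def auto_accept_rate(auto, review):
    """Fraction of all recommendations that were auto-accepted."""
    total = len(auto) + len(review)
    return len(auto) / total if total else 0.0
```

Under these assumptions, raising the threshold sends more work to planners and lowering it sends less, so a figure like the reported 70% would be an outcome of where the cutoff sits relative to the score distribution.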
How Does the AI Optimization Engine Actually Work?
The technical foundation is a Connected Data Model (CDM) built jointly with Palantir on the Palantir Ontology — a semantic data layer integrating approximately 200 master and operational data tables into a single source of truth. Without a live unified view of inventory positions, carrier contracts, lane costs, and order commitments, any optimization model just runs against stale data.
On top of that foundation, Palantir AIP provides the orchestration layer: the logic defining what constitutes a recommendation, the constraints governing acceptable solutions, and the confidence scoring that determines whether a recommendation auto-accepts or routes to human review. The optimization uses constraint-based and cost-minimization approaches well-established in operations research — vehicle routing problem (VRP) solvers applied at scale — but the differentiating factor is continuous execution rather than batch runs.
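To make the idea of constraint-based load consolidation concrete, here is a deliberately small sketch. It uses a first-fit-decreasing heuristic, which is a stand-in chosen for brevity and not a VRP solver or the method ELF uses. The order tuple layout, the lane names in the usage note, and the 26-pallet truck capacity are illustrative assumptions.

```python
from collections import defaultdict

# Assumed truck capacity in pallets, for illustration only.
TRUCK_CAPACITY_PALLETS = 26


def consolidate(orders, capacity=TRUCK_CAPACITY_PALLETS):
    """Group open orders into truckloads per lane.

    orders: iterable of (order_id, lane, pallets).
    Returns {lane: [[order_id, ...], ...]}, one inner list per truck.
    Uses first-fit decreasing: largest orders are placed first, each into
    the first truck on that lane with enough remaining capacity.
    """
    by_lane = defaultdict(list)
    for order_id, lane, pallets in orders:
        if pallets > capacity:
            raise ValueError(f"order {order_id} exceeds truck capacity")
        by_lane[lane].append((order_id, pallets))

    plan = {}
    for lane, lane_orders in by_lane.items():
        loads = []  # each entry: [remaining_capacity, [order_ids]]
        for order_id, pallets in sorted(lane_orders, key=lambda o: -o[1]):
            for load in loads:
                if load[0] >= pallets:
                    load[0] -= pallets
                    load[1].append(order_id)
                    break
            else:
                # No existing truck can take this order; open a new one.
                loads.append([capacity - pallets, [order_id]])
        plan[lane] = [ids for _, ids in loads]
    return plan
```

For example, three orders of 14, 12, and 10 pallets on a hypothetical lane `"MSP-CHI"` would be packed into two trucks instead of three. Re-running a function like this whenever an order arrives or changes, rather than once per batch, is the continuous-execution idea in miniature; a production system would add cost terms, delivery windows, and carrier constraints that this sketch omits.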
Google Cloud provides the underlying infrastructure: BigQuery as the enterprise data warehouse, Vertex AI supporting the ML models that score recommendations and learn from acceptance signals, Looker for analytics visibility, and Apigee managing API connectivity. SAP S/4HANA handles the transactional ERP core, feeding order management and inventory data into the CDM.
What Do the Performance Numbers Actually Mean?
The headline figures — more than $20 million in transportation savings since the 2024 fiscal year, 70% auto-acceptance rate, 5,000+ daily shipments assessed — come from General Mills executives at CAGNY 2025. These are self-reported figures disclosed at an investor conference, not independently audited results. That distinction matters.
With that caveat stated, the numbers are internally consistent. The 70% auto-acceptance rate reflects meaningful model calibration: too low, and the system generates noise; too high, and you should question whether the human review layer adds genuine value. Seventy percent suggests the system produces useful recommendations while still routing genuinely ambiguous cases to humans. General Mills CFO Kofi Bruce attributed the savings directly to reduced transportation costs and improved customer service levels. The company has also projected more than $50 million in manufacturing waste reduction from real-time production data, but that is a forward projection, not a realized result, and should be read as such.
Traditional Batch Planning vs. AI-Driven Continuous Optimization
| Dimension | Traditional Batch Planning | AI-Driven Continuous Optimization (ELF) |
|---|---|---|
| Planning frequency | Once per cycle (hours apart) | Continuous (100+ re-evaluations/day) |
| Data freshness | Hours-old snapshot | Near real-time |
| Decision latency | Up to 18 hours | Under 30 minutes |
| Recommendation volume | One optimized plan per cycle | Hundreds of discrete recommendations/day |
| Human involvement | Primary decision-maker | Reviewer for ~30% of flagged cases |
| Inputs considered | Cost, capacity | Cost, capacity, weather, GHG, service levels |
| Scalability | Linear with headcount | Scales with compute, not planners |
What Are the Honest Limits?
ELF optimizes execution within an existing network — it does not redesign it. Decisions about where to locate distribution centers, which contracts to negotiate, or how to restructure physical goods flow remain strategic and human-led. ELF makes the best of the configured network; it does not tell you when the configuration itself is the bottleneck.
The system also depends entirely on data foundation quality. The 200-table CDM took years to build. Enterprises lacking clean, unified operational data cannot shortcut that phase. The optimization layer is only as good as what it consumes. General Mills has also not publicly disclosed the override rate within the 30% human-reviewed cohort — how often planners actually reject ELF’s suggestions — which would reveal where the model still falls short.
Finally, ELF addresses transportation cost optimization. It does not solve demand forecasting accuracy, supplier reliability, or the structural causes of disruption. It is a powerful execution tool, not a supply chain strategy.
What Does General Mills’ Full AI Suite Actually Look Like?
ELF sits within a technology ecosystem General Mills has assembled since 2019. Google Cloud anchors the infrastructure, with BigQuery as the enterprise data warehouse. General Mills completed a cloud migration planned for 36 months in just 21, moving roughly 85% of HQ-hosted applications. Palantir contributes the semantic layer (the Ontology / CDM) and the AIP application framework where ELF’s recommendation logic runs. SAP S/4HANA handles transactional ERP data. Vertex AI and Looker provide ML and analytics capabilities, with Apigee managing API connectivity between systems.
The practical implication: General Mills did not buy a single supply chain AI product. They assembled a stack — cloud infrastructure, semantic data layer, AI orchestration platform, ERP backbone, and ML tooling — and ELF is the application sitting on top of all of it.
Can You Replicate This With Open-Source Tools?
Yes, meaningfully, but not trivially. The optimization algorithms at ELF’s core are not proprietary. Google’s OR-Tools (github.com/google/or-tools) handles VRP solvers, scheduling, and constraint programming at production scale — exactly this class of problem. Timefold Solver (github.com/TimefoldAI/timefold-solver), the successor to OptaPlanner, provides a high-level optimization framework with built-in routing constraints that maps well onto logistics recommendation systems. VROOM (github.com/VROOM-Project/vroom) specializes in vehicle routing with a REST API that embeds cleanly in operational systems. Pyomo and NVIDIA cuOpt extend the options for mathematical programming and GPU-accelerated routing at large scale.
A realistic implementation follows three phases. The first is the data audit: before touching optimization code, inventory every data source that touches a shipment, including order management, carrier contracts, lane rates, warehouse capacity, and weather feeds. Map the quality gaps, ownership questions, and update frequency of each source. This phase typically reveals that the hard part is not the algorithm but building a trustworthy, live data foundation. Budget more time here than feels reasonable.
The second phase is shadow mode: run the optimization engine in parallel with existing planning for a defined period, generating recommendations planners can see but that do not yet execute automatically. Track how often shadow recommendations match planner decisions; that percentage is your realistic auto-acceptance ceiling before automation begins.
The third phase is designing the human review queue: the interface through which planners evaluate the 20–40% of recommendations that should not auto-execute. Each recommendation should display the confidence score, the key trade-offs weighed, and a one-click override with a reason code. Those reason codes are training data; they tell you exactly where the model needs improvement.
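The two measurements described above, the shadow-mode match rate and the tally of override reason codes, can be sketched in a few lines. The function names, the pair and tuple layouts, and the reason-code strings in the usage note are hypothetical; a real deployment would read these from whatever logging the planning tools already produce.

```python
from collections import Counter


def shadow_match_rate(pairs):
    """Fraction of cases where the engine's choice equals the planner's.

    pairs: list of (engine_choice, planner_choice), one per shipment decision
    recorded during shadow mode.
    """
    if not pairs:
        return 0.0
    matches = sum(1 for engine, planner in pairs if engine == planner)
    return matches / len(pairs)


def top_override_reasons(overrides, n=3):
    """Most frequent reason codes among planner overrides.

    overrides: list of (rec_id, reason_code).
    Returns a list of (reason_code, count), most common first.
    """
    return Counter(reason for _, reason in overrides).most_common(n)
```

As a usage example, if planners agree with the engine on three of four shadow decisions, `shadow_match_rate` returns 0.75, which under the reasoning above would be the ceiling to expect for auto-acceptance. Feeding override logs with codes such as `"dock_closed"` or `"carrier_unreliable"` into `top_override_reasons` shows which missing constraint to model next.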
What Can Enterprises in Other Industries Take From This?
The most transferable insight from ELF is architectural, not algorithmic. General Mills did not solve a harder optimization problem than anyone else in logistics. They solved the organizational and data engineering problem of making continuous optimization operationally real. The algorithms existed for decades. The breakthrough was a unified, live data foundation combined with an auto-accept threshold that lets the model operate autonomously on high-confidence decisions while routing uncertainty to humans.
That pattern applies to any industry with high-volume, recurring operational decisions — insurance claims routing, hospital staffing, energy dispatch, manufacturing scheduling. The question is not whether AI optimization adds value; it almost certainly does. The question is whether the data foundation exists to make AI-driven decisions trustworthy enough for the majority of cases to execute without human review. General Mills built that foundation over several years. Enterprises that treat the algorithm as the hard part and data infrastructure as an afterthought will discover the actual ordering of difficulty the hard way.