Allianz AI: How Europe's Largest Insurer Built Fraud Detection, RAG Underwriting, and a 7-Agent Claims Pipeline

What Problem Was Allianz Actually Solving?

Allianz is the largest insurance group in Europe by revenue, operating across 70 countries with roughly 125,000 employees and $160 billion in annual revenue. At that scale, insurance fraud is not an edge case. It is a structural cost. Industry estimates put fraudulent claims at 5 to 10 percent of total insurance payouts globally. For a group handling tens of millions of claims per year, the dollar exposure is material.

The fraud problem has a particular shape in insurance that makes it hard to address with rules alone. Fraudulent claimants do not announce themselves. They submit claims that look legitimate on paper, often with supporting documentation, and the patterns that distinguish fraud from genuine hardship are subtle: slight inconsistencies in reported timelines, voice stress in telephone interviews, third-party data that contradicts stated circumstances. Traditional rule-based systems catch obvious fraud. The sophisticated kind requires probabilistic reasoning across multiple data sources simultaneously.

The underwriting problem is different in character but equally consequential. Allianz’s specialty and commercial lines underwriters work from a corpus of guidance documents, some running to 600 pages or more, that encode the company’s risk appetite, pricing logic, and regulatory requirements. Retrieving the right section of the right document at the right moment during a live underwriting decision is slow, error-prone, and increasingly impractical as the knowledge base grows. New hires face months of onboarding before they can navigate the full guidance corpus fluently.

The claims automation problem is the third thread. Routine, low-value claims (food spoilage under a set threshold, for example) consume the same intake, triage, and administrative overhead as complex claims. That overhead is pure cost on interactions where the settlement outcome is rarely in doubt.

Allianz built three separate systems, one for each of these problems: Incognito for fraud, BRIAN for underwriting guidance, and Project Nemo for claims automation.

Incognito: ML Fraud Detection Across the Claims Lifecycle

Incognito is Allianz’s fraud detection platform, developed in partnership with Accenture and integrating third-party data signals from two specialist vendors.

The core architecture is supervised machine learning trained on historical claims data. The model ingests claim attributes, including coverage type, reported circumstances, claim amount, and claimant history, alongside real-time third-party feeds from Carpe Data (social media and public records signals that flag inconsistencies between claimed circumstances and observable behaviour) and Clearspeed (voice analytics that assess the vocal characteristics of telephone interview responses for markers associated with deception).

The combination matters because each signal type catches a different failure mode. The ML model catches statistical anomalies in claim patterns. Carpe Data catches cases where a claimant’s stated situation contradicts their public digital footprint. Clearspeed catches cases where the claim narrative is internally consistent but the person delivering it is showing physiological stress markers. Individually, each signal produces false positives. Layered together, they narrow the field to cases that genuinely merit investigation.
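The layering idea can be sketched in a few lines. This is a minimal illustration, not Allianz's implementation: the score names, the thresholds, and the two-of-three agreement rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimSignals:
    """Three independent fraud signals, each on a hypothetical 0-1 scale."""
    model_score: float      # supervised ML anomaly score
    footprint_score: float  # public-data contradiction score (Carpe Data style)
    voice_score: float      # voice-analytics risk score (Clearspeed style)


# Hypothetical thresholds; real values would be tuned on labelled claims.
THRESHOLDS = {"model_score": 0.7, "footprint_score": 0.6, "voice_score": 0.8}


def signals_fired(signals: ClaimSignals) -> list[str]:
    """Return the names of the signals at or above their threshold."""
    return [name for name, limit in THRESHOLDS.items()
            if getattr(signals, name) >= limit]


def needs_investigation(signals: ClaimSignals, min_agreeing: int = 2) -> bool:
    """Refer a claim only when at least `min_agreeing` signals fire together."""
    return len(signals_fired(signals)) >= min_agreeing


lone_flag = ClaimSignals(model_score=0.9, footprint_score=0.1, voice_score=0.2)
layered_flag = ClaimSignals(model_score=0.9, footprint_score=0.7, voice_score=0.2)

print(needs_investigation(lone_flag))     # False: one signal alone is not enough
print(needs_investigation(layered_flag))  # True: two signals agree
```

Requiring agreement trades some recall for fewer false referrals. A production system would more likely learn the combination from data than hard-code a voting rule.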

Allianz UK reported £77.4 million in detected fraud for 2023 and £93 million in H1 2025 across the combined tools. The H1 2025 figure covers all fraud detection tools, not Incognito in isolation, and the methodology behind either number is not independently audited. With that caveat, the directional picture is consistent: material fraud volume is being surfaced earlier in the claims lifecycle than it was under the previous rules-based approach.

BRIAN: RAG-Based Underwriting Guidance at Scale

BRIAN, short for Better Retrieval and Insights for Allianz Now, is a retrieval-augmented generation system built on AllianzGPT, the company’s internal Azure OpenAI GPT-4o deployment.

The system indexes 70 guidance documents, including policy manuals, risk frameworks, regulatory guidance, and pricing references, some exceeding 600 pages. When an underwriter poses a question in natural language, BRIAN retrieves the most relevant sections using vector similarity search, passes them to the LLM as context, and generates a grounded answer with citations. The citations are the critical design choice: rather than generating an answer from parametric memory (what the model learned during training), BRIAN anchors every response to specific document passages that the underwriter can verify directly.
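The retrieve-then-cite flow can be sketched with the standard library alone. Everything here is an assumption for illustration: the passages are invented, and a bag-of-words count stands in for a real embedding model, so this shows the shape of the pattern rather than BRIAN's actual retrieval stack. The final LLM call is omitted; the sketch stops at the grounded prompt.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    doc: str      # hypothetical document title
    section: str  # hypothetical section reference
    text: str


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question: str, corpus: list[Passage], k: int = 2) -> list[Passage]:
    """Return the k passages most similar to the question."""
    q = embed(question)
    return sorted(corpus, key=lambda p: cosine(q, embed(p.text)), reverse=True)[:k]


def build_prompt(question: str, passages: list[Passage]) -> str:
    """Number each source so the model's answer can cite it as [n]."""
    sources = "\n".join(f"[{i}] {p.doc}, {p.section}: {p.text}"
                        for i, p in enumerate(passages, 1))
    return ("Answer using only the numbered sources and cite them as [n].\n"
            f"Sources:\n{sources}\n\nQuestion: {question}")


# Invented guidance passages, for illustration only.
corpus = [
    Passage("Property Guide", "4.2",
            "Warehouses storing flammable liquids require a sprinkler survey before quotation."),
    Passage("Property Guide", "7.1",
            "Flood exposure is assessed using the site elevation and distance to water."),
    Passage("Liability Guide", "2.3",
            "Contractors working at height need a minimum indemnity limit."),
]

question = "Is a sprinkler survey required for a warehouse storing flammable liquids?"
top = retrieve(question, corpus)
prompt = build_prompt(question, top)
print(prompt)
```

Because each source carries its document and section, the underwriter can open the cited passage and check it against the generated answer.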

Since January 2025, BRIAN has served approximately 260 commercial lines underwriters, answering over 13,000 questions and saving an estimated 65,000 minutes of document search time. Allianz converts that to 135 working days. The figure is self-reported and the calculation methodology is not disclosed, but the order of magnitude is consistent with the use pattern: 13,000 queries across 260 users is 50 queries per user, which over roughly five months works out to about one query per user every two working days, and 65,000 minutes across 13,000 queries is five minutes saved per query. That is a plausible unit economics estimate for a question that previously required manual document navigation.
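The unit arithmetic is easy to check. The inputs are Allianz's reported figures; the 105-working-day period (about five months at 21 working days each) and the 8-hour working day are assumptions, and the per-day rate shifts with whichever period is assumed.

```python
questions = 13_000       # reported
users = 260              # reported
minutes_saved = 65_000   # reported

working_days_in_period = 105      # assumption: ~5 months x ~21 working days
minutes_per_working_day = 8 * 60  # assumption: 8-hour working day

queries_per_user = questions / users                                  # 50.0
queries_per_user_per_day = queries_per_user / working_days_in_period  # about 0.48
minutes_per_query = minutes_saved / questions                         # 5.0
working_days_saved = minutes_saved / minutes_per_working_day          # about 135.4

print(f"{queries_per_user:.0f} queries per user")
print(f"{queries_per_user_per_day:.2f} queries per user per working day")
print(f"{minutes_per_query:.0f} minutes saved per query")
print(f"{working_days_saved:.0f} working days saved")
```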

The system does not replace underwriter judgment. It retrieves and synthesises guidance faster than a human can search manually. The decision, including whether the retrieved guidance is applicable, remains with the underwriter.

Project Nemo: 7-Agent Claims Pipeline in Under 100 Days

Project Nemo is the most structurally novel of the three systems. It is a multi-agent AI pipeline that handles end-to-end processing of food spoilage claims under AUD$500 in Australia, from intake through settlement, without human intervention.

The pipeline uses seven specialized agents: a Planner that routes each claim and manages workflow state, a Cyber agent that validates incoming data and assesses digital integrity, a Coverage agent that confirms policy applicability, a Weather agent that cross-references reported spoilage events against meteorological data, a Fraud agent that applies the Incognito-style scoring logic, a Payout agent that calculates and authorises the settlement amount, and an Audit agent that logs every decision step for compliance review.

Each agent operates on a defined information subset. The Weather agent, for instance, does not have access to the claimant’s policy history. The Payout agent does not process raw claim inputs directly; it receives structured outputs from upstream agents. This design limits the blast radius of any single agent error and creates a clear audit trail.
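A minimal sketch of that information scoping, using three of the seven agents. The field names, the claim record, and the placeholder agent logic are all hypothetical; the point is only that each agent receives a filtered view and that every step is logged.

```python
from typing import Callable

# Hypothetical claim record; field names are illustrative only.
claim = {
    "claim_id": "C-001",
    "postcode": "4000",
    "event_date": "2025-07-01",
    "amount_aud": 180.0,
    "policy_history": ["2023 storm claim"],
}

audit_log: list[dict] = []


def run_agent(name: str, allowed_fields: list[str],
              logic: Callable[[dict], dict], source: dict) -> dict:
    """Run one agent on a filtered view of `source` and log the step."""
    view = {field: source[field] for field in allowed_fields}
    output = logic(view)
    audit_log.append({"agent": name, "saw": sorted(view), "output": output})
    return output


def weather_agent(view: dict) -> dict:
    # Stand-in for a meteorological lookup keyed on place and date.
    return {"event_confirmed": view["postcode"] == "4000"}


def coverage_agent(view: dict) -> dict:
    # Stand-in for a policy check against the AUD 500 scope limit.
    return {"covered": view["amount_aud"] < 500}


def payout_agent(upstream: dict) -> dict:
    # Sees only structured upstream outputs, never the raw claim.
    approved = upstream["event_confirmed"] and upstream["covered"]
    return {"pay_aud": upstream["amount_aud"] if approved else 0.0}


weather = run_agent("weather", ["postcode", "event_date"], weather_agent, claim)
coverage = run_agent("coverage", ["amount_aud"], coverage_agent, claim)

upstream = {**weather, **coverage, "amount_aud": claim["amount_aud"]}
payout = run_agent("payout", ["event_confirmed", "covered", "amount_aud"],
                   payout_agent, upstream)

for entry in audit_log:
    print(entry)
```

Because the filtered view is built by the runner rather than by the agent, an agent cannot read a field it was not granted, and the log records exactly which fields each decision was based on.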

Allianz reports an 80 percent reduction in claims processing and settlement time, and says Project Nemo went live in July 2025 after a build of under 100 days. Both figures require context. The 80 percent reduction is measured relative to the previous manual intake and triage process. For a low-value claim with a clear settlement outcome, most of the elapsed time in the old process was queue time: the claim sitting in a triage inbox waiting for a human to open it. Automating the queue is not the same as accelerating the actual judgment work. The speed gain is real, but it is primarily queue elimination rather than a five-fold improvement in claims adjudication capability.

The 100-day deployment is notable as a delivery benchmark. For a system touching financial settlements and regulatory obligations, that timeline suggests either a deliberately constrained scope (one claim type, one geography, one value band) or a mature internal platform that reduced integration work significantly. Allianz has not published which.

Traditional Claims Process vs. Allianz AI Pipeline

Dimension	Traditional Manual Workflow	Allianz AI System
Fraud detection	Rules-based flags plus investigator judgment	Supervised ML + voice analytics + social signals (Incognito)
Underwriting guidance	Manual document search, 600-page PDFs	RAG on indexed corpus with citations (BRIAN)
Low-value claims	Manual intake, triage, adjudication queue	7-agent automated pipeline end-to-end (Project Nemo)
Internal AI access	Limited to trained specialists	60,000+ employees on AllianzGPT, 95% weekly engagement
Knowledge retrieval	Search index or institutional memory	Vector similarity over document corpus
Audit trail	Human case notes	Structured per-step logs from Audit agent
Processing time (simple claims)	Days to weeks	Hours to minutes (queue eliminated)

What Are the Honest Limits?

Every headline figure Allianz has published for these systems comes from investor presentations, company press releases, or executive interviews. The £93m fraud detection figure, the 65,000 minutes saved by BRIAN, and the 80% Project Nemo reduction are all self-reported with no independent methodology audit. That is standard practice for enterprise AI deployments that are less than two years old. It is also worth holding in mind when evaluating the claims.

The Incognito false positive rate is never disclosed. In fraud detection, the false positive rate is arguably more consequential than the detection rate: a genuine claim wrongly flagged for investigation delays settlement, damages the customer relationship, and creates regulatory exposure. A system that detects £93m in fraud while generating a high false positive rate may not be net positive for customer outcomes even if it is net positive for fraud losses. Allianz has not published this figure.
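A base-rate calculation shows why the undisclosed number matters. Every input here is hypothetical: the claim volume, the 80 percent detection rate, and the 5 percent false positive rate are invented, and the 5 percent fraud share is simply the low end of the industry range cited earlier.

```python
total_claims = 100_000       # hypothetical volume
fraud_share = 0.05           # low end of the 5 to 10 percent industry estimate
detection_rate = 0.80        # hypothetical share of fraud that gets flagged
false_positive_rate = 0.05   # hypothetical share of genuine claims flagged

fraudulent = round(total_claims * fraud_share)             # 5,000
genuine = total_claims - fraudulent                        # 95,000

true_flags = round(fraudulent * detection_rate)            # 4,000
false_flags = round(genuine * false_positive_rate)         # 4,750
precision = true_flags / (true_flags + false_flags)        # about 0.457

print(f"fraudulent claims flagged: {true_flags}")
print(f"genuine claims flagged:    {false_flags}")
print(f"share of flags that are real fraud: {precision:.1%}")
```

Under these assumptions, more genuine claimants than fraudsters end up in the investigation queue, even though the false positive rate looks small in isolation. That is the effect of a low fraud base rate.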

BRIAN’s error rate is similarly undisclosed. RAG systems produce confident-sounding hallucinations when retrieved passages are ambiguous or the question falls outside the indexed corpus. For a 260-person underwriting team making binding financial commitments, an incorrect BRIAN response that goes unchecked is a liability. The citation mechanism mitigates but does not eliminate this risk. The safeguard depends on underwriters actually reading and verifying the cited passages rather than accepting the generated summary.

Project Nemo’s scope is deliberately narrow. Food spoilage claims under AUD$500 in one geography represent the simplest possible claim type: low value, binary coverage, verifiable trigger event. The 7-agent architecture would require significant redesign for claims involving bodily injury, disputed liability, or regulatory complexity. Allianz has not announced plans to expand the model beyond this initial scope.

The Anthropic Claude partnership announced in January 2026 has not resulted in publicly described production deployments. AllianzGPT’s primary inference layer is Azure OpenAI GPT-4o. Claude’s role in the production stack remains unclear from public disclosures.

What Does Allianz’s Full AI Suite Look Like?

Azure is the primary cloud platform for AI inference. GPT-4o via Azure OpenAI provides the LLM layer for BRIAN and AllianzGPT. AllianzGPT, the internal platform, has over 60,000 active users and a 95% weekly engagement rate, and it has processed over 10 million prompts since launch. Employees have created over 30,000 custom agents on the platform, which suggests a model closer to an internal ChatGPT with plugins than a single monolithic assistant.

Clearspeed handles voice analytics for Incognito’s telephone interview layer. Carpe Data provides the external data signals (social media, public records). Accenture is the primary consulting and integration partner across all three systems. Microsoft Copilot is deployed to approximately 68,000 employees for productivity use cases separate from AllianzGPT.

The Anthropic partnership (confirmed January 2026) is expanding Claude’s presence in the stack, though no separately announced production deployments for specific use cases have been described. Neither AWS nor Google Cloud appears in public disclosures about Allianz’s AI infrastructure, making Azure the dominant platform for this layer of the stack.

Can You Replicate This With Open-Source Tools?

The architecture of each system is standard, and all three are reproducible with open-source components. The differentiating factor in each case is data, not technology.

For fraud detection, the open-source starting point is XGBoost (GitHub: dmlc/xgboost, Apache 2.0) or LightGBM for the supervised classification layer. A worked insurance fraud detection example with feature engineering and model evaluation is available at GitHub: lokeshch185/AI-driven-insurance-fraud-detection. The voice analytics layer (Clearspeed equivalent) is harder to replicate without proprietary training data. SpeechBrain (GitHub: speechbrain/speechbrain) provides an open-source speech processing framework with emotion and paralinguistic feature extraction, though it is not a drop-in replacement. The external data signal layer (Carpe Data equivalent) requires either commercial data feeds or building your own ingestion pipeline from public sources.

For RAG-based document retrieval (BRIAN equivalent), the standard open-source stack is LlamaIndex (GitHub: run-llama/llama_index, MIT) for document ingestion and retrieval orchestration, ChromaDB (GitHub: chroma-core/chroma, Apache 2.0) or Qdrant (GitHub: qdrant/qdrant, Apache 2.0) for vector storage, and any hosted LLM for generation. A worked insurance document QA example using LlamaIndex and LangChain is at GitHub: SandeepGitGuy/Insurance_Documents_QA_Chatbot_RAG_LlamaIndex_LangChain. The critical variable is document preparation: chunking strategy, metadata tagging, and handling of tables and structured policy language all significantly affect retrieval quality. Budget more time for document preprocessing than for model selection.
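To make the preprocessing point concrete, here is a minimal sketch of overlapping word-window chunking with section metadata attached to each chunk. The `Chunk` fields, the window sizes, and the sample identifiers are assumptions for illustration. Frameworks such as LlamaIndex ship their own splitters, and real policy documents would also need table and heading handling that this sketch ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str    # hypothetical document identifier
    section: str   # section reference kept for citations
    position: int  # order of the chunk within its section
    text: str


def chunk_section(doc_id: str, section: str, text: str,
                  max_words: int = 40, overlap: int = 10) -> list[Chunk]:
    """Split one section into overlapping word windows, tagging each chunk."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks: list[Chunk] = []
    for position, start in enumerate(range(0, len(words), step)):
        window = words[start:start + max_words]
        chunks.append(Chunk(doc_id, section, position, " ".join(window)))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the section
    return chunks


# Synthetic 100-word section so the window boundaries are easy to inspect.
sample = " ".join(f"w{i}" for i in range(100))
chunks = chunk_section("property-guide", "4.2", sample)

for c in chunks:
    tokens = c.text.split()
    print(c.section, c.position, tokens[0], "...", tokens[-1])
```

The overlap keeps a sentence that straddles a boundary retrievable from either side, and the section tag is what later lets a generated answer cite a specific place in the source document.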

For multi-agent claims automation (Project Nemo equivalent), LangGraph (GitHub: langchain-ai/langgraph, MIT) is the natural framework for the Supervisor orchestration pattern with specialized sub-agents. LangGraph surpassed CrewAI in GitHub stars in early 2026 and has more production case studies at comparable scope. The agent boundary design, specifically what each agent can and cannot access, is more important than the framework choice. Agents with overly broad tool access in financial workflows create audit and compliance problems that are expensive to untangle after deployment.

The realistic implementation constraint for all three systems is the same: backend integration depth. Allianz’s fraud system is useful because it has access to claims history, policy data, and payment records in real time. BRIAN is useful because someone spent significant effort chunking, tagging, and maintaining a 70-document corpus. Project Nemo works because it is wired directly into settlement authorization. The open-source patterns are transferable. The integration surface area is not.