Classify GenAI failures into five categories: provider, quality, cost, security, and data failures

You will build a `FailureClassifier` that categorizes GenAI system failures into a structured taxonomy. Define five top-level failure categories as Pydantic models: `ProviderFailure` (HTTP 429 rate limits, 503 outages, timeouts, malformed responses), `QualityFailure` (hallucination detected, faithfulness below threshold, format non-compliance, language drift), `CostFailure` (budget exceeded, unexpected token spike, cache miss storm, batch job cost overrun), `SecurityFailure` (prompt injection detected, PII leakage in response, jailbreak attempt, unauthorized model access), and `DataFailure` (embedding drift, retrieval quality drop, stale index, ingestion pipeline failure). Each model includes `failure_id`, `timestamp`, `severity` (P1-P4), `provider`, `detection_method`, `evidence`, and `impact_scope`. Deploy a LiteLLM proxy in your vCluster and configure callbacks that intercept every LLM response. Build `classify_failure()`, which inspects response headers, status codes, latency, and content to auto-classify failures. Store the events in a PostgreSQL `failure_events` table.

~25 min read · Free to read — no subscription required.

Build a GenAI failure classification system

Introduction

When you operate a GenAI system in production, you quickly discover that an LLM can return HTTP 200 with a well-formed response that contains fabricated citations or leaked PII, or that was grounded in retrieval over drifted embeddings. These are silent failures that traditional REST monitoring will never flag. Without a shared vocabulary that names each failure mode and routes it to the right runbook, on-call engineers waste hours triaging "the model seems weird today" tickets while real incidents like prompt injection or runaway agent loops compound unnoticed.

By the end of this lesson you'll be able to classify any production GenAI failure signal into one of five canonical categories — provider, quality, cost, security, or data — encode each as a typed Pydantic event, and apply the priority order that downstream alerting depends on.

Key Terminology

Provider Failure — Errors originating from the upstream LLM API (rate limits, 5xx, timeouts, model deprecation). Matters here because these are the only failures with explicit HTTP error codes and form the baseline category against which the silent ones are contrasted.
Quality Failure — Semantic degradation of model output (hallucination, format violation, coherence drop, silent regression after a model update). Matters because the request returns 200 — only an evaluation pipeline can surface it.
Cost Failure — Token consumption or billing that breaches a configured budget (recursive agent loops, tier escalation, prompt bloat). Matters because cost overruns scale linearly with traffic and can drain a monthly budget in minutes.
Security Failure — Adversarial or accidental violations of the system's security boundary (prompt injection, PII leakage, jailbreak, exfiltration). Matters because this is the highest-priority bucket in the classifier — it preempts every other signal.
Data Failure — Corruption or drift in the data layer that feeds GenAI pipelines (embedding drift, retrieval degradation, ingestion gaps). Matters because data failures masquerade as quality failures and must route to the reindex runbook, not the eval runbook.

Concepts

The five-category taxonomy

Production GenAI systems fail across five distinct axes. Each axis branches into specific failure modes that demand different detection signals and different mitigation runbooks. Provider failures trigger failover; quality failures trigger evaluation pipelines; cost failures trigger circuit breakers; security failures trigger lockdown; data failures trigger reindex workflows. Classifying every observable symptom into exactly one of these five buckets is what makes the rest of your operations stack tractable (see Code Walkthrough).
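
The category-to-mitigation mapping described above can be sketched as a plain lookup table. This is a minimal illustration, not the lesson's implementation: the five categories and their mitigation types come from the text, while the runbook identifiers and the `route` helper are hypothetical names.

```python
from enum import Enum


class FailureCategory(str, Enum):
    PROVIDER = "provider"
    QUALITY = "quality"
    COST = "cost"
    SECURITY = "security"
    DATA = "data"


# Hypothetical runbook identifiers; substitute whatever your on-call tooling uses.
RUNBOOKS = {
    FailureCategory.PROVIDER: "failover",
    FailureCategory.QUALITY: "evaluation-pipeline",
    FailureCategory.COST: "circuit-breaker",
    FailureCategory.SECURITY: "lockdown",
    FailureCategory.DATA: "reindex",
}

# Guard against a category being added without a runbook.
assert set(RUNBOOKS) == set(FailureCategory)


def route(category: FailureCategory) -> str:
    """Return the single runbook that owns this category."""
    return RUNBOOKS[category]


print(route(FailureCategory.DATA))  # reindex
```

Keeping the table exhaustive (one entry per enum member) is what lets each classified event page exactly one runbook.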

Silent vs explicit failures

A REST API either returns a 200 or it does not. A GenAI request can return 200 and still be wrong. Provider failures are the only category that produces explicit error codes; the other four are silent by default — the model "succeeds" while violating cost, security, quality, or data invariants. This asymmetry is the reason you cannot reuse classical observability primitives unchanged: a success_rate metric defined over HTTP status codes will hide most of your real failures.
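
To make the asymmetry concrete, here is a small sketch that compares a status-code success rate with one that also accounts for silent failures. The `RequestRecord` type and its `silent_failure` flag are assumptions for illustration; in practice that flag would be set by whatever evaluation, budget, security, or drift checks you run.

```python
from dataclasses import dataclass


@dataclass
class RequestRecord:
    http_status: int
    # Hypothetical flag set by downstream checks; False means none of them fired.
    silent_failure: bool = False


def http_success_rate(records: list[RequestRecord]) -> float:
    """Classical metric: fraction of requests with a 2xx status."""
    ok = sum(1 for r in records if 200 <= r.http_status < 300)
    return ok / len(records)


def effective_success_rate(records: list[RequestRecord]) -> float:
    """Fraction of requests that returned 2xx and had no silent failure flagged."""
    ok = sum(
        1 for r in records
        if 200 <= r.http_status < 300 and not r.silent_failure
    )
    return ok / len(records)


records = [
    RequestRecord(200),
    RequestRecord(200, silent_failure=True),  # e.g. fabricated citation
    RequestRecord(200, silent_failure=True),  # e.g. PII in the response
    RequestRecord(503),                       # explicit provider failure
]
print(http_success_rate(records))       # 0.75
print(effective_success_rate(records))  # 0.25
```

With this sample, the status-code metric reports three successes out of four, while only one request was actually healthy.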

Priority ordering across overlapping categories

Categories overlap in the real world. A provider outage that triggers automatic failover to a more expensive model is simultaneously a provider failure and a cost failure. An embedding drift event that degrades retrieval is simultaneously a data failure and a quality failure. To page exactly one runbook per incident, the classifier imposes a total ordering: security → provider → cost → quality → data. Security wins because unblocked prompt injection compromises the entire system; provider wins next because a 5xx makes quality and cost assessments meaningless for that request; data is last because it surfaces through a background drift scanner rather than per-request signals (see Code Walkthrough).
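
The total ordering above can be sketched independently of the classifier as a rank lookup. The `primary_category` helper is a hypothetical name; only the ordering itself comes from the text.

```python
from enum import Enum
from typing import Optional


class FailureCategory(str, Enum):
    PROVIDER = "provider"
    QUALITY = "quality"
    COST = "cost"
    SECURITY = "security"
    DATA = "data"


# Total ordering from the text: earlier entries preempt later ones.
PRIORITY = [
    FailureCategory.SECURITY,
    FailureCategory.PROVIDER,
    FailureCategory.COST,
    FailureCategory.QUALITY,
    FailureCategory.DATA,
]
RANK = {category: position for position, category in enumerate(PRIORITY)}


def primary_category(candidates: set[FailureCategory]) -> Optional[FailureCategory]:
    """Pick the single category that owns an incident with overlapping signals."""
    if not candidates:
        return None
    return min(candidates, key=RANK.__getitem__)


# A provider outage that fails over to a pricier model pages the provider runbook.
overlap = {FailureCategory.COST, FailureCategory.PROVIDER}
print(primary_category(overlap).value)  # provider
```

Expressing the order as data rather than as a chain of `if` statements makes it easy to assert that every category has a rank.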

Code Walkthrough

The snippet below demonstrates two concepts together: encoding the taxonomy as typed Pydantic models (one base class plus five subclasses, one per category) and applying the priority-ordered classifier that maps a raw gateway signal into exactly one typed event. Using Pydantic rather than plain dicts means an invalid severity, a negative confidence score, or an HTTP status outside 400-599 raises ValidationError at construction — so malformed events never reach the alerting pipeline.

```python
from datetime import datetime, timezone
from enum import Enum
from typing import Optional

from pydantic import BaseModel, Field, field_validator


class FailureCategory(str, Enum):
    PROVIDER = "provider"
    QUALITY = "quality"
    COST = "cost"
    SECURITY = "security"
    DATA = "data"


class Severity(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


class FailureEvent(BaseModel):
    """Fields shared by every failure category."""
    # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp.
    timestamp: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
    provider: str = Field(..., description="e.g. openai, anthropic, google")
    category: FailureCategory
    severity: Severity
    description: str = Field(..., min_length=10)
    request_id: Optional[str] = None
    model_id: Optional[str] = None

    @field_validator("provider")
    @classmethod
    def normalize_provider(cls, v: str) -> str:
        return v.strip().lower()


class ProviderFailure(FailureEvent):
    category: FailureCategory = FailureCategory.PROVIDER
    http_status_code: Optional[int] = Field(None, ge=400, le=599)
    retry_after_seconds: Optional[float] = Field(None, ge=0)
    is_regional_outage: bool = False


class QualityFailure(FailureEvent):
    category: FailureCategory = FailureCategory.QUALITY
    confidence_score: float = Field(..., ge=0.0, le=1.0)
    failure_subtype: str  # hallucination, format_violation, coherence_drop


class CostFailure(FailureEvent):
    category: FailureCategory = FailureCategory.COST
    token_count: int = Field(..., gt=0)
    estimated_cost_usd: float = Field(..., ge=0.0)
    budget_limit_usd: float = Field(..., gt=0.0)
    overage_pct: float = Field(..., ge=0.0)


class SecurityFailure(FailureEvent):
    category: FailureCategory = FailureCategory.SECURITY
    threat_type: str  # prompt_injection, pii_leakage, jailbreak, exfiltration
    blocked: bool = True
    matched_pattern: Optional[str] = None


class DataFailure(FailureEvent):
    category: FailureCategory = FailureCategory.DATA
    drift_score: Optional[float] = Field(None, ge=0.0, le=1.0)
    affected_collection: str
    documents_affected: int = Field(0, ge=0)


def classify_failure(
    raw_signal: dict,
    provider: str,
    model_id: str,
    request_id: str,
) -> Optional[FailureEvent]:
    """Map a raw gateway signal to exactly one typed failure event."""
    # Priority 1: security (always wins)
    if raw_signal.get("injection_detected") is True:
        # .get() avoids a KeyError when the detector reports no rule name.
        rule = raw_signal.get("matched_rule")
        return SecurityFailure(
            provider=provider, model_id=model_id, request_id=request_id,
            severity=Severity.CRITICAL,
            description=f"Prompt injection: {rule or 'unknown rule'}",
            threat_type="prompt_injection",
            blocked=raw_signal.get("request_blocked", True),
            matched_pattern=rule,
        )

    # Priority 2: provider (HTTP error); the range matches the model's 400-599 bound
    status = raw_signal.get("http_status")
    if status is not None and 400 <= status <= 599:
        return ProviderFailure(
            provider=provider, model_id=model_id, request_id=request_id,
            severity=Severity.CRITICAL if status >= 500 else Severity.HIGH,
            description=f"Provider returned HTTP {status}",
            http_status_code=status,
            retry_after_seconds=raw_signal.get("retry_after"),
            is_regional_outage=raw_signal.get("regional_outage", False),
        )

    # Priority 3: cost (budget breach); "or 0.0" tolerates explicit None values
    budget = raw_signal.get("budget_limit_usd") or 0.0
    cost = raw_signal.get("estimated_cost_usd") or 0.0
    if budget > 0 and cost > budget:
        overage = ((cost - budget) / budget) * 100
        return CostFailure(
            provider=provider, model_id=model_id, request_id=request_id,
            severity=Severity.HIGH if overage > 50 else Severity.MEDIUM,
            description=f"Cost ${cost:.2f} exceeds budget ${budget:.2f}",
            token_count=max(raw_signal.get("total_tokens") or 1, 1),
            estimated_cost_usd=cost,
            budget_limit_usd=budget,
            overage_pct=overage,
        )

    # Priority 4: quality (hallucination)
    h_score = raw_signal.get("hallucination_score")
    if h_score is not None and h_score > 0.7:
        return QualityFailure(
            provider=provider, model_id=model_id, request_id=request_id,
            severity=Severity.HIGH if h_score > 0.9 else Severity.MEDIUM,
            description=f"Hallucination score={h_score:.2f}",
            confidence_score=h_score,  # detector's confidence that the output is hallucinated
            failure_subtype="hallucination",
        )

    # Priority 5: data failures arrive asynchronously from a drift scanner
    return None
```

You'll know it works when constructing a QualityFailure with confidence_score=1.5 raises ValidationError, and when classify_failure({"injection_detected": True, "matched_rule": "ignore-prev", "http_status": 500}, ...) returns a SecurityFailure (not a ProviderFailure) — security always preempts.
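
The walkthrough leaves the data branch to a background drift scanner. One way such a scanner might produce a `drift_score` is sketched below, under loud assumptions: drift is measured as the cosine distance between the centroids of a baseline and a current embedding sample, clamped to the 0-1 range that `DataFailure.drift_score` accepts, and the 0.2 threshold, the function names, and the choice to count every current document as affected are all hypothetical. The result is a plain dict whose keys mirror the `DataFailure` fields, so the sketch runs without Pydantic.

```python
import math
from typing import Optional, Sequence

Vector = Sequence[float]


def centroid(vectors: Sequence[Vector]) -> list[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity; returns 1.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 1.0
    return 1.0 - dot / norm


def scan_for_drift(
    baseline: Sequence[Vector],
    current: Sequence[Vector],
    collection: str,
    threshold: float = 0.2,  # assumed cutoff, tune per collection
) -> Optional[dict]:
    """Return DataFailure-shaped fields if centroid drift exceeds the threshold."""
    raw = cosine_distance(centroid(baseline), centroid(current))
    score = min(max(raw, 0.0), 1.0)  # clamp into the model's 0-1 bound
    if score <= threshold:
        return None
    return {
        "category": "data",
        "affected_collection": collection,
        "drift_score": round(score, 4),
        "documents_affected": len(current),  # assumption: whole sample affected
    }


baseline = [[1.0, 0.0], [1.0, 0.0]]
drifted = [[0.0, 1.0], [0.0, 1.0]]
print(scan_for_drift(baseline, baseline, "docs"))  # None
print(scan_for_drift(baseline, drifted, "docs")["drift_score"])  # 1.0
```

Centroid distance is a coarse signal that can miss drift which leaves the mean unchanged, so treat it as a starting point rather than a complete detector.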

Do's and Don'ts

Do's

✓ Do classify every gateway response, even successful 200s, through classify_failure, so silent security, cost, and quality failures are not lost in the noise of "successful" requests. Data failures arrive separately from the drift scanner.
✓ Do encode severity and bounds at construction time with Pydantic constraints; invalid events should fail fast at the boundary and never reach the alerting pipeline.
✓ Do preserve the security → provider → cost → quality → data priority order when extending the classifier; runbook routing depends on a single primary category per event.

Don'ts

✗ Don't treat HTTP 200 as success: four of the five failure categories return 200 by default; only ProviderFailure is visible at the HTTP layer.
✗ Don't merge data and quality failures into a single "model is bad" bucket: they route to different runbooks (reindex vs. eval pipeline), and conflating them delays the real fix.
✗ Don't add a sixth top-level category to relieve classification pressure; refine the existing five with subtypes (e.g., QualityFailure.failure_subtype) instead.
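
As a sketch of refining a category with subtypes, the free-form `failure_subtype` string could be constrained to a closed set so that a typo fails at construction. The subtype values below come from the comment in the walkthrough; `QualitySubtype` and `QualitySignal` are hypothetical names, and a standard-library dataclass stands in for the Pydantic model to keep the example dependency-free.

```python
from dataclasses import dataclass
from enum import Enum


class QualitySubtype(str, Enum):
    HALLUCINATION = "hallucination"
    FORMAT_VIOLATION = "format_violation"
    COHERENCE_DROP = "coherence_drop"


@dataclass(frozen=True)
class QualitySignal:
    subtype: QualitySubtype
    score: float

    def __post_init__(self) -> None:
        # Coerce plain strings to the enum; unknown values raise ValueError.
        object.__setattr__(self, "subtype", QualitySubtype(self.subtype))
        if not 0.0 <= self.score <= 1.0:
            raise ValueError(f"score must be in [0, 1], got {self.score}")


signal = QualitySignal("hallucination", 0.8)
print(signal.subtype is QualitySubtype.HALLUCINATION)  # True

try:
    QualitySignal("model_is_bad", 0.5)
except ValueError as exc:
    print(f"rejected: {exc}")
```

New failure modes then become new enum members under an existing category, and runbook routing on the top-level category stays unchanged.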
