AI Safety

5 episodes — 90-second audio overviews on AI safety.

Overrefusal — when safety makes models too cautious
1:26

Excessive safety training causes refusal of clearly benign requests; calibrating the refusal boundary without compromising safety is a key alignment challenge.

AI Safety · AI Alignment · Generative AI · GenAI Explained · 2026-02-19
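
One way to make "refuses clearly benign requests" concrete is to measure a refusal rate over a set of harmless prompts. The sketch below is a toy illustration only; `ask_model`, the prompt list, and the keyword-based refusal check are assumptions for demonstration, not anything described in the episode. Real evaluations typically use a judge model rather than keyword matching.

```python
# Toy sketch: estimating an overrefusal rate on clearly benign prompts.
# `ask_model` is a hypothetical stand-in for whatever chat API is in use.
from typing import Callable

REFUSAL_MARKERS = ("i can't help", "i cannot assist", "i'm unable to")

def looks_like_refusal(reply: str) -> bool:
    """Crude keyword check for refusal phrasing (real evals use a classifier)."""
    reply = reply.lower()
    return any(marker in reply for marker in REFUSAL_MARKERS)

def overrefusal_rate(ask_model: Callable[[str], str], benign_prompts: list[str]) -> float:
    """Fraction of benign prompts the model refuses to answer."""
    refusals = sum(looks_like_refusal(ask_model(p)) for p in benign_prompts)
    return refusals / len(benign_prompts)

if __name__ == "__main__":
    benign = ["How do I bake sourdough bread?", "Explain photosynthesis to a child."]
    # Dummy model that refuses everything, just to exercise the metric.
    rate = overrefusal_rate(lambda p: "I can't help with that.", benign)
    print(f"overrefusal rate: {rate:.0%}")  # -> 100%
```
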
Hallucination mitigation — grounding, retrieval, verification
1:37

RAG, self-consistency checks, citation requirements, confidence calibration, and retrieval verification reduce but never fully eliminate hallucination.

AI Safety · AI Alignment · Generative AI · GenAI Explained · 2026-02-19
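
To show what "citation requirements" and "retrieval verification" can look like in practice, here is a minimal sketch of grounding a prompt in retrieved passages and then auditing the citations in the answer. The passage texts, the hard-coded answer string, and the function names are hypothetical; a real pipeline would call a retriever and an LLM and verify that each cited passage actually supports the claim, not just that it exists.

```python
# Toy sketch: retrieval-grounded prompting plus a simple citation audit.
import re

def build_prompt(question: str, passages: list[str]) -> str:
    """Ask the model to answer only from numbered passages and cite them."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using ONLY the passages below and cite them like [1].\n"
        "If the passages are insufficient, say so.\n\n"
        f"{context}\n\nQ: {question}\nA:"
    )

def cited_indices(answer: str) -> set[int]:
    """Extract the passage numbers the answer claims to rely on."""
    return {int(m) for m in re.findall(r"\[(\d+)\]", answer)}

def verify_citations(answer: str, passages: list[str]) -> bool:
    """Reject answers that cite nothing or cite passages that were never retrieved."""
    cited = cited_indices(answer)
    return bool(cited) and all(1 <= i <= len(passages) for i in cited)

if __name__ == "__main__":
    passages = ["The Eiffel Tower is 330 m tall.", "It was completed in 1889."]
    answer = "The tower was finished in 1889 [2] and stands 330 m tall [1]."
    print(build_prompt("How tall is the Eiffel Tower?", passages))
    print("citations verified:", verify_citations(answer, passages))  # -> True
```
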
Why hallucinations happen — probability meets knowledge gaps
1:45

Models assign probability to all possible tokens including wrong ones; gaps in training data and distributional shift make some fabrication inevitable.

AI Safety · AI Alignment · Generative AI · GenAI Explained · 2026-02-19
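
The "probability to all possible tokens" point can be made concrete with a softmax over next-token logits: every candidate, including factually wrong ones, keeps nonzero probability mass, so sampling can surface a fabrication. The logits below are made up for illustration and do not come from any real model.

```python
# Toy sketch: softmax over hypothetical next-token logits for the prompt
# "The capital of Australia is ___". Wrong answers retain real probability,
# so temperature sampling will occasionally emit them.
import math
import random

def softmax(logits: dict[str, float]) -> dict[str, float]:
    m = max(logits.values())
    exps = {tok: math.exp(x - m) for tok, x in logits.items()}
    total = sum(exps.values())
    return {tok: v / total for tok, v in exps.items()}

logits = {"Canberra": 3.0, "Sydney": 2.2, "Melbourne": 1.1}  # invented values
probs = softmax(logits)
print({tok: round(p, 3) for tok, p in probs.items()})

random.seed(0)
sample = random.choices(list(probs), weights=list(probs.values()), k=1)[0]
print("sampled:", sample)  # sometimes a wrong answer
```
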
Types of hallucination — intrinsic vs extrinsic
1:53

Intrinsic hallucinations contradict the provided input; extrinsic hallucinations add unsupported claims from parametric memory — both undermine user trust.

AI Safety · AI Alignment · Generative AI · GenAI Explained · 2026-02-19
Hallucination — when GenAI confidently fabricates information
1:16

Models generate plausible but factually wrong content because they optimize for fluency and pattern completion, not truth or accuracy.

AI Safety · AI Alignment · Generative AI · GenAI Explained · 2026-02-19