AI Code Generation
6 episodes — 90-second audio overviews on AI code generation.

Coding benchmarks — HumanEval, SWE-bench, MBPP
Standard evaluations measuring code generation quality: from simple function completion (HumanEval) to resolving real GitHub issues (SWE-bench).
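These benchmarks are typically scored with pass@k: the probability that at least one of k generated samples passes all unit tests. A minimal sketch of the unbiased estimator introduced with HumanEval, where n samples are drawn and c of them pass:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: given n generated samples of which
    c pass all tests, the probability that at least one of k randomly
    drawn samples passes. Equals 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        # fewer than k failing samples exist, so any draw of k
        # must include at least one passing sample
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with 2 samples of which 1 passes, pass@1 is 0.5; with no passing samples, pass@k is 0 for any k ≤ n.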

Repository-level code understanding — beyond single files
Models that navigate imports, call graphs, type systems, and project structure to generate contextually correct changes spanning multiple files.
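A first step toward repo-level context is resolving which other modules a file depends on, so their definitions can be pulled into the prompt. A minimal sketch using Python's standard `ast` module (the context-assembly strategy around it is an assumption, not any particular tool's pipeline):

```python
import ast

def local_imports(source: str) -> set[str]:
    """Collect the module names a Python file imports — the starting
    point for assembling cross-file context before prompting a model
    for a change that spans multiple files."""
    modules: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return modules
```

A repo-aware system would walk this graph transitively, rank the reachable files by relevance, and pack the highest-ranked definitions into the context window.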

Code execution feedback — running code to self-correct
Agents that generate code, execute it in a sandbox, read error messages, and iteratively fix bugs until all tests pass — closing the generate-test loop.
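The generate-test loop above can be sketched as follows. This is a minimal illustration, not any specific agent framework: `generate` stands in for a model call that takes the previous error message and returns a new candidate, and the subprocess acts as a crude sandbox.

```python
import subprocess
import sys

def run_candidate(code: str, test: str, timeout: float = 5.0):
    """Execute candidate code plus its tests in a subprocess.
    Returns (passed, stderr) so the error text can be fed back."""
    proc = subprocess.run(
        [sys.executable, "-c", code + "\n" + test],
        capture_output=True, text=True, timeout=timeout,
    )
    return proc.returncode == 0, proc.stderr.strip()

def repair_loop(generate, test: str, max_rounds: int = 3):
    """Generate → execute → read the error → regenerate, until the
    tests pass or the attempt budget runs out. `generate` is any
    callable mapping the last error (or None) to new candidate code."""
    error = None
    for _ in range(max_rounds):
        code = generate(error)
        passed, error = run_candidate(code, test)
        if passed:
            return code
    return None
```

In practice the sandbox would be isolated far more strictly (containers, resource limits), but the loop structure — execute, capture the traceback, prompt again with it — is the core of the technique.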

Code generation from natural language — describing what you want
Translating English descriptions into working functions, classes, and scripts — the core use case driving AI-assisted software development.

Fill-in-the-middle (FIM) — bidirectional code completion
Training models to predict missing code given both the prefix and suffix context, powering the inline autocomplete experience in editors like Copilot and Cursor.
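Concretely, a FIM prompt rearranges the document so the gap comes last and the model fills it by ordinary left-to-right generation. A sketch of the prefix-suffix-middle (PSM) layout; the sentinel token names here follow StarCoder's convention and differ across model families:

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """PSM layout: the model sees the code before the cursor, then the
    code after it, then generates the missing middle after the final
    sentinel. Sentinel spellings vary by model family."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

# Editor autocomplete at a cursor position splits the buffer there:
prompt = build_fim_prompt(
    prefix="def area(r):\n    return ",
    suffix="\n\nprint(area(2))",
)
```

At training time the same transformation is applied in reverse: a random span of each document is cut out and moved to the end, teaching the model to condition on both sides of the gap.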

Code LLMs — models specialized for programming
Codex, CodeLlama, StarCoder, DeepSeek Coder — models trained on massive code corpora that understand syntax, APIs, libraries, and programming patterns.