GenAI Disciplines Docs Bytes About

Account

AI Basics

36 episodes — 90-second audio overviews on ai basics.

1:58

Why diffusion won — comparing generative architectures

Diffusion models offer stable training, mode coverage, better diversity, and higher fidelity than GANs, which is why they replaced GANs as the dominant approach for image and video generation.

Image GenerationAI BasicsGenerative AIGenAI Explained2026-02-19

1:40

Normalizing Flows — invertible generation with exact likelihoods

Chains of invertible mathematical transformations that map simple distributions to complex ones, offering exact probability computation unlike GANs or VAEs.

Image GenerationAI BasicsGenerative AIGenAI Explained2026-02-19

1:55

GAN challenges — mode collapse and training instability

GANs are notoriously difficult to train: the generator may produce limited variety (mode collapse), and the adversarial balance is fragile and sensitive to hyperparameters.

Image GenerationAI BasicsGenerative AIGenAI Explained2026-02-19

1:25

GAN applications — StyleGAN, deepfakes, super-resolution

GANs powered photorealistic face generation (StyleGAN), image enhancement (ESRGAN), and synthetic media — the dominant GenAI paradigm before diffusion.

Image GenerationAI BasicsGenerative AIGenAI Explained2026-02-19

1:19

GANs — generator vs discriminator competition

Two networks in adversarial training: a generator creates fakes, a discriminator detects them — the competition drives both to improve, producing increasingly realistic outputs.

Image GenerationAI BasicsGenerative AIGenAI Explained2026-02-19

1:39

Variational Autoencoders (VAEs) — generating from learned distributions

Unlike basic autoencoders, VAEs encode inputs as probability distributions, enabling smooth interpolation between examples and sampling of entirely new outputs.

Image GenerationAI BasicsGenerative AIGenAI Explained2026-02-19

1:40

Latent space — the compressed world where generation happens

The bottleneck layer in an autoencoder where high-dimensional data (images, text) is compressed into a dense, navigable, lower-dimensional representation.

Image GenerationAI BasicsGenerative AIGenAI Explained2026-02-19

1:38

Autoencoders — compressing and reconstructing data

Neural networks that learn to encode input into a compact bottleneck representation and decode it back — the architectural foundation of latent space.

Image GenerationAI BasicsGenerative AIGenAI Explained2026-02-19

1:28

SwiGLU & modern activations — inside frontier transformers

SwiGLU replaces older ReLU in modern transformers (LLaMA, Mistral), providing smoother gradients and measurably better training dynamics.

TransformersAI BasicsGenerative AIGenAI Explained2026-02-18

1:18

The attention bottleneck — O(n²) cost of full attention

Attention scales quadratically with sequence length; a 100K-token input requires 10 billion attention pair computations per layer.

TransformersAI BasicsGenerative AIGenAI Explained2026-02-18

1:10

Causal masking — why decoders can't peek ahead

Future tokens are masked during training so each position only attends to past tokens, enabling left-to-right autoregressive generation.

TransformersAI BasicsGenerative AIGenAI Explained2026-02-18

1:35

Encoder vs decoder vs encoder-decoder

BERT uses an encoder (understanding), GPT uses a decoder (generation), T5 uses both — different configurations optimized for different GenAI tasks.

TransformersAI BasicsGenerative AIGenAI Explained2026-02-18

1:30

Residual connections & layer norm — stability for deep models

Skip connections add each sub-layer's input to its output, and normalization prevents values from exploding, enabling stable 100+ layer training.

TransformersAI BasicsGenerative AIGenAI Explained2026-02-18

1:49

Feed-forward networks — per-token transformation after attention

After attention mixes information across tokens, independent feed-forward layers transform each token's representation with nonlinear activation functions.

TransformersAI BasicsGenerative AIGenAI Explained2026-02-18

1:41

Multi-head attention — parallel perspectives on the same input

Multiple attention mechanisms run simultaneously, each learning to capture different relationship types like syntax, semantics, and coreference.

TransformersAI BasicsGenerative AIGenAI Explained2026-02-18

1:28

Query, Key, Value — the three vectors of attention

Tokens generate Q, K, V projections; attention scores come from Q·K dot-product similarity, and the output is V weighted by those scores.

TransformersAI BasicsGenerative AIGenAI Explained2026-02-18

1:24

Self-attention — every token looks at every other

Each token computes relevance scores against all other tokens, capturing long-range dependencies in a single parallel computation step.

TransformersAI BasicsGenerative AIGenAI Explained2026-02-18

1:37

The Transformer — the engine of modern GenAI

Published in 2017's "Attention Is All You Need," this architecture replaced recurrent networks and became the foundation of every frontier GenAI model.

TransformersAI BasicsGenerative AIGenAI Explained2026-02-18

1:52

Token economics — why every token has a price

API providers charge per input and output token; understanding tokenization directly impacts cost estimation, prompt design, and budget optimization.

AI BasicsAI TokenizationGenerative AIGenAI Explained2026-02-18

1:26

Positional encoding — teaching word order to parallel models

Since transformers process all tokens simultaneously, position must be explicitly injected via sinusoidal functions or learned embeddings.

AI BasicsAI TokenizationGenerative AIGenAI Explained2026-02-18

1:37

Word embeddings — turning tokens into vectors

Each token maps to a learned high-dimensional vector where semantic proximity in space encodes similarity in meaning.

AI BasicsAI TokenizationGenerative AIGenAI Explained2026-02-18

1:38

Special tokens — control signals for models

\[BOS\], \[EOS\], \[PAD\], \<\|im\_start\|\>, \<tool\_call\> — reserved tokens that mark boundaries, roles, and structure for the model.

AI BasicsAI TokenizationGenerative AIGenAI Explained2026-02-18

1:34

Vocabulary size tradeoffs — why 32K, 50K, or 100K tokens

Larger vocabularies produce fewer tokens per text (cheaper inference) but require bigger embedding tables and more parameters to train.

AI BasicsAI TokenizationGenerative AIGenAI Explained2026-02-18

1:54

SentencePiece & tiktoken — tokenizer implementations

SentencePiece (Google) and tiktoken (OpenAI) are the standard libraries for fast, language-agnostic tokenization used across model families.

AI BasicsAI TokenizationGenerative AIGenAI Explained2026-02-18

1:33

Byte-Pair Encoding (BPE) — how tokenizers learn to split text

Starting from individual bytes or characters, BPE iteratively merges the most frequent adjacent pairs until reaching a target vocabulary size.

AI BasicsAI TokenizationGenerative AIGenAI Explained2026-02-18

1:21

What are tokens — the atoms of language models

Models don't see words or characters; they see tokens — subword units that balance vocabulary size with text coverage.

AI BasicsAI TokenizationGenerative AIGenAI Explained2026-02-18

1:32

GenAI timeline — from GPT-1 to today's frontier

A chronological tour: GPT-1 (2018), GPT-3 (2020), DALL-E (2021), ChatGPT (2022), GPT-4 and Claude (2023), multimodal omni models (2024-25).

AI BasicsGenerative AIGenAI ExplainedAI Podcast2026-02-18

1:24

The GenAI stack — hardware, models, orchestration, apps

From GPU clusters at the bottom to model weights to orchestration frameworks to end-user apps at the top — the full technology stack powering GenAI.

AI BasicsGenerative AIGenAI ExplainedAI Podcast2026-02-18

1:42

Closed vs open models — APIs vs downloadable weights

OpenAI and Anthropic offer API access; Meta and Mistral release weights — each path has different tradeoffs in cost, control, privacy, and customization.

AI BasicsGenerative AIGenAI ExplainedAI Podcast2026-02-18

1:36

Parameters — the learned numbers inside a model

Each parameter is a single number learned during training; modern GenAI models have billions, collectively encoding everything the model knows.

AI BasicsGenerative AIGenAI ExplainedAI Podcast2026-02-18

1:35

The training-inference split — building the brain vs using it

Training costs millions of dollars and takes weeks on thousands of GPUs; inference serves billions of requests cheaply — two fundamentally different engineering problems.

AI BasicsAI InferenceGenerative AIGenAI Explained2026-02-18

1:41

How GenAI generates — one token or step at a time

Text models predict the next token autoregressively; image models denoise step by step — both are iterative generation processes.

AI BasicsGenerative AIGenAI ExplainedAI Podcast2026-02-18

1:07

Foundation models — one model, many tasks

Massive models pre-trained on broad data that can be adapted to countless downstream tasks without retraining from scratch.

AI BasicsGenerative AIGenAI ExplainedAI Podcast2026-02-18

1:29

The GenAI modality map — text, image, audio, video, code, 3D

A survey of every output type GenAI can produce today and the distinct model families that power each modality.

AI BasicsGenerative AIGenAI ExplainedAI Podcast2026-02-18

1:28

How GenAI differs from traditional AI — generation vs classification

Traditional ML sorts, ranks, and predicts from fixed categories; GenAI synthesizes novel outputs by sampling from a learned distribution of possibilities.

AI BasicsGenerative AIGenAI ExplainedAI Podcast2026-02-18

1:18

What is generative AI — models that create new content

Unlike traditional AI that classifies or predicts, GenAI produces entirely new text, images, code, and audio from learned patterns.

AI BasicsGenerative AIGenAI ExplainedAI Podcast2026-02-18