GenAI Agent Engineering

Build autonomous multi-agent systems with planning, reasoning, tool use, memory, MCP/A2A protocols, safety boundaries, and production evaluation.

Preview 10 goals free

13 skill groups · 5 courses · 790 goals · ~408 hrs

Verifiable skill graph

13 skill groups · each becomes a signed node on your graph.

Every lab you pass signs a W3C Verifiable Credential on your public skill graph. Completing the labs in each group below mints one node on that graph — the badge you walk away with is a cryptographic record of what you can ship, not a completion certificate.

Share the URL on your résumé or with a hiring manager. They click; they see the discipline, the labs you passed, and the verification signature. No honor system, no broker.

Tool Use & MCP

The irreducible core of an agent: function calling, Pydantic-typed tool definitions, multi-tool and parallel calls, sandboxed/safe tool execution, and MCP servers + clients.
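As a rough illustration of typed tool definitions and safe dispatch, here is a minimal sketch using only the standard library. It stands in for the Pydantic-based approach named above rather than reproducing it; the names `tool`, `schema_for`, `dispatch`, and `add` are hypothetical, and the call format (`name` plus `arguments`) is an assumed shape, not any provider's exact wire format.

```python
import inspect
import json
from typing import Any, Callable, Dict, get_type_hints

# Hypothetical in-process registry mapping tool names to Python callables.
TOOLS: Dict[str, Callable[..., Any]] = {}

# Only a few primitive types are mapped in this sketch.
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function as a callable tool."""
    TOOLS[fn.__name__] = fn
    return fn


def _params(fn: Callable[..., Any]) -> Dict[str, type]:
    """Return the parameter type hints of fn, without the return annotation."""
    hints = dict(get_type_hints(fn))
    hints.pop("return", None)
    return hints


def schema_for(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Derive a JSON-Schema-like description from the type hints."""
    params = _params(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {
            "type": "object",
            "properties": {n: {"type": _JSON_TYPES[t]} for n, t in params.items()},
            "required": list(params),
        },
    }


def dispatch(call_json: str) -> Any:
    """Validate a model-emitted tool call, then execute it."""
    call = json.loads(call_json)
    fn = TOOLS[call["name"]]  # KeyError for tools that were never registered
    params = _params(fn)
    args = call.get("arguments", {})
    if set(args) != set(params):
        raise ValueError(f"expected arguments {sorted(params)}, got {sorted(args)}")
    for name, expected in params.items():
        if not isinstance(args[name], expected):
            raise TypeError(f"{name} must be {expected.__name__}")
    return fn(**args)


@tool
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b
```

Calling `dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}')` runs the tool only after the argument names and types check out; a string where an integer is expected is rejected before the function is ever called.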

Reasoning & Planning Patterns

Agent reasoning and planning patterns: ReAct, planner-executor, reflection/self-critique, chain-of-thought, task decomposition, and dynamic re-planning.

Agent Evaluation & Testing

The scientific method for non-deterministic agents: trajectory and final-outcome scoring, offline eval sets, LLM-as-judge, regression suites, benchmarks, and test harnesses.

Reliability & Failure Recovery

What separates a demo from production: step-level retries, circuit breakers, fallback chains, graceful degradation, structured-output validation + auto-repair, idempotency, and replay/recovery.
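Two of the patterns listed above, step-level retries and fallback chains, can be sketched in a few lines. This is an illustrative outline under simple assumptions, not the course's implementation: `retry_with_backoff` and `first_success` are hypothetical helper names, and the `sleep` parameter is injectable only so the delay schedule can be observed in tests.

```python
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn until it succeeds, doubling the delay after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")


def first_success(steps: Sequence[Callable[[], T]]) -> T:
    """Fallback chain: try each step in order and return the first result."""
    last_error: Optional[Exception] = None
    for step in steps:
        try:
            return step()
        except Exception as exc:
            last_error = exc
    raise RuntimeError("all fallbacks failed") from last_error
```

A typical combination wraps each provider call in `retry_with_backoff` and passes the wrapped calls to `first_success`, so transient errors are retried before the chain degrades to a cheaper or cached answer.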

Durable Execution & Control Flow

Developer-authored agent control flow: state graphs, conditional edges, checkpointers and time-travel, human-in-the-loop gates, streaming, and subgraph composition.
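To make "state graphs" and "conditional edges" concrete, here is a toy graph runner in plain Python. It is a sketch of the idea only and is not the LangGraph API; `MiniGraph`, `END`, and the `draft`/`review` nodes are hypothetical names chosen for this example.

```python
from typing import Callable, Dict

State = Dict[str, object]
END = "__end__"  # sentinel route meaning "stop"


class MiniGraph:
    """Toy state graph: nodes update state, routers pick the next node."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Callable[[State], State]] = {}
        self.routers: Dict[str, Callable[[State], str]] = {}

    def add_node(self, name: str, fn: Callable[[State], State]) -> None:
        self.nodes[name] = fn

    def add_conditional_edge(self, source: str, router: Callable[[State], str]) -> None:
        self.routers[source] = router

    def run(self, start: str, state: State, max_steps: int = 20) -> State:
        current = start
        for _ in range(max_steps):
            # Merge the node's partial update into the running state.
            state = {**state, **self.nodes[current](state)}
            current = self.routers[current](state)
            if current == END:
                return state
        raise RuntimeError("step limit reached without hitting END")


graph = MiniGraph()
graph.add_node("draft", lambda s: {"attempts": s["attempts"] + 1})
graph.add_node("review", lambda s: {"approved": s["attempts"] >= 2})
graph.add_conditional_edge("draft", lambda s: "review")
# Conditional edge: loop back to drafting until the review approves.
graph.add_conditional_edge("review", lambda s: END if s["approved"] else "draft")

final_state = graph.run("draft", {"attempts": 0, "approved": False})
```

The `max_steps` cap is the design choice worth noting: any graph with a loop-back edge needs a hard stop so a router that never returns `END` cannot run forever.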

Context Engineering & Memory

The core craft of long-running agents: context-window engineering (compaction, pruning, managing context rot), long-term/semantic/episodic memory, summarization, and retrieval/RAG as a tool.
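Pruning is the simplest of the context-window techniques named above. The sketch below keeps system messages plus the most recent turns that fit a token budget. It assumes messages are dicts with `role` and `content` keys, and `estimate_tokens` is a deliberately crude whitespace count standing in for a real tokenizer; both are assumptions made for illustration.

```python
from typing import Dict, List

Message = Dict[str, str]


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def prune_history(messages: List[Message], max_tokens: int) -> List[Message]:
    """Keep all system messages, then the newest other messages that fit."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    budget = max_tokens - sum(estimate_tokens(m["content"]) for m in system)

    kept: List[Message] = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = estimate_tokens(message["content"])
        if cost > budget:
            break  # stop at the first message that no longer fits
        kept.append(message)
        budget -= cost
    return system + kept[::-1]  # restore chronological order
```

Stopping at the first message that does not fit, rather than skipping it and continuing, keeps the retained history contiguous, which tends to matter more to the model than squeezing in one extra older turn.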

Multi-Agent Coordination & Handoffs

Coordinating multiple agents when one isn't enough: supervisor/router patterns, hierarchical organizations, agent-to-agent handoffs and delegation, and result aggregation.

Agent Safety & Guardrails

Keeping autonomous agents within bounds: input/output guardrails, prompt-injection defense, PII scrubbing, content moderation, action allow/deny boundaries, and policy enforcement.

Operations, Observability & Cost Control

Running agents in production: tracing (OpenTelemetry/Langfuse/Logfire), step-level logging, fleet dashboards and alerting, per-agent token budgets, loop-termination on budget, and versioning/rollback.
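Per-agent token budgets and loop termination on budget can be sketched as follows. This is a minimal outline under stated assumptions: `TokenBudget`, `BudgetExceeded`, and `run_agent` are hypothetical names, and each agent step is assumed to report whether it finished and how many tokens it consumed.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


class BudgetExceeded(RuntimeError):
    """Raised when cumulative token usage passes the configured limit."""


@dataclass
class TokenBudget:
    limit: int
    used: int = 0

    def charge(self, tokens: int) -> None:
        """Record usage and fail once the limit is exceeded."""
        self.used += tokens
        if self.used > self.limit:
            raise BudgetExceeded(f"used {self.used} of {self.limit} tokens")


def run_agent(
    step: Callable[[], Tuple[bool, int]],
    budget: TokenBudget,
    max_steps: int = 10,
) -> str:
    """Run step() until it reports done, the budget is spent, or steps run out."""
    for _ in range(max_steps):
        done, tokens = step()
        try:
            budget.charge(tokens)
        except BudgetExceeded:
            return "budget_exhausted"
        if done:
            return "completed"
    return "max_steps"
```

Returning a status string instead of propagating the exception lets the caller log the termination reason and decide whether to alert, downgrade the model, or hand off to a human.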

Specialized Agent Environments

Agents that act in an external environment — computer-use, web-browsing, and code agents — with screenshotting, action verification, and environment recovery.

Agent Deployment & Serving

Shipping agents as services: HTTP/WebSocket endpoints (FastAPI), containerization, deploy + autoscale, and CI/CD for agent code.

LLM API Foundations

Baseline LLM access from agent code: OpenAI/Anthropic/Gemini SDK calls, auth, structured outputs, sampling control, and unified multi-provider interfaces.

Python for Agent Engineering

Production-grade Python applied to agents: async/asyncio, type hints, Pydantic, dataclasses, decorators, context managers — the language fluency to ship agent code.

What you'll ship in production

Core responsibilities this discipline prepares you for.

1. Design autonomous GenAI agents using state machines with tool calling, memory, and planning
- Build LangGraph agents from scratch: define graph nodes, conditional edges, state schemas, and checkpointing
- Progress from simple ReAct agents → planning agents → multi-step agents with persistent memory
- Apply state machine theory to design agent graphs for complex, real-world task scenarios
2. Build multi-agent systems with supervisor/worker hierarchies, delegation, and parallel execution
- Implement supervisor agent patterns that route tasks to specialist worker agents
- Construct hierarchical team structures with dynamic agent spawning and swarm coordination
- Monitor cross-agent execution with delegation rules and parallel task orchestration
3. Implement MCP servers and clients for standardized tool integration
- Build Model Context Protocol servers that expose REST APIs as discoverable agent tools
- Implement MCP clients in LangGraph agents with dynamic tool registration and schema negotiation
- Validate tool selection accuracy across diverse query types and measure invocation reliability
4. Enable agent-to-agent communication using A2A protocol for cross-framework interoperability
- Implement A2A v0.3 protocol mechanics: Agent Cards, task lifecycle management, and gRPC transport
- Build A2A-compatible agents using Google ADK with capability advertising
- Verify cross-framework interoperability between independently built agent systems
5. Build production RAG agents with iterative retrieval, self-verification, and query decomposition
- Add vector search nodes to LangGraph agent graphs with quality-checked retrieval loops
- Implement query decomposition for complex multi-part questions with iterative refinement
- Benchmark agentic RAG against static RAG pipelines using faithfulness and relevance metrics
6. Implement guardrails and safety controls within agent workflows
- Integrate NeMo Guardrails for content filtering within running agent execution loops
- Add LlamaFirewall middleware with policy-based tool access control and output filtering
- Quantify safety-vs-helpfulness tradeoffs using adversarial test suites and scoring rubrics
7. Evaluate agent performance with trajectory analysis and cost tracking
- Build evaluation harnesses measuring trajectory quality, tool selection accuracy, and task completion
- Run agents against standardized test suites and analyze per-task token cost attribution
- Track agent quality regressions over time with Langfuse observability dashboards
8. Design context engineering: systematic composition of prompts, memory, tools, and history
- Structure system prompts, conversation memory windows, and tool result formatting strategies
- Optimize context window utilization across multi-turn conversations with token budgeting
- Measure agent behavior differences across context designs using controlled A/B evaluations

Curriculum

5 courses · each builds on previous goals

10 goals unlocked for preview — click to read. Locked goals need a subscription.

Course · Goals

Python Essentials for Agent Builders · 62 goals

Your Dev Environment · 4 goals

Navigate filesystem with terminal
Manage files from command line
Set up VS Code
Configure terminal in VS Code

Python, Git & Package Management · 6 goals

Install and verify Python
Write hello world script
Use Python REPL
Initialize Git repository
Track changes with Git
Install packages with pip

Variables & Basic Types · 5 goals

Create and name variables
Work with strings
Work with numbers
Work with booleans
Format with f-strings

Control Flow · 4 goals

Make decisions with if/elif/else
Iterate with for loops
Repeat with while loops
Control loop execution

Functions · 5 goals

Define and call functions
Use parameters
Return values
Document with docstrings
Understand scope

Modules & Imports · 4 goals

Import standard library
Create custom modules
Understand Python path
Create packages

Lists & Tuples · 5 goals

Create and access lists
Modify lists
Slice lists
Use list comprehensions
Work with tuples

Dictionaries & Sets · 5 goals

Create and access dicts
Modify dictionaries
Iterate over dicts
Work with nested dicts
Use sets

Classes & Dataclasses · 5 goals

Understand class basics
Create dataclasses
Add methods
Use default values
Basic inheritance

Files, JSON & Error Handling · 5 goals

Read and write files
Work with JSON
Use pathlib
Handle exceptions
Create custom exceptions

Basic Testing · 4 goals

Use assert statements
Create test functions
Run pytest
Test classes

Environment Variables & Configuration · 5 goals

Understand environment variables
Use .env files
Load with python-dotenv
Handle missing variables
Organize configuration

Decorators & Context Managers · 5 goals

Understand decorators
Write simple decorators
Use context managers
Write context managers
Combine patterns

LLM Foundations for Agent Builders · 80 goals

Generators & Iterators · 5 goals

Understand iteration
Create generators
Use generator expressions
Build data pipelines
Use itertools

Async Programming Basics · 5 goals

Understand async concepts
Write async functions
Run concurrent operations
Use async context managers
Handle async exceptions

Type Hints & Pydantic · 5 goals

Add basic type hints
Use typing generics
Create Pydantic models
Validate API data
Configure Pydantic

Data Pipelines & Transformations · 5 goals

Build functional pipelines
Work with tabular data
Transform data shapes
Process LLM data formats
Optimize for performance

HTTP Clients & httpx · 5 goals

Make GET requests
Make POST requests
Use async httpx
Handle errors
Use sessions

Your First LLM Call · 5 goals

Set up credentials
Install Gemini SDK
Make first API call
Parse response
Handle API errors

Transformer Layer Anatomy · 5 goals

Understand self-attention
Compare attention types
Explore layer structure
Trace through layers
Understand depth vs width

FFN Variants & Activation Functions · 5 goals

Understand FFN basics
Compare activations
Explore gated variants
Analyze modern FFNs
Connect to MoE

Alternative Architectures - SSMs & Hybrids · 5 goals

Understand SSM basics
Explore Mamba
Understand RWKV
Explore hybrid models
Compare architectures

Sampling Parameters & Output Control · 5 goals

Understand temperature
Use top-p sampling
Implement determinism
Control output length
Use structured output

Multi-Provider & Prompt Engineering · 5 goals

Build provider abstraction
Structure conversations
Use few-shot prompting
Implement chain-of-thought
Build prompt templates

Function Calling Fundamentals · 5 goals

Understand tool use concept
Define tool schemas
Make function calls
Handle tool responses
Compare provider patterns

Embeddings & Semantic Search · 5 goals

Understand embeddings
Generate embeddings
Calculate similarity
Build simple search
Compare embedding models

RAG Fundamentals · 5 goals

Understand RAG pattern
Chunk documents
Build retrieval pipeline
Compose RAG prompts
Evaluate RAG quality

Cost Awareness & Token Economics · 5 goals

Understand pricing models
Calculate request costs
Compare provider costs
Identify cost drivers
Basic cost optimization

Retry Patterns with Tenacity · 5 goals

Understand retry need
Use tenacity basics
Implement exponential backoff
Handle specific exceptions
Combine with async

Kubernetes Essentials for GenAI · 48 goals

Containerizing LLM Applications · 6 goals

Write a Python app that calls the Gemini API and returns structured responses
Write a Dockerfile and build a container image for the LLM app
Run the containerized LLM app with environment-based configuration
Use Docker Compose to run the LLM app with supporting services
Tag images with semantic versions and push to a container registry
Debug containers with exec, logs, and inspect

Your Kubernetes Cluster & First LLM Pod · 6 goals

Understand K8s architecture and connect to your vCluster
Deploy the LLM app as your first Kubernetes pod
Organize workloads with namespaces
Use labels and selectors to organize and query resources
Understand pod lifecycle and restart policies
Master kubectl debugging: exec, logs, describe, port-forward

Services & the LLM Chat Backend · 6 goals

Create a ClusterIP service to expose the LLM chat API internally
Deploy a multi-tier LLM chat application
Compare service types: ClusterIP, NodePort, LoadBalancer
Master DNS-based service discovery in Kubernetes
Understand endpoints and traffic routing
Debug service connectivity problems

Deployments, Scaling & Rolling Updates · 6 goals

Create a Deployment for the LLM chat API
Scale LLM app replicas to handle concurrent requests
Perform a rolling update with zero downtime
Roll back a broken deployment
Compare deployment strategies: RollingUpdate vs Recreate
Manage deployment lifecycle with kubectl rollout

Packaging with Helm & Kustomize · 6 goals

Create a Helm chart for the LLM chat application
Parameterize the chart with values.yaml for each environment
Manage Helm release lifecycle: install, upgrade, rollback
Use Kustomize bases and overlays for the LLM app
Use Kustomize patches and generators
Compare Helm vs Kustomize for different deployment scenarios

Networking, Ingress & TLS · 6 goals

Expose the LLM chat API via an Ingress resource
Add TLS to the Ingress for HTTPS access
Isolate services with NetworkPolicies
Configure Ingress annotations for production traffic
Understand K8s networking: pod IPs, CNI, and service routing
Debug networking and connectivity issues

Health Probes, Autoscaling & Self-Healing · 6 goals

Add liveness and readiness probes to the LLM chat API
Configure startup probes for containers with slow initialization
Scale the chat API automatically with HPA based on CPU
Create PodDisruptionBudgets for safe maintenance
Implement health check patterns for LLM-dependent services
Combine autoscaling, probes, and PDBs for a resilient LLM service

RBAC, Security & K8s Troubleshooting · 6 goals

Create RBAC roles for the LLM chat application
Enforce Pod Security Standards
Apply SecurityContext for defense in depth
Debug CrashLoopBackOff and OOMKilled failures
Use kubectl debug and ephemeral containers for live debugging
Troubleshoot LLM-specific issues: timeouts, proxy errors, stale connections

Web APIs for GenAI Engineers · 60 goals

FastAPI Fundamentals · 6 goals

Create a FastAPI application with path operations
Define Pydantic request and response models
Implement dependency injection for shared resources
Build CRUD endpoints with proper HTTP semantics
Configure OpenAPI documentation with examples
Handle errors with custom exception handlers

Async Python for APIs · 6 goals

Convert sync endpoints to async with proper await patterns
Implement background tasks for non-blocking operations
Execute concurrent API calls with asyncio.gather
Manage application lifecycle with lifespan handlers
Build async generators for streaming responses
Control concurrency with semaphores and throttling

Database Integration · 6 goals

Configure SQLAlchemy async engine with connection pooling
Define ORM models with relationships and constraints
Create and manage database migrations with Alembic
Implement repository pattern for data access
Build transactional endpoints with session lifecycle
Implement filtering, sorting, and full-text search

Authentication & Authorization · 6 goals

Implement user registration with password hashing
Build OAuth2 password flow with JWT tokens
Implement API key authentication for services
Enforce role-based access control with permissions
Build token refresh and revocation
Compose multiple auth strategies into dependencies

Real-time Streaming · 6 goals

Build SSE endpoint for streaming LLM responses
Implement WebSocket endpoint with connection lifecycle
Build WebSocket connection manager for broadcasting
Handle backpressure and slow clients
Implement heartbeat and automatic reconnection
Build real-time notification system with Redis pub/sub

Resilience Patterns · 6 goals

Implement rate limiting with Redis sliding window
Build circuit breaker for LLM provider calls
Configure retry logic with tenacity
Isolate critical paths with bulkhead semaphores
Build fallback responses for degraded mode
Combine resilience patterns into middleware stack

API Gateway & Routing · 6 goals

Build reverse proxy with path-based routing
Implement load balancing across backend instances
Transform requests and responses through the gateway
Aggregate responses from multiple backends
Implement service discovery with health checking
Build gateway authentication and request enrichment

Testing & Documentation · 6 goals

Write async endpoint tests with httpx.AsyncClient
Build database fixtures with transaction rollback
Mock external services for deterministic tests
Implement contract tests for API consumers
Measure test coverage and set quality gates
Generate rich OpenAPI documentation with examples

API Versioning & Evolution · 6 goals

Implement URL-based API versioning with routers
Build header-based version negotiation
Manage deprecation with Sunset and Warning headers
Build request and response adapters for version translation
Detect breaking changes automatically
Generate API changelogs from schema diffs

Deployment & Observability · 6 goals

Build production Docker images with multi-stage builds
Deploy to Kubernetes with health check probes
Instrument endpoints with Prometheus metrics
Implement distributed tracing with OpenTelemetry
Build structured logging with correlation IDs
Create Grafana dashboards for API monitoring

Agent Hosted Models · 492 goals

The Dev Environment · 6 goals

Isolate your agent's Python environment
Manage dependencies with Poetry
Master Git for agent projects
Secure API key management
Configure VS Code for agent development
Catch mistakes before commit with pre-commit hooks

The Async Foundation · 6 goals

Understand the asyncio event loop
Run concurrent calls with asyncio.gather
Manage async resources with `async with`
Handle exceptions, timeouts, and retries in async code
Stream LLM responses with async generators
Throttle concurrent API calls with semaphores

The Type System · 6 goals

Apply Python type hints to function signatures
Build flexible components with generic types
Define structural interfaces with Protocol
Configure mypy for static type checking
Model structured dictionaries with TypedDict
Validate types at runtime

The Data Validator · 6 goals

Model agent data structures with Pydantic
Generate JSON Schema for LLM tool definitions
Write custom Pydantic validators
Parse LLM output into Pydantic models
Configure agents with Pydantic Settings
Serialize and persist agent state

The Error Handler · 6 goals

Design a custom exception hierarchy for agents
Retry with exponential backoff
Advanced retry patterns with `tenacity`
Implement the circuit-breaker pattern
Handle LLM-API-specific errors
Graceful degradation with fallback chains

The Test Writer · 7 goals

Pytest fundamentals for agents
Mock external dependencies in agent tests
Test async agent code
LLM-as-judge testing
Test coverage for agent code
Advanced testing patterns
Practical use cases — testing tool execution and multi-step workflows

The Debugger · 6 goals

Debug Python with pdb
Debug agents in VS Code
Debug async agent code
Profile agent performance
Debug agent-specific issues
Practical use cases — debugging multi-step agents

The Logger · 6 goals

Configure Python logging for agent applications
Implement structured JSON logging
Correlate logs across an agent request with `contextvars`
Log agent tool calls and LLM interactions
Redact sensitive data from agent logs
Configure log output destinations

The HTTP Client · 6 goals

Why httpx — fundamentals and connection pooling
The httpx Client pattern and request configuration
Handle and validate HTTP responses
Concurrent HTTP requests with rate limiting
HTTP error handling, timeouts, and retry logic
Build HTTP-based agent tools

The Project Structure · 6 goals

Modules, packages, and component interfaces
Configuration management for agent projects
Dependency injection for agent services
Entry points and runnable scripts
The pyproject.toml file
Practical use cases — wiring it all together

The LLM Client · 7 goals

OpenAI client setup
Anthropic client setup
Google Gemini client setup
Build a unified LLM client interface
Error handling and provider fallback
Async LLM client patterns
Practical use cases — security, parameters, observability

Token Economics · 7 goals

Understand tokenization
Count tokens across providers
Cost forecasting and budgeting
Track LLM API usage in production
Implement budget controls
Optimize tokens
Advanced context engineering

Prompt Caching · 4 goals

Implement Anthropic cache_control
Leverage OpenAI automatic caching
Design cache-friendly prompt architectures
Build cache monitoring systems

The Function Caller · 7 goals

OpenAI function schemas
Anthropic function schemas
Gemini function schemas
Handle tool call responses
Execute tools safely with Pydantic validation
Handle parallel tool calls
Framework integration with LangGraph

The Tool Definer · 7 goals

Write clear tool descriptions for LLMs
Define parameter schemas
Use Pydantic for tool schemas
Implement tool decorators
Handle complex parameter types
Validate tool inputs at runtime
Framework tool patterns — LangGraph, CrewAI, OpenAI, Gemini, Anthropic

The Raw Agent Loop · 7 goals

The core agent while-loop
Manage context as a mutable list
Handle stop sequences
Track iteration limits
Tool execution in the loop
Build a conversation state tracker
Build with LangGraph StateGraph

The Prompt Engineer (Dynamic) · 6 goals

Master Jinja2 templating for prompts
Implement dynamic few-shot example selection
Enforce Chain-of-Thought reasoning
Structure system prompts with a builder pattern
Inject dynamic context into prompts safely
Build prompt versioning and A/B testing

The ReAct Pattern (Manual) · 6 goals

Build the Thought-Action generator
Tool execution and observation injection
Complete ReAct agent implementation
Advanced ReAct patterns — validation, retry, confidence
Optimize ReAct performance
Common ReAct pitfalls and solutions

The Planner Pattern · 7 goals

Plan generation
Step execution
Dynamic replanning
Hierarchical planning
Plan optimization
Monitoring and observability
Practical considerations — strategy selection

The Pydantic Tool · 7 goals

Pydantic fundamentals for tool definitions
Generate JSON Schema from Pydantic models
Input validation with custom validators
Build a Pydantic tool library
Advanced Pydantic patterns
Integrate Pydantic tools with agent frameworks
Common pitfalls and solutions

The Safe Executor (Sandboxing) · 5 goals

Understand code execution risks
Static code analysis
Sandboxed execution
Apply resource limits
Build a complete safe executor

The Web Navigator · 5 goals

Web navigation fundamentals
Web navigation tools — locating elements and forms
Browser automation with Playwright
Session management
Complete web navigator system

The MCP Protocol (Basics) · 4 goals

JSON-RPC 2.0 message format and handler
Transport mechanisms — stdio and HTTP/SSE
Protocol lifecycle — initialization, runtime, shutdown
Capability negotiation

The MCP Server · 6 goals

Create an MCP server with lifecycle management
Define MCP tools
Implement MCP resources
Create prompt templates
Error handling in MCP servers
Composable MCP server architecture

The MCP Client · 6 goals

MCP client architecture and stdio transport
Discover available tools and translate schemas
Proxy tool invocation
Fetch and use MCP resources
Manage MCP server lifecycle
Build multi-server MCP clients

The Tool Router · 5 goals

Tool routing architecture and implementation
Namespace-based routing
Capability-based routing
Fallback chains
Routing performance optimization

Short-Term Memory · 8 goals

Sliding window memory
Token-aware memory management
Message summarization strategies
Memory persistence layers
Memory retrieval optimization
Integrate memory with agents
Memory performance considerations
Non-functional requirements (privacy + safety)

Long-Term Memory (RAG) · 6 goals

Document chunking strategies
Embedding pipelines
Vector database integration
Hybrid search implementation
Retrieval optimization
RAG response generation

Agentic RAG Patterns · 5 goals

Self-reflective RAG
Multi-hop retrieval
Query routing
Adaptive retrieval
Retrieval feedback loops

Semantic Memory · 6 goals

Knowledge extraction pipelines
Entity and relationship extraction
Knowledge graph construction
Memory consolidation
Integrate semantic memory with agents
Build semantic memory with LangGraph

Context Optimizer · 6 goals

Context economics
Dynamic context prioritization
Context compression techniques
Prompt optimization
Context utilization metrics
Complete context optimizer

The State Graph · 5 goals

StateGraph fundamentals — config and lifecycle
Design state schemas with TypedDict
Add nodes to StateGraph
State initialization patterns
Tracing, debugging, validation

The Conditional Edge · 5 goals

Understand conditional edges
Design routing functions
Fan-out and fan-in patterns
Handle unknown routes and errors
Multi-stage routing

The Checkpointer (Time Travel) · 4 goals

Resumable workflows
Inspect, replay, and time-travel
Retention, large state, and performance
Thread management — IDs and namespaces

Human-in-the-Loop · 6 goals

LangGraph interrupt patterns
Approval workflow patterns
Interactive agent conversations
Feedback integration
State management for HITL
Practical use cases — escalation and analytics

The Streaming Agent · 6 goals

Streaming modes in LangGraph
Token streaming from LLMs
Custom events with `astream_events`
Build streaming APIs
Error handling in streams
Backpressure and flow control

The Subgraph (Composition) · 7 goals

Subgraph fundamentals — compile + test in isolation
State schema mapping
Subgraph checkpointers + namespace isolation
Compose subgraphs into a parent
Catch subgraph exceptions and recover
Define subgraph interfaces and build a registry
Build a multi-agent orchestrator

The Supervisor Pattern · 7 goals

Design supervisor architectures
Worker agent specialization
Build the complete supervisor graph
Manage inter-agent communication
Handle failures and edge cases
Implement task aggregation
Build the supervisor pattern with CrewAI

The Hierarchical Pattern · 4 goals

Design hierarchical agent architectures
Implement team-lead agents
Build cross-team coordination
Build the complete hierarchical graph

The Reflector Pattern (Critique) · 6 goals

Design reflection architectures
Implement critic agents
Build the evaluation and convergence system
Build the complete reflection graph
Handle reflection edge cases
Practical use cases for reflection

Input Guardrails · 6 goals

Design layered guardrail architectures
Format and schema validation
Build content filtering systems
Create injection / jailbreak detection
Implement policy-based guardrails
Assemble the complete guardrail system

Output Guardrails · 6 goals

Design output validation architectures
Implement factual validation (hallucination detection)
Build content safety filters
Create PII redaction
Implement policy compliance
Assemble the complete output guardrail system

Prompt Injection Defense · 7 goals

Identify injection vulnerabilities
Detect direct injections
Detect indirect injections
Implement defense layers
Build red-team suites
Implement canary tokens
LangGraph injection defense pipeline

Evaluations (Evals) · 6 goals

Design evaluation frameworks
Implement automated evaluation pipelines
Create task-specific metrics
Human evaluation protocols
Regression testing
Set baselines and track progress

Agent Benchmarking · 6 goals

Understand the GAIA benchmark
Implement ToolBench evaluation
Use AgentBench
Design domain-specific benchmarks
Cross-model performance comparison
Build benchmark dashboards

Tracing & Observability · 6 goals

Understand distributed tracing
Add tags and metadata
Context propagation
Build feedback collection
Integrate with Langfuse
Trace visualization

Tool Use Debugging · 6 goals

Tool selection failures and solutions
Argument validation systems
Build tool use dashboards and visualization
Schema mismatch detection
Tool call replay
Interactive tool debugger

Serving Agents (FastAPI) · 7 goals

Async endpoints, request validation, error handling
Server-Sent Events (SSE) streaming
Background tasks
Design request and response schemas
Authentication — API keys, middleware, errors
OpenAPI metadata and documentation
FastAPI + LangGraph + uvicorn deployment

Podman & Containerization for K8s · 5 goals

Build optimized container images
Container health checks
Advanced image optimization
Security best practices
Build multi-container agent pods

Production Databases (Postgres/Redis) · 6 goals

Async PostgreSQL configuration
Connection pool management
Redis caching for LLM responses
Database migrations for agent stacks
Backup and disaster recovery
Monitoring database health

Scaling & Load Balancing · 7 goals

Stateless service design
Session externalization
Load balancing algorithms
Scaling metrics for LLM workloads
Horizontal Pod Autoscaler configuration
Load testing your scaling design
Rate limiting at the load balancer

Multi-Tenant Agents · 6 goals

Tenant context middleware
Database-level tenant isolation
Tenant-specific rate limiting and quotas
Per-tenant configuration templates
Usage metering for billing and SLA
Enforcing tenant data segregation at the API

Kubernetes (K8s) Basics · 8 goals

Creating Kubernetes Deployments
Resource management for LLM workloads
Kubernetes Services
ConfigMaps and Secrets
Rolling updates and CronJobs
Cluster planning and scheduling
Deployment planning synthesis
NetworkPolicies and prompt-injection defense at the edge

CI/CD for Agents · 7 goals

GitHub Actions for agent testing
Agent evaluation scripts in CI
Kubernetes deployment pipeline
GitOps deployment pattern
Quality gates and pipeline optimization
Rollback mechanisms
Pipeline observability and notifications

Monitoring & Alerting · 7 goals

Prometheus metrics for agents
Grafana dashboards
Alerting configuration
SLOs and SLIs
Runbook creation for agent incidents
Tracking business KPIs for agent platforms
Agent-specific monitoring patterns (RED, USE, golden signals)

Model Routing & Fallbacks · 7 goals

Cost-optimized routing
Latency-optimized routing
Building the resilient LLM client
Provider health checking
Cost tracking and optimization
Capability-based routing
Multi-model routing inside LangGraph

Long-Running Agents · 7 goals

Cross-session persistence
Checkpoint serialization
Workflow resumption
Task queue integration with Celery
Progress tracking and SSE streaming
Timeout handling and graceful shutdown
Long-running agents with CrewAI — synthesis

Production Architecture Patterns · 7 goals

System components, interfaces, and integration points
Cost modeling and projection
Production checklists and audit
Architecture Decision Records (ADRs)
Disaster recovery planning
System component diagrams
Architecture pattern evaluation — synthesis

Alternative Frameworks (CrewAI/AutoGen) · 6 goals

CrewAI core concepts and agent personas
AutoGen conversational architecture
Framework comparison and integration patterns
Framework migration strategies and validation
Hybrid multi-framework systems
Picking a framework for a real use case

The A2A Protocol · 8 goals

Agent cards and message formats
Agent registry and capability-based discovery
Secure message channels, authentication, and trust
Task delegation protocol
Agent network topology and routing
Asynchronous response callbacks
A2A protocol v0.3 features
Bridging LangGraph agents over A2A

Deep Memory (GraphRAG) · 7 goals

Knowledge graph fundamentals
Graph traversal patterns
Entity extraction with LLMs
Hybrid retrieval strategies (vector + graph)
Entity resolution and de-duplication
Incremental graph updates and provenance
Graph export, embeddings, and summarization

Advanced Simulation · 7 goals

The simulation environment interface
Multi-agent simulation patterns
The village simulation
Observing emergent behavior
Designing simulation experiments
Simulation visualization
Metrics, anomaly detection, and simulation scaling

The Privacy Specialist · 6 goals

Local model deployment fundamentals
Calling Ollama from Python
Building private RAG systems
Data residency patterns
Inference optimization for local Ollama
Hybrid routing — local vs cloud per request

The Vision Agent · 7 goals

Sending images to vision models
Parsing screenshots for GUI automation
Processing video frame-by-frame
Handling multimodal context
Vision tools — OCR, describe, compare
Image processing pipelines and optimization
Multi-step vision workflow orchestration

Computer Use Agents · 8 goals

Anthropic Computer Use API
Screen coordinate systems
Action verification loops
Designing safe automation workflows
Action planning and recovery
Workflow execution and testing
Desktop tool registry
End-to-end CU agent pipeline

Voice & Audio Agents · 7 goals

Using the OpenAI Realtime API
Building speech-to-speech agents
Handling audio streaming and buffers
Managing interruptions
Voice-callable tools
Voice pipeline latency profiling
End-to-end voice processing pipeline

Code Agents · 7 goals

Parsing code with AST and building a code index
Code modification and safe execution
Git integration and code review
Test generation
CI integration — interpreting test results
Language Server Protocol (LSP) integration
Code agent orchestration — synthesis

Autonomous Agent Workflows · 5 goals

Hierarchical goal decomposition and dynamic planning
Self-correction mechanisms
Reflection, learning, and Q-learning policies
Meta-agent orchestration
Feedback loops and runaway prevention

Streaming Data for Agents · 4 goals

Stream ingestion fundamentals
LLM integration with streams and semantic caching
Backpressure, flow control, and circuit breaker
Reactive event processing and anomaly detection

Agent Swarms & Collaboration · 4 goals

Swarm architecture patterns
Agent communication and pub/sub
Consensus and weighted voting
Emergent behavior and stigmergy

Agent Evaluation Pipelines · 4 goals

Evaluation dataset creation and curation
Automated evaluators and advanced patterns
Continuous evaluation pipelines and production monitoring
Statistical analysis and reporting

Pre/Post Processing Pipelines · 4 goals

Preprocessing pipeline
Postprocessing pipeline and output formatting
Async optimization, monitoring, and testing
ETL patterns for agent data

MCP Advanced Ecosystem · 5 goals

Codebase navigation, grep, and contextual file reading
The complete code agent loop
Production safety measures
MCP tool registries
MCP server caching for performance

Agent Trajectory Evaluation · 6 goals

Trajectory evaluation fundamentals and scoring rubrics
LLM-as-judge pipelines
Golden trajectory datasets
Version comparison and A/B evaluation
Evaluation-gated CI/CD
Trajectory evaluation capstone

Agent Safety Boundaries · 6 goals

Tool permission systems
Resource budget limiters
Kill switch mechanisms
Sandbox isolation
Safety monitoring and escalation
Safety boundaries integration — capstone

Agent Cost Controller · 6 goals

Token accounting systems
Multi-agent cost attribution
Cost-aware model routing
Agent cost dashboards
Budget alerts and auto-downgrade
Cost control integration — capstone

Enterprise Agent Patterns · 6 goals

Document processing agents
Customer service triage agents
Code review agents
Production retry and escalation
Enterprise audit logging
Enterprise capstone — three agents on shared infra

Agent Load Testing · 6 goals

Load test frameworks and traffic patterns
Tool contention analysis and mitigation
Memory profiling and leak detection
Latency breakdown and waterfall analysis
Capacity planning methodology
Load testing capstone — integrated stress framework

Agent Versioning and Rollback · 6 goals

Agent version schema design
Canary deployment with traffic splitting
Automated rollback on eval score drops
Version diff and history
Multi-team version management
Versioning capstone — unified platform service

Agent Fleet Dashboard · 6 goals

Fleet metrics collector and health score
Execution trace storage and search
Cost aggregation and drill-down
Anomaly baseline profiler and alert pipeline
Unified dashboard layout and operational controls
Fleet dashboard capstone

Autonomous Agent Governance · 6 goals

Immutable audit trails
Decision logging framework
Human escalation engine
Compliance report generator
Governance middleware
Governance capstone
