Prerequisites
-
Working knowledge of Kubernetes Deployments, Services, and Helm chart installation for deploying new services to a cluster
-
Familiarity with Microsoft Presidio analyzer and anonymizer APIs for PII detection and scrubbing
-
Experience with LiteLLM proxy middleware and request/response interceptors
-
Understanding of Prometheus metrics instrumentation including counters, histograms, and label cardinality management
-
Familiarity with FastAPI middleware patterns and request lifecycle hooks
-
Working knowledge of PostgreSQL for structured audit data storage and querying
-
Experience with Pydantic models for request validation and configuration management
-
Understanding of Alertmanager routing and receiver configuration for severity-based alert delivery
-
Basic knowledge of data protection regulations (GDPR, CCPA, HIPAA) and their requirements for PII handling
-
Completion of Chapter 44 (Key Rotation Operator) or equivalent experience with security automation in GenAI platforms
Learning Goals
-
Deploy Microsoft Presidio on K8s for runtime PII detection in LLM traffic
-
You will deploy Presidio analyzer and anonymizer services on your vCluster via Helm charts, configure them for low-latency operation, and build middleware that integrates with LiteLLM to scan every outbound prompt.
-
You will configure bypass rules for trusted internal prompts that intentionally contain example PII, ensuring system prompts are not falsely flagged.
-
The middleware sends text to the Presidio analyzer, receives detected PII entities with confidence scores, and applies per-entity-type handling rules.
-
Build PII scrubbing middleware that redacts sensitive data before sending to LLM providers
-
You will implement the PIIDetectionMiddleware that sits in the LiteLLM request path, intercepts all outbound prompts, and applies Presidio-based scrubbing before the request leaves your infrastructure.
-
You will build configurable scrubbing policies that allow different behaviors per service, use case, and entity type, providing the flexibility that production deployments require.
-
The middleware handles per-entity-type actions (replace, hash, block), tracks detection metrics by entity type, and adds minimal latency to the request path.
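The per-entity-type actions (replace, hash, block) can be sketched in plain Python. This is a minimal illustration, not the chapter's PIIDetectionMiddleware: the `Detection` class, `POLICY` table, `scrub` function, and `BlockedRequest` exception are hypothetical names, and the detection fields only mirror the `entity_type`, `start`, `end`, and `score` values that Presidio analyzer results carry. It assumes detections do not overlap.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical stand-in for one analyzer result."""
    entity_type: str
    start: int
    end: int
    score: float


class BlockedRequest(Exception):
    """Raised when policy says the request must not leave the cluster."""


# Assumed example policy: entity type -> action.
POLICY = {"EMAIL_ADDRESS": "replace", "PHONE_NUMBER": "hash", "US_SSN": "block"}


def scrub(text, detections, policy=POLICY, default_action="replace",
          hash_key=b"demo-key"):
    """Apply the per-entity action to each detected span and return the scrubbed text."""
    out = text
    # Work right to left so earlier offsets stay valid after each substitution.
    for det in sorted(detections, key=lambda d: d.start, reverse=True):
        action = policy.get(det.entity_type, default_action)
        if action == "block":
            raise BlockedRequest(det.entity_type)
        if action == "hash":
            # Keyed hash: equal values map to equal tokens without exposing the value.
            value = text[det.start:det.end].encode()
            digest = hmac.new(hash_key, value, hashlib.sha256).hexdigest()[:12]
            token = f"<{det.entity_type}:{digest}>"
        else:
            token = f"<{det.entity_type}>"
        out = out[:det.start] + token + out[det.end:]
    return out
```

Processing spans from the end of the string backwards avoids recomputing offsets after each substitution. In a real deployment the hash key would come from a secret store rather than a default argument.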
-
Implement PII detection alerting and audit logging for compliance
-
Every PII detection event must be logged without exposing the detected PII values.
-
You will build audit logging to PostgreSQL that records request identifiers, entity types, confidence scores, and actions taken while never storing the actual sensitive data.
-
Alerting rules fire on PII rate spikes, high-confidence detections of critical entity types (SSNs, credit cards), and PII detected in services that should never handle sensitive data.
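One way to keep detected values out of the audit trail is to build each row only from detection metadata. The sketch below assumes detections arrive as dictionaries with `entity_type` and `score` keys, as in the analyzer's JSON output. The `AuditRecord` class, the `pii_audit_log` table, and its columns are hypothetical names chosen for illustration.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    """One audit row: metadata only, with no field that can hold the matched text."""
    request_id: str
    service: str
    entity_type: str
    score: float
    action: str
    detected_at: str


# Parameterized insert for a hypothetical table (psycopg-style named placeholders).
INSERT_SQL = (
    "INSERT INTO pii_audit_log "
    "(request_id, service, entity_type, score, action, detected_at) "
    "VALUES (%(request_id)s, %(service)s, %(entity_type)s, "
    "%(score)s, %(action)s, %(detected_at)s)"
)


def build_audit_rows(request_id, service, detections, actions):
    """Turn analyzer detections into insertable rows without reading the prompt text."""
    now = datetime.now(timezone.utc).isoformat()
    return [
        asdict(AuditRecord(
            request_id=request_id,
            service=service,
            entity_type=det["entity_type"],
            score=round(det["score"], 3),
            action=actions[det["entity_type"]],
            detected_at=now,
        ))
        for det in detections
    ]
```

Because `build_audit_rows` never receives the prompt itself, it cannot leak a detected value by accident. With a psycopg cursor, the rows could be written with `cursor.executemany(INSERT_SQL, rows)`.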
-
Create PII detection tuning workflows to reduce false positives
-
You will build a feedback loop where operators flag false positive detections, the system analyzes false positive patterns, and threshold adjustments are A/B tested on a percentage of traffic before full deployment.
-
The tuning workflow tracks false positive rates by entity type, maintains custom deny lists for known non-PII patterns that trigger false positives, and measures the effectiveness of threshold changes against both detection rate and false positive rate.
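Two pieces of the tuning workflow are easy to sketch: routing a fixed percentage of traffic to candidate thresholds, and computing false positive rates by entity type from operator feedback. The function names and the shape of the feedback events below are assumptions for illustration, not the chapter's implementation.

```python
import hashlib
from collections import Counter


def in_experiment(request_id: str, percent: int) -> bool:
    """Deterministically place roughly `percent`% of request IDs in the candidate arm."""
    bucket = int(hashlib.sha256(request_id.encode()).hexdigest(), 16) % 100
    return bucket < percent


def threshold_for(request_id, entity_type, baseline, candidate, percent):
    """Pick the confidence threshold for this request and entity type."""
    if in_experiment(request_id, percent):
        # Fall back to the baseline when the candidate table has no override.
        return candidate.get(entity_type, baseline[entity_type])
    return baseline[entity_type]


def false_positive_rates(events):
    """events: iterable of (entity_type, flagged_as_false_positive) pairs."""
    total, flagged = Counter(), Counter()
    for entity_type, is_false_positive in events:
        total[entity_type] += 1
        flagged[entity_type] += int(is_false_positive)
    return {etype: flagged[etype] / total[etype] for etype in total}
```

Hashing the request ID rather than drawing a random number keeps the assignment stable across retries, so the same request is always scored against the same threshold table. Comparing `false_positive_rates` for each arm alongside the detection rate gives the two measurements the workflow calls for.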