Building Safe AI Agents with Guardrails
Full workshop writeup
We start with a DataTalks.Club Data Engineering Zoomcamp FAQ assistant,
then add checks that keep the agent on topic, block unsafe responses, and
show how to cancel wasted work when a guardrail fails. The workshop uses
the OpenAI Agents SDK for built-in guardrails, then rebuilds the same
idea with tools and plain asyncio so you can use it with other agent
frameworks.
Links
External resources:
- Workshop recording
- Workshop code
- Related course: AI Bootcamp: From RAG to Agents
- FAQ data used by the agent
- AI Hero email course for the docs.py loader
- OpenAI Agents SDK guardrails documentation
The notebook you will build
The final notebook has guardrails around a tool-using FAQ agent.
The base agent can already search the FAQ, but it tries to answer unrelated questions too. The input guardrail blocks questions outside the course domain. The output guardrail checks the agent response for policy problems such as promising deadline extensions or writing homework for a student. The later parts show the same checks as tools and as a small async runner that can cancel work when a guardrail trips.
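The flow described above can be sketched in plain Python, without any agent framework. This is only an illustration under stated assumptions: the workshop uses LLM-based classifiers and the OpenAI Agents SDK, while the keyword lists, `GuardrailResult`, `fake_agent`, and `guarded_answer` below are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass


@dataclass
class GuardrailResult:
    """Outcome of one guardrail check (hypothetical type for this sketch)."""
    tripwire_triggered: bool
    reason: str = ""


# Assumption: simple keyword lists stand in for the LLM classifiers
# that the workshop builds.
COURSE_KEYWORDS = ("course", "zoomcamp", "homework", "deadline", "docker")
FORBIDDEN_PHRASES = ("deadline extension granted", "here is your homework solution")


def input_guardrail(question: str) -> GuardrailResult:
    """Trip when the question does not look course-related."""
    text = question.lower()
    if any(word in text for word in COURSE_KEYWORDS):
        return GuardrailResult(False)
    return GuardrailResult(True, "question is outside the course domain")


def output_guardrail(answer: str) -> GuardrailResult:
    """Trip when the answer contains a phrase that breaks policy."""
    text = answer.lower()
    for phrase in FORBIDDEN_PHRASES:
        if phrase in text:
            return GuardrailResult(True, f"response contains {phrase!r}")
    return GuardrailResult(False)


def fake_agent(question: str) -> str:
    """Stand-in for the FAQ agent; always returns a canned answer."""
    return "You can still join the course after the start date."


def guarded_answer(question: str, agent=fake_agent) -> str:
    """Check the input, call the agent, then check the output."""
    check = input_guardrail(question)
    if check.tripwire_triggered:
        return f"Blocked: {check.reason}"
    answer = agent(question)
    check = output_guardrail(answer)
    if check.tripwire_triggered:
        return f"Blocked: {check.reason}"
    return answer
```

Calling `guarded_answer("What is the best pizza topping?")` returns a message starting with `Blocked:` because the input check trips before the agent is ever called. A policy-violating answer is caught the same way by the output check.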
Walkthrough
Follow the numbered files in order. Each file is one self-contained step.
- Overview and setup - prerequisites, environment setup, and where guardrails run.
- Part 1: Load FAQ documents - download docs.py, read the DataTalks.Club FAQ from GitHub, and look at the loader shape.
- Part 2: Base FAQ agent - index the FAQ with minsearch, expose search_faq, create the FAQ assistant, and show the off-topic failure mode.
- Part 3: Input guardrail - build a topic classifier with structured output and attach it as an SDK input guardrail.
- Part 4: Tripwire handling - catch input tripwire exceptions and turn blocked requests into user-facing messages.
- Part 5: Output guardrail - add an output guardrail for deadline extensions, legal advice, personal information, offensive language, and homework policy.
- Part 6: Multiple guardrails - chain more than one output guardrail with an academic-integrity check.
- Part 7: Streaming with guardrails - run the fully guarded agent in streaming mode and handle blocked streaming calls.
- Part 8: Tool-based guardrails - express the topic check as a normal tool for frameworks without guardrail support.
- Q&A: side discussions - additional implementation questions: framework choice, input size checks, guardrail function parameters, multiple guardrails, tool reliability, mock guardrails, and parallel call cost.
- Part 9: Async primer - compare sequential and concurrent guardrail execution with asyncio.gather.
- Part 10: Cancel on tripwire - turn a failing guardrail into an exception and cancel the agent task.
- Part 11: DIY runner for the FAQ agent - build a guardrail runner that works across frameworks and connect it back to the FAQ agent.
- Summary and next steps - takeaways, when to use each guardrail type, and links to learn more.
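The cancel-on-tripwire idea from Parts 9 and 10 can be sketched with plain asyncio. This is a minimal sketch, not the workshop's runner: `GuardrailTripped`, `topic_guardrail`, `slow_agent`, and `run_with_guardrails` are hypothetical names, a keyword check stands in for the LLM classifier, and `asyncio.sleep` stands in for model latency.

```python
import asyncio


class GuardrailTripped(Exception):
    """Raised when a guardrail check fails (hypothetical name)."""


async def topic_guardrail(question: str) -> None:
    # Assumption: a keyword check replaces the LLM topic classifier.
    await asyncio.sleep(0.01)
    if "zoomcamp" not in question.lower():
        raise GuardrailTripped("off-topic question")


async def slow_agent(question: str, log: list) -> str:
    # Stand-in for the FAQ agent; the sleep simulates slow model calls.
    try:
        await asyncio.sleep(0.2)
        return f"Answer to: {question}"
    except asyncio.CancelledError:
        log.append("agent cancelled")
        raise  # re-raise so the task is marked as cancelled


async def run_with_guardrails(question: str, log: list) -> str:
    # Start the agent and the guardrail at the same time.
    agent_task = asyncio.create_task(slow_agent(question, log))
    guard_task = asyncio.create_task(topic_guardrail(question))
    try:
        await guard_task
    except GuardrailTripped as exc:
        # The guardrail failed first: stop the agent instead of paying for it.
        agent_task.cancel()
        # Wait until the cancellation has actually been processed.
        await asyncio.gather(agent_task, return_exceptions=True)
        return f"Blocked: {exc}"
    return await agent_task
```

Running `asyncio.run(run_with_guardrails("What is the best pizza?", []))` returns the blocked message after roughly the guardrail's latency, not the agent's. Because both tasks start together, an on-topic question pays no extra wait for the check.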