Building Safe AI Agents with Guardrails
Build safe AI agents with input and output guardrails. Learn how to prevent inappropriate responses, enforce policies, and maintain academic integrity.
Timestamps
Setting up the environment and understanding OpenAI Agents SDK requirements
Introduction to guardrails as safety checks for AI agents: what they are for and why they matter
Creating the base FAQ assistant with search capabilities using course data
Building the agent system using OpenAI Agents SDK with tool integration
Implementing input guardrails to block irrelevant or harmful queries before the agent processes them
Adding output guardrails to validate responses before users see them, preventing policy violations
Understanding how guardrails analyze context and metadata to make decisions
Building framework-agnostic guardrail implementation using tool calls for frameworks without native support
Optimizing guardrail execution by running multiple guardrails concurrently with asyncio to avoid added latency
Recap of key concepts, implementation patterns, and information about future workshops
Core Tools
What You'll Learn
- Defining guardrails as LLM-based safety checks
- Implementing input guardrails to block irrelevant or harmful queries
- Implementing output guardrails to validate responses
- Preventing inappropriate promises like deadline extensions or legal advice
- Enforcing academic integrity by blocking homework-writing
- Chaining multiple guardrails with early stop behavior
- Running guardrails safely alongside streaming responses
- Implementing tool-based guardrails for frameworks without native support
- Using asyncio to run guardrails concurrently
- Cancelling the agent early when a guardrail trips
- Building a framework-agnostic DIY guardrail runner
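
Several of the items above (running guardrails concurrently with asyncio, cancelling the agent early, and a framework-agnostic runner) can be sketched with the standard library alone. This is a minimal illustration, not the workshop's actual code: `GuardrailResult`, `GuardrailTripped`, `topic_guardrail`, `homework_guardrail`, `fake_agent`, and `run_with_guardrails` are all hypothetical names, and the keyword checks stand in for what would be LLM-based checks in practice.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class GuardrailResult:
    name: str
    tripped: bool
    reason: str = ""


class GuardrailTripped(Exception):
    """Raised when any guardrail rejects the query."""

    def __init__(self, result: GuardrailResult):
        super().__init__(f"{result.name}: {result.reason}")
        self.result = result


async def topic_guardrail(query: str) -> GuardrailResult:
    # Assumption: a real guardrail would call an LLM here; sleep simulates latency.
    await asyncio.sleep(0.01)
    off_topic = "pizza" in query.lower()
    return GuardrailResult("topic", off_topic, "off-topic question" if off_topic else "")


async def homework_guardrail(query: str) -> GuardrailResult:
    await asyncio.sleep(0.01)
    cheating = "write my homework" in query.lower()
    return GuardrailResult(
        "academic_integrity", cheating, "homework-writing request" if cheating else ""
    )


async def fake_agent(query: str) -> str:
    # Stand-in for a slow agent call in any framework.
    await asyncio.sleep(0.2)
    return f"Answer to: {query}"


async def run_with_guardrails(agent, query, guardrails):
    # Start the agent and every guardrail at once so the checks add no serial latency.
    agent_task = asyncio.create_task(agent(query))
    guard_tasks = [asyncio.create_task(g(query)) for g in guardrails]
    try:
        # Inspect guardrail results in completion order; stop at the first trip.
        for finished in asyncio.as_completed(guard_tasks):
            result = await finished
            if result.tripped:
                raise GuardrailTripped(result)
        return await agent_task
    finally:
        # Cancel whatever is still running (the agent, if a guardrail tripped).
        pending = [t for t in (agent_task, *guard_tasks) if not t.done()]
        for task in pending:
            task.cancel()
        await asyncio.gather(*pending, return_exceptions=True)
```

Calling `asyncio.run(run_with_guardrails(fake_agent, query, [topic_guardrail, homework_guardrail]))` returns the agent's answer when every check passes and raises `GuardrailTripped` otherwise. Because the runner only needs async callables, the same pattern can wrap an agent from any framework.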
Expected Outcome
A DataTalks.Club FAQ assistant protected by input and output guardrails that block off-topic questions, unsafe or policy-violating responses, and academic dishonesty. It supports multiple guardrails with clear failure handling, works with streaming, and includes a reusable async pattern for adding guardrails to any agent framework.
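
As a rough illustration of "multiple guardrails with clear failure handling" on the output side, the sketch below chains checks and stops at the first failure. It assumes simple keyword rules in place of LLM-based checks, and every name in it (`Verdict`, `no_deadline_promises`, `no_legal_advice`, `check_output`, `deliver`) is hypothetical rather than taken from the workshop or any SDK.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Verdict:
    guardrail: str
    reason: str


def no_deadline_promises(response: str) -> Optional[Verdict]:
    # Assumption: a keyword match stands in for an LLM-based policy check.
    if "deadline extension" in response.lower():
        return Verdict("deadline_promises", "response promises a deadline extension")
    return None


def no_legal_advice(response: str) -> Optional[Verdict]:
    if "legal advice" in response.lower():
        return Verdict("legal_advice", "response offers legal advice")
    return None


def check_output(
    response: str, guardrails: List[Callable[[str], Optional[Verdict]]]
) -> Optional[Verdict]:
    # Run the guardrails in order and stop at the first one that fails.
    for guardrail in guardrails:
        verdict = guardrail(response)
        if verdict is not None:
            return verdict
    return None


def deliver(response: str, guardrails) -> str:
    # Replace a blocked response with a message that names the failed check.
    verdict = check_output(response, guardrails)
    if verdict is not None:
        return f"Response blocked by {verdict.guardrail}: {verdict.reason}"
    return response
```

Ordering the cheapest or most frequently triggered checks first keeps the early stop effective, since later guardrails are skipped once one fails.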