Summary and next steps
The workshop builds the same safety idea in several forms. Start with the SDK guardrails when your framework supports them. Reach for tool-based or DIY guardrails when you need the pattern in a different framework.
Takeaways
The main points:
- Guardrails can be agents. They use LLMs to make pass/fail decisions.
- Input guardrails block irrelevant or harmful input before the main agent sees it.
- Output guardrails catch unsafe responses before the user sees them.
- Tool-based guardrails work in any framework that supports tools.
- Parallel guardrails reduce added latency.
- Cancellation saves tokens when a guardrail trips before the agent finishes.
The FAQ assistant made these concrete. The same base agent could answer course questions, drift into a pizza recipe, or produce a risky policy answer. Each guardrail moved one of those outcomes into an explicit check.
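The pass/fail shape of a guardrail can be sketched as a small function; here a keyword check stands in for the LLM call, and all names are hypothetical rather than taken from the workshop code:

```python
from dataclasses import dataclass

@dataclass
class GuardrailResult:
    passed: bool
    reason: str

def topic_guardrail(user_message: str) -> GuardrailResult:
    """Input guardrail: pass only course-related questions.
    A real version would ask an LLM to classify the message;
    a keyword check stands in for that call here."""
    keywords = ("course", "homework", "module", "deadline")
    if any(word in user_message.lower() for word in keywords):
        return GuardrailResult(True, "course-related")
    return GuardrailResult(False, "off-topic request")
```

With this shape, `topic_guardrail("When is the course deadline?")` passes while a pizza-recipe request fails, mirroring the FAQ assistant's drift example.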
Picking a guardrail type
Use input guardrails for checks on the user message:
- Topic restriction, such as "only answer about this course".
- PII filtering before the agent sees the input.
- Blocking requests that should not start an agent run.
Use output guardrails for checks on the agent response:
- Deadline extensions, refunds, legal advice, or medical advice.
- Offensive language.
- Personal information leakage.
- Response format or style policies.
- Homework or exam answers.
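One of these checks, personal information leakage, can be sketched as a pattern match on the agent's response; the names are hypothetical, and a production guardrail might use an LLM judge instead of a regex:

```python
import re

# Loose email pattern; good enough for a demonstration, not for production.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def pii_output_guardrail(response: str) -> tuple[bool, str]:
    """Output guardrail: block responses that leak an email address.
    Returns (passed, reason)."""
    if EMAIL_RE.search(response):
        return False, "response contains an email address"
    return True, "ok"
```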
Use both when the application has meaningful user-facing risk. A course assistant is a mild example. Customer support, medical triage, financial advice, and internal tools with private data need stricter versions of the same pattern.
Framework choices
The OpenAI Agents SDK gives you first-class input and output guardrail hooks. That is the cleanest option when you are already using the SDK.
The tool-based pattern is simpler to move across frameworks, but it relies on the agent following the instruction to call the guardrail tool first. Use it when you need something quick or when the framework has no guardrail hook.
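The tool-based pattern can be sketched as an ordinary function registered as a tool; the function name and the allowed/blocked string protocol here are hypothetical:

```python
def check_topic(question: str) -> str:
    """Guardrail exposed as a tool in any tool-capable framework.
    Returns 'allowed' or 'blocked: <reason>' for the agent to act on."""
    if "refund" in question.lower():
        return "blocked: refund decisions must go through support"
    return "allowed"
```

The system prompt then carries the contract, for example: "Always call check_topic on the user's question first; if it returns blocked, refuse and explain." As noted above, nothing in the framework forces the agent to honor that instruction, which is the pattern's main weakness.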
The DIY asyncio runner is useful when you have async agent calls and want cancellation. It takes more code, but it makes the mechanics of the run explicit:
- Start the agent and guardrails together.
- Return the agent result when all checks pass.
- Cancel unfinished work when a guardrail raises.
- Let the caller decide how to present the blocked reason.
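The four steps above can be sketched with plain asyncio; the function name and exception type are hypothetical, not taken from the workshop code:

```python
import asyncio

class GuardrailTripped(Exception):
    """Raised by a guardrail coroutine to block the run."""

async def run_with_guardrails(agent_coro, guardrail_coros):
    """Start the agent and its guardrails together; return the agent
    result when all checks pass; cancel unfinished work when a
    guardrail raises."""
    agent_task = asyncio.create_task(agent_coro)
    guard_tasks = [asyncio.create_task(g) for g in guardrail_coros]
    try:
        # Each guardrail either returns (pass) or raises GuardrailTripped.
        await asyncio.gather(*guard_tasks)
        return await agent_task
    except GuardrailTripped:
        # A check tripped: stop everything still in flight.
        for task in (agent_task, *guard_tasks):
            task.cancel()
        await asyncio.gather(agent_task, *guard_tasks, return_exceptions=True)
        raise  # the caller decides how to present the blocked reason
```

A caller wraps `run_with_guardrails` in `try/except GuardrailTripped` and renders the blocked reason however the application requires, which keeps presentation out of the runner itself.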
To learn more
Two concrete references are useful after the workshop:
- The workshop is part of AI Bootcamp: From RAG to Agents.
- The workshop repo also points to the AI Hero email course for a deeper walkthrough of docs.py and the first-agent setup.
The next topic after guardrails is building Skills.md from scratch.