Part 8: Tool-based guardrails

Some frameworks do not have built-in input and output guardrails. A simple alternative is to expose the check as a normal tool and instruct the agent to call it before answering.

This part moves from the SDK's built-in guardrails to a version that works in other frameworks. The guardrail idea carries over even when the framework exposes no first-class guardrail hooks.

Create the guardrail tool

Reuse the existing topic_guardrail_agent:

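# Assumes the OpenAI Agents SDK. topic_guardrail_agent and
# TopicGuardrailOutput are defined in the earlier guardrail part.
from agents import Runner, function_tool

@function_tool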
async def check_topic(query: str) -> TopicGuardrailOutput:
    """Check if the query is appropriate for this course.

    Args:
        query: The user's question to check.

    Returns:
        TopicGuardrailOutput with fail flag and reasoning.
    """
    result = await Runner.run(topic_guardrail_agent, query)
    return result.final_output

Use await Runner.run here because the tool function is async. If you adapt this pattern to a sync tool, the call must be blocking as well: a sync function cannot await, so it needs a synchronous way to run the classifier.
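To make the sync and async call styles concrete, here is a minimal, framework-free sketch. Everything in it is an assumption for illustration: TopicCheck stands in for TopicGuardrailOutput, and classify_topic is a keyword stub where a real tool would run the guardrail agent.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class TopicCheck:
    """Hypothetical stand-in for TopicGuardrailOutput."""
    fail: bool
    reasoning: str


async def classify_topic(query: str) -> TopicCheck:
    """Stub classifier. A real one would call the guardrail agent."""
    await asyncio.sleep(0)  # placeholder for the awaited model call
    keywords = ("homework", "course", "deadline")
    if any(word in query.lower() for word in keywords):
        return TopicCheck(fail=False, reasoning="Question is about the course.")
    return TopicCheck(fail=True, reasoning="Question is not about the course.")


async def check_topic_async(query: str) -> TopicCheck:
    """Async tool body: await the classifier directly."""
    return await classify_topic(query)


def check_topic_sync(query: str) -> TopicCheck:
    """Sync tool body: run the coroutine to completion with asyncio.run.

    This only works when no event loop is already running in the
    current thread, so it is not a drop-in choice for every framework.
    """
    return asyncio.run(classify_topic(query))
```

The async version fits frameworks that await their tools. The sync version suits frameworks that call tools as plain functions from a thread without a running event loop.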

Tell the agent to check first

Extend the FAQ instructions with a rule:

guarded_faq_instructions = faq_instructions + """

IMPORTANT: Before answering any question, use the check_topic tool first.
- If check_topic returns fail=True, respond with the reasoning and stop.
- If check_topic returns fail=False, proceed to search the FAQ and answer.
"""

The all-caps IMPORTANT is part of the model instructions. It marks this as a rule the agent must follow before answering, but it is emphasis in a prompt, not enforcement in code.

Create an agent with both tools:

faq_agent_with_guardrail = Agent(
    name="faq_assistant",
    instructions=guarded_faq_instructions,
    tools=[check_topic, search_faq],
    model="gpt-4o-mini",
)

The model is expected to call check_topic first, then either stop or call search_faq.
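Because the call order is only expected, not guaranteed, it can help to check it after a run. The sketch below assumes you have already extracted the tool names, in call order, from the run's trace or logs. How to get that list depends on the framework, and guardrail_order_ok is a hypothetical helper, not part of any SDK.

```python
def guardrail_order_ok(tool_calls: list[str]) -> bool:
    """Return True when check_topic ran, and ran before any search_faq call.

    tool_calls is the ordered list of tool names the agent invoked.
    """
    if "check_topic" not in tool_calls:
        # The agent skipped the guardrail entirely.
        return False
    first_check = tool_calls.index("check_topic")
    # No FAQ search may appear before the first topic check.
    return "search_faq" not in tool_calls[:first_check]
```

A run that calls only check_topic and then stops also passes, which matches the off-topic path in the instructions.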

Test the tool-based guardrail

Run a course question:

result = await Runner.run(faq_agent_with_guardrail, "How do I submit homework?")
print(result.final_output)

Then run an unrelated question:

result = await Runner.run(faq_agent_with_guardrail, "How can I cook pizza?")
print(result.final_output)

The unrelated question should return a message saying the pizza question is not related to data engineering or course content. That is the desired result: the guardrail tool gives the agent a reason to stop.

Tradeoffs

This pattern works because tools are widely supported:

  • Nearly every agent framework has some equivalent of tools.
  • The check is easy to review as a normal tool call.
  • You can reuse the same classifier agent from the SDK guardrail path.

The limits matter too:

  • The agent has to follow the instruction and call check_topic.
  • The check runs before the answer, so the user waits for both calls.
  • A prompt injection can try to convince the agent to skip the check.
  • It is less reliable than a framework-enforced guardrail.
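One way to reduce the dependence on the model following instructions is to run the check inside the search tool itself, so a search cannot happen without it. This is a sketch of that variation, not the workshop's implementation: TopicCheck, make_guarded_search, stub_check, and stub_search are all hypothetical names, and the stubs replace the real classifier and FAQ search.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TopicCheck:
    """Hypothetical stand-in for TopicGuardrailOutput."""
    fail: bool
    reasoning: str


def make_guarded_search(
    check: Callable[[str], TopicCheck],
    search: Callable[[str], str],
) -> Callable[[str], str]:
    """Wrap a search function so the topic check always runs first."""

    def guarded_search(query: str) -> str:
        verdict = check(query)
        if verdict.fail:
            # Return the refusal as the tool result instead of searching.
            return f"Refused: {verdict.reasoning}"
        return search(query)

    return guarded_search


def stub_check(query: str) -> TopicCheck:
    """Stub classifier: blocks anything that mentions pizza."""
    if "pizza" in query.lower():
        return TopicCheck(fail=True, reasoning="Not related to the course.")
    return TopicCheck(fail=False, reasoning="Course question.")


def stub_search(query: str) -> str:
    """Stub FAQ search."""
    return f"FAQ results for: {query}"


guarded_search_faq = make_guarded_search(stub_check, stub_search)
```

The wrapper only sees the query the model passes to the tool, which may differ from the user's original message, and the model can still answer without calling any tool. It narrows the gap but does not replace a framework-enforced guardrail.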

Use this for quick prototypes, simple flows, or frameworks where you cannot enforce a guardrail outside the model. The next part shows how to remove the sequential wait with asyncio.
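As a rough preview of the idea, the sequential wait goes away when the check and the answer start at the same time. The sketch below uses only asyncio with stub coroutines. run_check, run_answer, and answer_with_parallel_check are hypothetical names, and the short sleeps stand in for model calls.

```python
import asyncio


async def run_check(query: str) -> bool:
    """Stub guardrail: returns True when the query should be blocked."""
    await asyncio.sleep(0.01)  # stands in for the classifier call
    return "pizza" in query.lower()


async def run_answer(query: str) -> str:
    """Stub answering agent."""
    await asyncio.sleep(0.01)  # stands in for the FAQ agent call
    return f"Answer to: {query}"


async def answer_with_parallel_check(query: str) -> str:
    """Run the check and the answer concurrently, then decide."""
    blocked, answer = await asyncio.gather(run_check(query), run_answer(query))
    if blocked:
        # Discard the answer that was produced in parallel.
        return "Sorry, that question is outside the course."
    return answer
```

With asyncio.gather the total wait is roughly that of the slower call rather than the sum of both. This version always lets the answer finish, so a blocked query still pays for an answer that is thrown away.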

Continue with Q&A: side discussions for additional implementation questions, or go straight to Part 9: Async primer for the async implementation.
