In the previous part, we started the input guardrail and the FAQ agent together.

If the input guardrail blocks the request, we cancel the FAQ agent. That stops it from spending more tokens on an answer we won't show.

That handles bad inputs, but we still need another check after the agent answers.

Output guardrails check the answer before the user sees it. We use them when the question is allowed, but the answer may still violate the product policy.

For example, a learner can ask about deadlines, and that question belongs to the course FAQ. But the assistant shouldn't promise a deadline extension.

In this lesson, we add output guardrails that reuse these classes from guardrails.py:

GuardrailDecision
InputGuardrail
LLMInputGuardrail
GuardedAgent
RunnableAgent

The new code is small: an output guardrail interface, an LLM-backed implementation of it, and an extra step in the wrapper that checks the answer after the FAQ agent returns it.

Safety policy

We already have LLMInputGuardrail, and the output guardrail uses the same structured-output pattern with a different method name.

For output checking, we only need different instructions:

safety_guardrail_instructions = """
Decide if the FAQ assistant answer is safe to show to the learner.

Block answers that promise deadline extensions, refunds, grades,
private student information, medical advice, legal advice, or full
homework solutions.

Allow answers that explain the official policy, point to the FAQ, or ask
the learner to contact course staff.
""".strip()

This policy is separate from the topic policy. A question can be on topic and still lead to an unsafe answer.

Output guardrail interface

Input and output guardrails protect different points in the agent flow:

Input guardrails get the user question.
Output guardrails get the agent answer.

That difference is important enough to make it visible in the interface.

Add this protocol to guardrails.py:

class OutputGuardrail(Protocol):
    async def check_output(self, answer: str) -> GuardrailDecision:
        ...
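
Because the interface is a Protocol, any class with a matching check_output method counts as an output guardrail, not only an LLM-backed one. The sketch below illustrates that with a rule-based check. PhraseOutputGuardrail is a hypothetical name, and GuardrailDecision is a stand-in dataclass that assumes only the two fields this lesson uses (reasoning and fail); the real class lives in guardrails.py.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class GuardrailDecision:
    # Stand-in for the class in guardrails.py (assumed fields only).
    reasoning: str
    fail: bool


class OutputGuardrail(Protocol):
    async def check_output(self, answer: str) -> GuardrailDecision:
        ...


class PhraseOutputGuardrail:
    """Hypothetical rule-based guardrail: fails answers containing a listed phrase."""

    def __init__(self, blocked_phrases: list[str]):
        self.blocked_phrases = [phrase.lower() for phrase in blocked_phrases]

    async def check_output(self, answer: str) -> GuardrailDecision:
        lowered = answer.lower()
        for phrase in self.blocked_phrases:
            if phrase in lowered:
                return GuardrailDecision(
                    reasoning=f"contains blocked phrase: {phrase!r}",
                    fail=True,
                )
        return GuardrailDecision(reasoning="no blocked phrase found", fail=False)
```

A phrase list is far cruder than the LLM check built in this lesson, but it satisfies the same interface, so both kinds can sit in the same output_guardrails list.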

Safety guardrail

The output guardrail uses the same structured decision as the input guardrail. It receives an answer instead of a question.

Add the LLM output guardrail to guardrails.py:

class LLMOutputGuardrail(OutputGuardrail):
    def __init__(
        self,
        openai_client: AsyncOpenAI,
        instructions: str,
        name: str,
    ):
        self.openai_client = openai_client
        self.instructions = instructions
        self.name = name

    async def check_output(self, answer: str) -> GuardrailDecision:
        print(f"[output:{self.name}] checking:", answer)

        response = await self.openai_client.responses.parse(
            model="gpt-4o-mini",
            input=[
                {"role": "developer", "content": self.instructions},
                {"role": "user", "content": answer},
            ],
            text_format=GuardrailDecision,
        )

        decision = response.output_parsed
        print(f"[output:{self.name}] decision:", decision)

        return decision

Create the safety guardrail:

from guardrails import LLMOutputGuardrail

safety_guardrail = LLMOutputGuardrail(
    openai_client=openai_client,
    instructions=safety_guardrail_instructions,
    name="safety",
)

Test it before putting it behind the agent:

answer = """
Yes, I can grant you a deadline extension for the project.
""".strip()

decision = await safety_guardrail.check_output(answer)
decision

The guardrail should return fail=True for this answer, because it promises a deadline extension and the safety policy blocks that.

Add output checks to the guarded agent

Now update GuardedAgent so it accepts both an input guardrail list and an output guardrail list:

class GuardedAgent(RunnableAgent):
    def __init__(
        self,
        agent: RunnableAgent,
        input_guardrails: list[InputGuardrail] | None = None,
        output_guardrails: list[OutputGuardrail] | None = None,
    ):
        self.agent = agent
        self.input_guardrails = input_guardrails or []
        self.output_guardrails = output_guardrails or []

    async def run(self, question: str) -> str:
        # Start every input check and the agent at the same time.
        guardrail_tasks = [
            asyncio.create_task(guardrail.check_input(question))
            for guardrail in self.input_guardrails
        ]
        agent_task = asyncio.create_task(self.agent.run(question))

        # Handle input decisions in the order they finish.
        for task in asyncio.as_completed(guardrail_tasks):
            decision = await task

            if decision.fail:
                # Stop the agent so it doesn't spend more tokens.
                agent_task.cancel()

                try:
                    await agent_task
                except asyncio.CancelledError:
                    pass

                return f"[INPUT BLOCKED] {decision.reasoning}"

        # All input checks passed, so wait for the full answer.
        answer = await agent_task

        # Check the finished answer one guardrail at a time.
        for guardrail in self.output_guardrails:
            decision = await guardrail.check_output(answer)

            if decision.fail:
                return "[OUTPUT BLOCKED] I cannot provide that answer."

        return answer

This wrapper works even when the agent comes from another framework. The framework can keep its own tool loop. GuardedAgent still checks the question while that loop starts and checks the answer after it finishes.
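
To make the "another framework" point concrete, here is a minimal sketch of an adapter. It assumes RunnableAgent only requires an async run(question) -> str method, which is inferred from how GuardedAgent calls the agent. CallableAgentAdapter and framework_agent are hypothetical names; framework_agent stands in for whatever entry point a real framework exposes.

```python
import asyncio
from typing import Awaitable, Callable, Protocol


class RunnableAgent(Protocol):
    # Assumed shape of the interface in guardrails.py.
    async def run(self, question: str) -> str:
        ...


class CallableAgentAdapter:
    """Hypothetical adapter: exposes any async callable as run(question) -> str."""

    def __init__(self, call: Callable[[str], Awaitable[str]]):
        self.call = call

    async def run(self, question: str) -> str:
        return await self.call(question)


async def framework_agent(question: str) -> str:
    # Placeholder: a real framework would run its own tool loop here.
    await asyncio.sleep(0)
    return f"FAQ answer for: {question}"


adapted_agent = CallableAgentAdapter(framework_agent)
```

With an adapter like this, adapted_agent could be passed as the agent argument of GuardedAgent in the same way as the FAQ agent.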

Create the output-guarded agent:

from guardrails import GuardedAgent

output_guarded_agent = GuardedAgent(
    agent=agent,
    output_guardrails=[safety_guardrail],
)

Run it with a deadline question:

await output_guarded_agent.run(
    "I'm running late on my project. Can I get a deadline extension?"
)

The output check runs after the FAQ agent finishes, so it doesn't stop the agent from doing work or save any tokens. It only stops unsafe answers from reaching the user.
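
The sketch below shows this behavior without calling an API. It uses stubs only: CountingAgent, AlwaysBlock, and run_with_output_checks are hypothetical names, GuardrailDecision is a stand-in with the two fields used in this lesson, and the helper copies just the output half of GuardedAgent.run.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class GuardrailDecision:
    # Stand-in for the class in guardrails.py (assumed fields only).
    reasoning: str
    fail: bool


class CountingAgent:
    """Stub agent that records how many times it ran."""

    def __init__(self, answer: str):
        self.answer = answer
        self.calls = 0

    async def run(self, question: str) -> str:
        self.calls += 1
        return self.answer


class AlwaysBlock:
    """Stub output guardrail that fails every answer."""

    async def check_output(self, answer: str) -> GuardrailDecision:
        return GuardrailDecision(reasoning="stub: always block", fail=True)


async def run_with_output_checks(agent, output_guardrails, question: str) -> str:
    # Same order as the output half of GuardedAgent.run:
    # get the full answer first, then check it.
    answer = await agent.run(question)
    for guardrail in output_guardrails:
        decision = await guardrail.check_output(answer)
        if decision.fail:
            return "[OUTPUT BLOCKED] I cannot provide that answer."
    return answer


async def demo() -> tuple[str, int]:
    agent = CountingAgent("Yes, I can grant you a deadline extension.")
    result = await run_with_output_checks(
        agent, [AlwaysBlock()], "Can I get a deadline extension?"
    )
    return result, agent.calls
```

Running demo() returns the blocked message together with a call count of 1: the agent did all of its work, and only the delivery of the answer was stopped.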

Input and output

GuardedAgent accepts either list on its own, or both together.

Input guardrails check the question while the FAQ agent starts working:

input_guarded_agent = GuardedAgent(
    agent=agent,
    input_guardrails=[topic_guardrail],
)

Output guardrails check the answer after the FAQ agent finishes:

output_guarded_agent = GuardedAgent(
    agent=agent,
    output_guardrails=[safety_guardrail],
)

We can put both lists on one guarded agent:

fully_guarded_agent = GuardedAgent(
    agent=agent,
    input_guardrails=[topic_guardrail],
    output_guardrails=[safety_guardrail],
)

fully_guarded_agent is still a RunnableAgent. The input guardrail list adds checks that run alongside the main agent call, and the output guardrail list adds checks that run after it.

Exercise

Build the output guardrail and test it with answers that should pass and answers that should be blocked.

Use these example answers:

To set up Docker, follow the course setup guide and check the FAQ if your container doesn't start.
Yes, I can grant you a deadline extension for the project.
Here's the full homework solution you can submit.
The FAQ doesn't say whether extensions are available. Please contact course staff.
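
One possible way to run all four answers at once is a small helper like the one below. It is only a sketch: check_answers and StubSafetyGuardrail are hypothetical names, GuardrailDecision is a stand-in with the two fields used in this lesson, and the stub exists so the helper runs without an API key. In the notebook, safety_guardrail would take the stub's place. The expected pass/block pattern in the comment is a reading of the safety policy, not something the LLM is guaranteed to reproduce.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class GuardrailDecision:
    # Stand-in for the class in guardrails.py (assumed fields only).
    reasoning: str
    fail: bool


async def check_answers(guardrail, answers: list[str]) -> list[tuple[str, bool]]:
    # Run all checks concurrently; gather keeps results in input order.
    decisions = await asyncio.gather(
        *(guardrail.check_output(answer) for answer in answers)
    )
    return [(answer, decision.fail) for answer, decision in zip(answers, decisions)]


class StubSafetyGuardrail:
    """Stand-in for the LLM guardrail so the helper runs offline."""

    async def check_output(self, answer: str) -> GuardrailDecision:
        blocked = "grant you" in answer or "full homework solution" in answer
        return GuardrailDecision(reasoning="stub rule", fail=blocked)


example_answers = [
    "To set up Docker, follow the course setup guide and check the FAQ "
    "if your container doesn't start.",
    "Yes, I can grant you a deadline extension for the project.",
    "Here's the full homework solution you can submit.",
    "The FAQ doesn't say whether extensions are available. "
    "Please contact course staff.",
]

# Expected under the safety policy: pass, block, block, pass.
```

In a notebook, `await check_answers(safety_guardrail, example_answers)` would return each answer paired with its fail flag.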

Then add safety_guardrail to GuardedAgent and try a deadline question.

Example question:

I'm running late on my project. Can I get a deadline extension?

The safe behavior is to block any answer that promises a deadline extension. A safe answer can still explain the official policy and point the learner to course staff.

After the wrapper works, make sure OutputGuardrail, LLMOutputGuardrail, and the updated GuardedAgent are all in guardrails.py. Then import them back into the notebook before creating the guarded agent.

Next we combine multiple guardrails, so topic checks, output checks, and academic integrity checks work together.
