Tool-based guardrails run the check first and the agent second, so the user waits for both. To avoid that added latency, we can run the guardrail and the agent concurrently, the same way framework guardrails usually run under the hood.

Mock a slow agent

Start with a fake agent that sleeps for two seconds:

import asyncio

async def mock_agent(input: str) -> str:
    """Simulates an agent that takes time to process."""
    print(f"[Agent] Starting work on: {input}")
    await asyncio.sleep(2)
    print("[Agent] Done!")
    return f"Response to: {input}"

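Run it once (a top-level await like this works in a notebook or in the python -m asyncio REPL, but not in a plain script):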

result = await mock_agent("hello")
print(result)

You should see the start message, the done message, and the returned response.
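If you are following along in a plain .py file rather than a notebook, the top-level await above is a syntax error. One way around that, sketched here, is to hand the coroutine to asyncio.run. The main wrapper is a hypothetical helper added for this example; mock_agent is copied from above.

```python
import asyncio


async def mock_agent(input: str) -> str:
    """Simulates an agent that takes time to process."""
    print(f"[Agent] Starting work on: {input}")
    await asyncio.sleep(2)
    print("[Agent] Done!")
    return f"Response to: {input}"


def main() -> str:
    # asyncio.run starts an event loop, runs the coroutine until it
    # finishes, and then closes the loop again.
    return asyncio.run(mock_agent("hello"))


if __name__ == "__main__":
    print(main())
```

The rest of this part keeps the notebook-style await, so in a script each snippet would need to live inside an async function that is passed to asyncio.run in the same way.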

Mock a passing guardrail

Create a guardrail that sleeps for one second and passes:

async def mock_guardrail(input: str) -> str:
    """Simulates a guardrail that takes time to process."""
    print(f"[Guardrail] Starting work on: {input}")
    await asyncio.sleep(1)
    print("[Guardrail] Good!")
    return f"Guardrail passed: {input}"

Run it by itself:

result_g = await mock_guardrail("hello")
print(result_g)

Now we have two async functions with different durations, which is enough to show the difference between sequential and concurrent execution.

Sequential execution

Run the guardrail first and the agent second:

guardrail_result = await mock_guardrail("hello")
agent_result = await mock_agent("hello")

The total wait is roughly three seconds because the sleeps happen one after the other. That is what the tool-based guardrail pattern does when the agent checks the topic before answering.
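To check the three-second figure rather than take it on faith, the sequential run can be timed with time.perf_counter. This is a minimal sketch: run_sequential, the delay parameters, the text parameter name, and the guardrail's return string are additions for this example (the delays let you shorten the demo), and the print calls are left out to keep the output small.

```python
import asyncio
import time


async def mock_guardrail(text: str, delay: float = 1.0) -> str:
    """Stand-in guardrail; delay is an added parameter (assumption)."""
    await asyncio.sleep(delay)
    return f"Guardrail passed: {text}"


async def mock_agent(text: str, delay: float = 2.0) -> str:
    """Stand-in agent; delay is an added parameter (assumption)."""
    await asyncio.sleep(delay)
    return f"Response to: {text}"


async def run_sequential(
    text: str, guard_delay: float = 1.0, agent_delay: float = 2.0
) -> float:
    """Run the guardrail, then the agent, and return elapsed seconds."""
    start = time.perf_counter()
    await mock_guardrail(text, guard_delay)  # finishes before the agent starts
    await mock_agent(text, agent_delay)
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = asyncio.run(run_sequential("hello"))
    print(f"Sequential: {elapsed:.2f}s")  # expect roughly the sum of both delays
```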

Concurrent execution

Run both coroutines with asyncio.gather:

results = await asyncio.gather(
    mock_guardrail("hello"),
    mock_agent("hello"),
)

The total wait is roughly two seconds, because both sleeps start together and the slower coroutine determines the total duration.

This is why parallel guardrails are faster. When the guardrail passes, the user waits only for the slower of the two calls instead of the guardrail time plus the agent time.
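The same timing check works for the concurrent version. The sketch below also shows that asyncio.gather returns results in the order the awaitables were passed in, not the order they finished, so the guardrail result comes first even though it is only one of two entries in the list. As before, run_concurrent, the delay parameters, the text parameter name, and the guardrail's return string are assumptions made for this example.

```python
import asyncio
import time


async def mock_guardrail(text: str, delay: float = 1.0) -> str:
    """Stand-in guardrail; delay is an added parameter (assumption)."""
    await asyncio.sleep(delay)
    return f"Guardrail passed: {text}"


async def mock_agent(text: str, delay: float = 2.0) -> str:
    """Stand-in agent; delay is an added parameter (assumption)."""
    await asyncio.sleep(delay)
    return f"Response to: {text}"


async def run_concurrent(
    text: str, guard_delay: float = 1.0, agent_delay: float = 2.0
) -> tuple[float, str, str]:
    """Run both coroutines together; return (elapsed, guard, agent)."""
    start = time.perf_counter()
    # Results come back in argument order: guardrail first, agent second.
    guard_result, agent_result = await asyncio.gather(
        mock_guardrail(text, guard_delay),
        mock_agent(text, agent_delay),
    )
    return time.perf_counter() - start, guard_result, agent_result


if __name__ == "__main__":
    elapsed, guard_result, agent_result = asyncio.run(run_concurrent("hello"))
    print(f"Concurrent: {elapsed:.2f}s")  # expect roughly the longer delay
    print(guard_result)
    print(agent_result)
```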

Tasks and cancellation

To cancel a coroutine, wrap it in a task with asyncio.create_task:

agent_task = asyncio.create_task(mock_agent("hello"))
guard_task = asyncio.create_task(mock_guardrail("hello"))

Then wait for both:

await asyncio.gather(agent_task, guard_task)

After they finish, you can read the agent result:

print(agent_task.result())

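Use tasks because cancellation works on tasks: a Task object has a cancel() method, while a bare coroutine passed straight to asyncio.gather leaves you with no handle to cancel. The next step creates a failing guardrail and cancels the agent task when the guardrail raises.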
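As a preview of the mechanics only, here is a minimal sketch of cancelling a task by hand. It is not the tripwire pattern from the next part; there is no guardrail here at all. cancel_demo and the delay parameter are hypothetical names added for this example.

```python
import asyncio


async def mock_agent(text: str, delay: float = 2.0) -> str:
    """Stand-in agent; delay is an added parameter (assumption)."""
    await asyncio.sleep(delay)
    return f"Response to: {text}"


async def cancel_demo(delay: float = 2.0) -> bool:
    """Start the agent as a task, cancel it, and report whether it was cancelled."""
    agent_task = asyncio.create_task(mock_agent("hello", delay))
    await asyncio.sleep(0)  # yield once so the task can start running
    agent_task.cancel()  # request cancellation at the task's current await
    try:
        await agent_task
    except asyncio.CancelledError:
        # Awaiting a cancelled task raises CancelledError in the caller.
        pass
    return agent_task.cancelled()


if __name__ == "__main__":
    print(asyncio.run(cancel_demo()))
```

Calling agent_task.result() on a cancelled task raises CancelledError as well, so check cancelled() first if you are not sure how the task ended.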

Continue with Part 10: Cancel on tripwire.

Part 9: Async primer
