Part 2: Base FAQ agent
Now that the FAQ documents are loaded, we can build the unguarded agent. Start unguarded so you can see what the agent does before guardrails.
Create the search index
Use minsearch for the FAQ index:
from minsearch import Index
faq_index = Index(text_fields=["title", "content", "filename"])
faq_index.fit(faq_documents)
Run a quick search before involving an LLM:
faq_index.search("how do I join the course?")
This separates retrieval debugging from agent debugging. If this search returns reasonable FAQ records, the tool has a good chance of giving the agent useful context.
Turn search into a tool
The OpenAI Agents SDK exposes Python functions to the model with the
@function_tool decorator. The function signature and docstring become
part of the tool description.
from agents import function_tool
from typing import List, Dict
@function_tool
def search_faq(query: str) -> List[Dict]:
"""Search the DataTalks.Club FAQ for relevant answers.
Args:
query: The student's question to search for.
Returns:
List of matching FAQ entries with title and content.
"""
results = faq_index.search(query, num_results=5)
return results
The agent never sees faq_index directly, so it can only call search_faq
with a query string. That boundary is useful because guardrails control
whether the agent gets to run, while tools control what the agent can do
once it runs.
Create the FAQ assistant
The assistant instructions define the role and the expected answer style:
from agents import Agent
faq_instructions = """
You are a helpful teaching assistant for the Data Engineering Zoomcamp.
Your role is to help students by searching the FAQ database for answers to their questions.
When you find relevant FAQs, present them clearly with the title and answer.
If multiple FAQs match, show all of them.
Be friendly and encouraging in your responses.
""".strip()
Create the agent with search_faq:
faq_agent = Agent(
name="faq_assistant",
instructions=faq_instructions,
tools=[search_faq],
model="gpt-4o-mini",
)
We use gpt-4o-mini here, but you can swap in another tool-capable model
as long as you keep the rest of the cells the same.
Run a normal question
Run the agent through Runner.run:
from agents import Runner
result = await Runner.run(faq_agent, "How do I register for the course?")
print(result.final_output)
The answer should come from the FAQ and stay about the Data Engineering Zoomcamp. The exact wording can change with the model and the current FAQ data.
You can look at the SDK items produced during the run:
for item in result.new_items:
print(item)
Use this when you want to confirm that the model called search_faq and see
which query it chose.
You should see a tool call with an argument like register for the course.
The tool output comes back as a separate SDK item, followed by the final
answer. That sequence shows the agent loop in miniature. The model chooses a
tool call, code runs the tool, and the model writes the answer.
Show the off-topic failure mode
Ask an unrelated question:
result = await Runner.run(faq_agent, "How do I cook pizza?")
print(result.final_output)
The agent may first say it can't find a pizza answer in the FAQ. Then it may start writing a basic pizza recipe. That's reasonable generic chatbot behavior, but it's wrong for a course FAQ assistant.
It causes a product problem too. If you put the agent behind a Slack bot or web interface, it shouldn't become a free general-purpose chatbot. You want to pay for course support, not pizza recipes.
The same pattern can show up in other forms:
- A student asks about cooking recipes instead of the course.
- A student asks whether the assistant can promise refunds or deadline extensions.
- A student asks the assistant to write homework for them.
Guardrails give us a place to stop those flows instead of hoping the main assistant prompt covers every case.
Continue with Part 3: Input guardrail to block off-topic questions before the FAQ assistant sees them.