Part 5: Infer tool schemas from Python functions
So far we have defined the OpenAI function schema by hand. Agent frameworks can usually infer that schema from type hints and docstrings, which reduces duplication: the function and its description stay together.
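Frameworks typically do this with Python's introspection machinery. As a rough sketch of the idea (this `infer_schema` helper is hypothetical, not the actual implementation used by the Tools class below), the schema can be derived from `inspect.signature` and the docstring:

```python
import inspect
from typing import get_type_hints

# Hypothetical helper sketching how a framework might infer an
# OpenAI-style function schema; real frameworks handle more types
# and parse parameter descriptions out of the docstring too.
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def infer_schema(fn):
    """Build a function-calling schema from type hints and the docstring."""
    hints = get_type_hints(fn)
    sig = inspect.signature(fn)

    properties = {}
    for name in sig.parameters:
        py_type = hints.get(name, str)
        properties[name] = {"type": PY_TO_JSON.get(py_type, "string")}

    return {
        "type": "function",
        "name": fn.__name__,
        "description": (inspect.getdoc(fn) or "").split("\n")[0],
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": list(sig.parameters),
        },
    }

def search(query: str) -> list:
    """Search the FAQ database for entries matching the given query."""
    return []

print(infer_schema(search))
```

The name, description, and parameter types all come from the function itself, so there is nothing to keep in sync by hand.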
Add type hints and a docstring to search:
from typing import Any, Dict, List

def search(query: str) -> List[Dict[str, Any]]:
    """
    Search the FAQ database for entries matching the given query.

    Args:
        query (str): Search query text to look up in the course FAQ.

    Returns:
        List[Dict[str, Any]]: A list of search result entries, each containing relevant metadata.
    """
    boost = {"question": 3.0, "section": 0.5}

    results = index.search(
        query=query,
        filter_dict={"course": "data-engineering-zoomcamp"},
        boost_dict=boost,
        num_results=5,
        output_ids=True
    )

    return results
Now register the function without passing search_tool:
agent_tools = Tools()
agent_tools.add_tool(search)
agent_tools.get_tools()
agent_tools.get_tools() returns the schema that will be sent to the
model. In practice, it plays the same role as the earlier search_tool
dict.
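For the `search` function above, the inferred schema should look roughly like this (the exact shape below is an assumption about what `get_tools()` produces, not a verbatim dump; field names can vary by framework version):

```python
# Assumed shape of agent_tools.get_tools() for the search function.
tools = [
    {
        "type": "function",
        "name": "search",
        "description": "Search the FAQ database for entries matching the given query.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Search query text to look up in the course FAQ.",
                }
            },
            "required": ["query"],
        },
    }
]
```

Note how each piece maps back to the function: the name from `__name__`, the description from the docstring summary, and the `query` parameter from the type hint and its `Args:` entry.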
Add a tool that changes the index
Plain RAG retrieves information and uses it to answer. Agents can also perform actions. Add a second tool that appends a new FAQ entry to the in-memory index:
def add_entry(question: str, answer: str) -> None:
    """
    Add a new entry to the FAQ database.

    Args:
        question (str): The question to be added to the FAQ database.
        answer (str): The corresponding answer to the question.
    """
    doc = {
        "question": question,
        "text": answer,
        "section": "user added",
        "course": "data-engineering-zoomcamp"
    }
    index.append(doc)
Register both tools:
agent_tools = Tools()
agent_tools.add_tool(search)
agent_tools.add_tool(add_entry)
Create the runner again with the expanded tool list:
runner = OpenAIResponsesRunner(
tools=agent_tools,
developer_prompt=developer_prompt,
chat_interface=chat_interface,
llm_client=OpenAIClient()
)
Try this sequence in the interactive chat:
How do I do well in module 1?
Add this back to FAQ
The model first searches, answers, and then uses add_entry when you ask
it to save the answer.
Check the index after the tool call:
index.docs[-1]
Because minsearch is in memory, this addition disappears when the notebook
process stops. Here we care about the action pattern, not durable storage.
Move tool state into a class
The two functions above rely on a global index. That is fine for
learning the loop, but it hides a dependency. A class makes the
dependency explicit and easier to reuse later in the MCP server.
class SearchTools:
    def __init__(self, index):
        self.index = index

    def search(self, query: str) -> List[Dict[str, Any]]:
        """
        Search the FAQ database for entries matching the given query.

        Args:
            query (str): Search query text to look up in the course FAQ.

        Returns:
            List[Dict[str, Any]]: A list of search result entries, each containing relevant metadata.
        """
        boost = {"question": 3.0, "section": 0.5}

        results = self.index.search(
            query=query,
            filter_dict={"course": "data-engineering-zoomcamp"},
            boost_dict=boost,
            num_results=5,
            output_ids=True
        )

        return results

    def add_entry(self, question: str, answer: str) -> None:
        """
        Add a new entry to the FAQ database.

        Args:
            question (str): The question to be added to the FAQ database.
            answer (str): The corresponding answer to the question.
        """
        doc = {
            "question": question,
            "text": answer,
            "section": "user added",
            "course": "data-engineering-zoomcamp"
        }
        self.index.append(doc)
Create an instance and register all public methods:
search_tools = SearchTools(index)
agent_tools = Tools()
agent_tools.add_tools(search_tools)
agent_tools.get_tools()
This gives the agent the same two tools, search and add_entry, but
the implementation no longer depends on notebook globals. The MCP server
will reuse this shape.
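One plausible way `add_tools` can discover those methods is `inspect.getmembers`, registering every public bound method of the instance. The sketch below is an illustration of the idea with a stub `Tools` class, not the actual implementation used in this course:

```python
import inspect

class Tools:
    """Sketch of a registry that collects callables as agent tools."""
    def __init__(self):
        self.functions = {}

    def add_tool(self, fn):
        self.functions[fn.__name__] = fn

    def add_tools(self, obj):
        # Register every public bound method of the instance;
        # names starting with an underscore (like __init__) are skipped.
        for name, method in inspect.getmembers(obj, predicate=inspect.ismethod):
            if not name.startswith("_"):
                self.add_tool(method)

class SearchTools:
    def __init__(self, index):
        self.index = index

    def search(self, query: str):
        return []

    def add_entry(self, question: str, answer: str) -> None:
        pass

tools = Tools()
tools.add_tools(SearchTools(index=None))
print(sorted(tools.functions))  # → ['add_entry', 'search']
```

Because the registered callables are bound methods, each one already carries its `self.index` reference, so the agent can invoke them without knowing anything about the underlying index.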