Part 7: Summarize long transcripts
The research agent can search and fetch full subtitles, but long transcripts can exhaust the model's context window. Instead of passing a whole episode into the main research agent, we add a second agent whose only job is to summarize one transcript for the current user query and search history.
Create the summarization instructions:
summarization_instructions = """
Your task is to summarize the provided YouTube transcript for a specific topic.
Select the parts of the transcript that are relevant to the topic and search queries.
Format:
paragraph with discussion (timestamp)
""".strip()
Create the sub-agent:
from pydantic_ai import Agent
summarization_agent = Agent(
    name='summarization',
    instructions=summarization_instructions,
    model='openai:gpt-4o-mini'
)
Test it directly before turning it into a tool. This test gives the summarizer the user query, the search queries, and one full transcript:
user_query = 'how do I get rich with AI?'
search_queries = [
    "investment opportunities in AI",
    "starting AI-focused businesses",
    "AI applications in wealth generation"
]
subtitles = get_subtitles_by_id('1aMuynlLM3o')['subtitles']
Build the prompt:
prompt = f"""
user query:
{user_query}
search engine queries:
{'\n'.join(search_queries)}
subtitles:
{subtitles}
""".strip()
summary_result = await summarization_agent.run(prompt)
print(summary_result.output)
Now the sub-agent has enough context to summarize the transcript for the topic the user asked about, not as a generic episode summary.
Turn summarization into a tool
The summarize tool needs access to the current run context. It reads the
original user prompt and the previous search_videos tool calls, fetches the
full subtitles, and asks the summarization agent to summarize only the
relevant parts.
Start with imports:
import json
import textwrap
from pydantic_ai import RunContext
Define the tool:
async def summarize(ctx: RunContext, video_id: str) -> str:
    """
    Generate a summary for a video based on the conversation history,
    search queries, and the video's subtitles.
    """
    user_queries = []
    search_queries = []
Extract user prompts and search queries from the message history:
    for m in ctx.messages:
        for p in m.parts:
            kind = p.part_kind
            if kind == 'user-prompt':
                user_queries.append(p.content)
            if kind == 'tool-call':
                if p.tool_name == 'search_videos':
                    args = json.loads(p.args)
                    query = args['query']
                    search_queries.append(query)
Fetch subtitles and create the summarization prompt:
    subtitles = get_subtitles_by_id(video_id)['subtitles']
    user_queries_text = '\n'.join(user_queries)
    search_queries_text = '\n'.join(search_queries)
    prompt = textwrap.dedent(f"""
user query:
{user_queries_text}
search engine queries:
{search_queries_text}
subtitles:
{subtitles}
""").strip()
    summary_result = await summarization_agent.run(prompt)
    return summary_result.output
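As an aside, textwrap.dedent only removes whitespace that is common to every line, so it leaves the flush-left template above untouched; it matters once you indent the template to match the surrounding code. A minimal stand-alone illustration:

```python
import textwrap

# a template indented the way it would appear inside a function body
indented = """
    user query:
    how do I get rich with AI?
"""

# dedent strips the shared 4-space prefix, so the prompt does not
# carry the code's indentation into the model
clean = textwrap.dedent(indented).strip()
print(clean)
```

Note that if an interpolated value (such as a multi-line transcript) contains flush-left lines, there is no common prefix and dedent becomes a no-op, which is why the template in the tool above stays flush-left.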
Replace get_subtitles_by_id with summarize in the main agent tools:
research_agent = Agent(
    name='research_agent',
    instructions=research_instructions,
    model='openai:gpt-4o-mini',
    tools=[search_videos, summarize]
)
Run it with the same callback:
result = await research_agent.run(
    user_prompt='how do I get rich with AI?',
    event_stream_handler=research_agent_callback
)
print(result.output)
If the model does not call summarize when you expect it to, treat that as an
instruction design issue, not an Elasticsearch issue. Stronger tool
instructions and structured output are good follow-up work.
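One place to strengthen tool instructions is the docstring itself, since pydantic_ai exposes it to the model as the tool description. The wording below is only a sketch, not tested guidance; the real tool annotates `ctx` as `RunContext`, omitted here to keep the snippet dependency-free:

```python
# Hypothetical, more prescriptive docstring for the summarize tool.
async def summarize(ctx, video_id: str) -> str:
    """
    Summarize one video transcript for the user's current topic.

    Prefer this over fetching raw subtitles: full transcripts are too
    long for the context window. Call it once for each video_id from
    search_videos that looks relevant, then answer from the returned
    summaries and keep their timestamps in your citations.
    """
    ...
```

Telling the model when to call the tool and what to do with the result tends to matter more than describing what the tool does internally.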
Move tools into modules
Convert the notebook into a script:
uv run jupyter nbconvert --to=script agent.ipynb
Create tools.py. Wrap the search functions in a class so the Elasticsearch
client and index name are dependencies:
import json
import textwrap
from pydantic_ai import Agent, RunContext
from elasticsearch import Elasticsearch
class SearchTools:
    def __init__(self, es_client: Elasticsearch, index_name: str):
        self.es_client = es_client
        self.index_name = index_name
The search_videos method is the same Elasticsearch query from the notebook:
    def search_videos(self, query: str, size: int = 5) -> list[dict]:
        body = {
            "size": size,
            "query": {
                "multi_match": {
                    "query": query,
                    "fields": ["title^3", "subtitles"],
                    "type": "best_fields",
                    "analyzer": "english_with_stop_and_stem"
                }
            },
            "highlight": {
                "pre_tags": ["*"],
                "post_tags": ["*"],
                "fields": {
                    "title": {"fragment_size": 150, "number_of_fragments": 1},
                    "subtitles": {"fragment_size": 150, "number_of_fragments": 1}
                }
            }
        }
Search and return snippets:
        response = self.es_client.search(index=self.index_name, body=body)
        hits = response.body['hits']['hits']
        results = []
        for hit in hits:
            highlight = hit['highlight']
            highlight['video_id'] = hit['_id']
            results.append(highlight)
        return results
The full transcript retrieval stays in the same class:
    def get_subtitles_by_id(self, video_id: str) -> dict:
        result = self.es_client.get(index=self.index_name, id=video_id)
        return result['_source']
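At this point SearchTools can be sanity-checked without a live cluster. Everything below is an illustrative sketch: FakeES is a hypothetical in-memory stand-in for the two client calls the class makes, paired with a condensed copy of the class (query body elided) so the snippet runs on its own:

```python
from types import SimpleNamespace

class FakeES:
    """Hypothetical in-memory stand-in for the Elasticsearch client."""
    def __init__(self, docs: dict):
        self.docs = docs

    def search(self, index, body):
        # mimic the 8.x client: the response object exposes a .body attribute
        hits = [
            {'_id': doc_id, 'highlight': {'title': [doc['title']]}}
            for doc_id, doc in self.docs.items()
        ]
        return SimpleNamespace(body={'hits': {'hits': hits}})

    def get(self, index, id):
        return {'_source': self.docs[id]}

class SearchTools:
    # condensed copy of the class above, with the query body elided
    def __init__(self, es_client, index_name):
        self.es_client = es_client
        self.index_name = index_name

    def search_videos(self, query, size=5):
        response = self.es_client.search(index=self.index_name, body={})
        results = []
        for hit in response.body['hits']['hits']:
            highlight = hit['highlight']
            highlight['video_id'] = hit['_id']
            results.append(highlight)
        return results

    def get_subtitles_by_id(self, video_id):
        result = self.es_client.get(index=self.index_name, id=video_id)
        return result['_source']

docs = {'abc123': {'title': 'Getting rich with AI', 'subtitles': 'full transcript here'}}
tools = SearchTools(FakeES(docs), 'videos')
print(tools.search_videos('ai'))
print(tools.get_subtitles_by_id('abc123'))
```

Because the client and index name are constructor arguments, swapping in a stub is a one-line change; that is the payoff of making them dependencies instead of globals.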
Now create SummarizationTools, which depends on both SearchTools and the
summarization agent:
class SummarizationTools:
    def __init__(
        self,
        search_tools: SearchTools,
        summarization_agent: Agent
    ):
        self.search_tools = search_tools
        self.summarization_agent = summarization_agent
Its summarize method is the tool version from the notebook:
    async def summarize(self, ctx: RunContext, video_id: str) -> str:
        """
        Generate a summary for a video based on the conversation history,
        search queries, and the video's subtitles.
        """
        user_queries = []
        search_queries = []
        for m in ctx.messages:
            for p in m.parts:
                kind = p.part_kind
                if kind == 'user-prompt':
                    user_queries.append(p.content)
                if kind == 'tool-call' and p.tool_name == 'search_videos':
                    args = json.loads(p.args)
                    search_queries.append(args['query'])
Finish the method with the prompt and sub-agent call:
        subtitles = self.search_tools.get_subtitles_by_id(video_id)['subtitles']
        user_queries_text = '\n'.join(user_queries)
        search_queries_text = '\n'.join(search_queries)
        prompt = textwrap.dedent(f"""
user query:
{user_queries_text}
search engine queries:
{search_queries_text}
subtitles:
{subtitles}
""").strip()
        summary_result = await self.summarization_agent.run(prompt)
        return summary_result.output
Now we have a clean non-Temporal agent. The last implementation step wraps it in Temporal.