## What to improve next
This workshop builds the full path from transcript ingestion to a durable research agent, but a few production improvements are intentionally left for later. These are good next steps.
### Parallel ingestion

The workflow processes videos one by one:

```python
for video in videos:
    ...
```
That is readable and easy to debug. It is also slower than it needs to be. Batch podcast videos in groups of five or ten, then execute those batches in parallel. Temporal can handle that shape, but the workshop keeps the first version simple.
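Temporal's Python workflows can start activities concurrently and gather them with `asyncio`, and the batching itself is plain Python. A minimal sketch, where `ingest_one` is a hypothetical stand-in for whatever activity wrapper the workflow uses:

```python
import asyncio

def batch(items: list, size: int) -> list[list]:
    """Split items into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]

async def ingest_all(videos: list, ingest_one, batch_size: int = 5) -> list:
    """Run `ingest_one` for each video, `batch_size` at a time in parallel."""
    results = []
    for group in batch(videos, batch_size):
        # Inside a Temporal workflow this would gather activity handles;
        # the shape of the code is the same.
        results.extend(await asyncio.gather(*(ingest_one(v) for v in group)))
    return results
```

Batches run one after another, so a batch size of five caps how many fetches are in flight at once.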
### Cached fallback inside the workflow
Part 1: Fetch one transcript shows `fetch_transcript_cached`, and Part 3: Discover podcast videos shows direct YouTube fetching with proxy support. The workflow in this walkthrough uses the direct fetch path.
A production version should decide the fallback behavior explicitly:
- try YouTube directly,
- retry through a proxy,
- fall back to cached transcript files when available,
- record which path produced each transcript.
That record of the source path helps later when debugging missing or stale transcripts.
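One way to make that policy concrete is a small dispatcher that tries each fetcher in order and tags the result with the path that produced it. This is a sketch, not the workshop's code: the fetcher callables are hypothetical stand-ins for the direct, proxy, and cached paths.

```python
def fetch_with_fallback(video_id: str, fetchers: dict) -> dict:
    """Try each (source name -> fetch callable) in order.

    Returns the transcript together with the path that produced it, so that
    marker can be stored alongside the document for later debugging.
    """
    errors = {}
    for source, fetch in fetchers.items():
        try:
            transcript = fetch(video_id)
        except Exception as exc:  # a real version would catch narrower errors
            errors[source] = exc
            continue
        if transcript:
            return {"transcript": transcript, "source": source}
    raise RuntimeError(f"no transcript for {video_id}: {errors}")
```

Since dicts preserve insertion order, passing `{"direct": ..., "proxy": ..., "cache": ...}` encodes the fallback priority directly.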
### Stronger agent instructions

The research agent sometimes searched but did not call `summarize` when we wanted it to. More precise tool descriptions and instructions would help. One option is to state that after every promising `search_videos` result, the agent must call `summarize(video_id)` before writing the final answer.
Tool docstrings matter here. Pydantic AI exposes them to the model, so the docstring should say when to use the tool, what the arguments mean, and what the returned text represents.
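As a sketch of that advice, a `summarize` tool docstring might read as follows. The body is elided, and in Pydantic AI the function would be registered with the agent's tool decorator so the model sees this text:

```python
def summarize(video_id: str) -> str:
    """Summarize one podcast episode from its indexed transcript.

    Use this after every promising search_videos result, before writing
    the final answer; do not cite an episode you have not summarized.

    Args:
        video_id: The YouTube video id exactly as returned by search_videos.

    Returns:
        A short prose summary of the episode, suitable for quoting.
    """
    ...
```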
### Structured references
The final answer still needs stronger references. Instructions ask for citations, but prose instructions alone are not enough.
Use structured output for the final answer. For example, require the model to return a list of sections, each with paragraphs and source objects:
```python
from pydantic import BaseModel

class SourceRef(BaseModel):
    video_id: str
    timestamp: str
    quote: str

class Paragraph(BaseModel):
    text: str
    sources: list[SourceRef]

class Section(BaseModel):
    title: str
    paragraphs: list[Paragraph]
```
Then render that structure to Markdown. This makes it much harder for the agent to omit references silently.
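A renderer for that structure only needs attribute access, so a sketch can stay independent of Pydantic; it works on the `Section`, `Paragraph`, and `SourceRef` models above, or on any objects with the same fields:

```python
def render_markdown(sections) -> str:
    """Render Section objects to Markdown, one bullet per source reference."""
    lines = []
    for section in sections:
        lines.append(f"## {section.title}")
        for para in section.paragraphs:
            lines.append(para.text)
            for src in para.sources:
                # Every paragraph emits its sources, so a missing reference
                # is immediately visible in the rendered answer.
                lines.append(f'- {src.video_id} @ {src.timestamp}: "{src.quote}"')
    return "\n\n".join(lines)
```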
### Tests
This walkthrough does not add tests. Add focused tests around:
- timestamp formatting,
- cached transcript parsing,
- podcast YAML filtering,
- skip-list behavior,
- Elasticsearch search response shaping,
- summarization prompt construction,
- run-context serialization for Temporal.
Tests around Temporal itself can be thinner at first. The highest risk is in the transformation code and the serialization adapter.
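For example, a timestamp-formatting test is cheap to write and catches off-by-one bugs early. `format_timestamp` here is a hypothetical name for whatever helper the ingestion code actually uses:

```python
def format_timestamp(seconds: float) -> str:
    """Hypothetical helper: seconds offset -> "HH:MM:SS" string."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"

def test_format_timestamp():
    assert format_timestamp(0) == "00:00:00"
    assert format_timestamp(61.5) == "00:01:01"
    assert format_timestamp(3725) == "01:02:05"
```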
### Production configuration
The workshop uses local defaults:
```python
Elasticsearch("http://localhost:9200")
Client.connect("localhost:7233")
```
Move these into environment variables for deployment:
```shell
ELASTICSEARCH_ADDRESS=...
TEMPORAL_ADDRESS=...
OPENAI_API_KEY=...
```
The ingestion code already reads `ELASTICSEARCH_ADDRESS` in `ElasticsearchActivities`. The agent code should follow the same pattern.
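A small helper keeps that pattern uniform across ingestion and agent code. This is a sketch assuming the environment variable names listed above:

```python
import os

def setting(name: str, default: str) -> str:
    """Read one configuration value, falling back to a local-dev default."""
    return os.environ.get(name, default)

ELASTICSEARCH_ADDRESS = setting("ELASTICSEARCH_ADDRESS", "http://localhost:9200")
TEMPORAL_ADDRESS = setting("TEMPORAL_ADDRESS", "localhost:7233")
```

The defaults mean the local workshop setup keeps working with no environment configured at all.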
### Deployment
This workshop runs locally. A production deployment would need:
- a managed or self-hosted Elasticsearch cluster,
- a Temporal server or Temporal Cloud namespace,
- separate workers for ingestion and agent workflows,
- secret management for proxy and OpenAI credentials,
- monitoring for failed activities and long retries.
Do the local version first. Deployment is much easier when the activity boundaries are already clean.