## What to improve next
This workshop builds the full path from transcript ingestion to a durable research agent, but a few production improvements are intentionally left for later. These are good next steps.
### Parallel ingestion

The workflow processes videos one by one:

```python
for video in videos:
    ...
```
That is readable and easy to debug. It is also slower than it needs to be. Batch podcast videos in groups of five or ten, then execute those batches in parallel. Temporal can handle that shape, but the workshop keeps the first version simple.
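Temporal's Python workflows can start activities concurrently and gather them with `asyncio`, and the batching itself is plain Python. A minimal sketch, where `ingest_one` is a hypothetical stand-in for whatever activity wrapper the workflow uses:

```python
import asyncio

def batch(items: list, size: int) -> list[list]:
    """Split items into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]

async def ingest_all(videos: list, ingest_one, batch_size: int = 5) -> list:
    """Run `ingest_one` for each video, `batch_size` at a time in parallel."""
    results = []
    for group in batch(videos, batch_size):
        # Inside a Temporal workflow this would gather activity handles;
        # the shape of the code is the same.
        results.extend(await asyncio.gather(*(ingest_one(v) for v in group)))
    return results
```

Batches run one after another, so a batch size of five caps how many fetches are in flight at once.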
### Cached fallback inside the workflow
Part 1: Fetch one transcript shows `fetch_transcript_cached`, and Part 3: Discover podcast videos shows direct YouTube fetching with proxy support. The workflow in this walkthrough uses the direct fetch path.
A production version should decide the fallback behavior explicitly:
- try YouTube directly,
- retry through a proxy,
- fall back to cached transcript files when available,
- record which path produced each transcript.
That record of the source path helps later when debugging missing or stale transcripts.
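One way to make that policy concrete is a small dispatcher that tries each fetcher in order and tags the result with the path that produced it. This is a sketch, not the workshop's code: the fetcher callables are hypothetical stand-ins for the direct, proxy, and cached paths.

```python
def fetch_with_fallback(video_id: str, fetchers: dict) -> dict:
    """Try each (source name -> fetch callable) in order.

    Returns the transcript together with the path that produced it, so that
    marker can be stored alongside the document for later debugging.
    """
    errors = {}
    for source, fetch in fetchers.items():
        try:
            transcript = fetch(video_id)
        except Exception as exc:  # a real version would catch narrower errors
            errors[source] = exc
            continue
        if transcript:
            return {"transcript": transcript, "source": source}
    raise RuntimeError(f"no transcript for {video_id}: {errors}")
```

Since dicts preserve insertion order, passing `{"direct": ..., "proxy": ..., "cache": ...}` encodes the fallback priority directly.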
### Stronger agent instructions

The research agent sometimes searched but did not call `summarize` when we wanted it to. More precise tool descriptions and instructions would help. One option is to state that after every promising `search_videos` result, the agent must call `summarize(video_id)` before writing the final answer.
Tool docstrings matter here. Pydantic AI exposes them to the model, so the docstring should say when to use the tool, what the arguments mean, and what the returned text represents.
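As a sketch of that advice, a `summarize` tool docstring might read as follows. The body is elided, and in Pydantic AI the function would be registered with the agent's tool decorator so the model sees this text:

```python
def summarize(video_id: str) -> str:
    """Summarize one podcast episode from its indexed transcript.

    Use this after every promising search_videos result, before writing
    the final answer; do not cite an episode you have not summarized.

    Args:
        video_id: The YouTube video id exactly as returned by search_videos.

    Returns:
        A short prose summary of the episode, suitable for quoting.
    """
    ...
```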
### Structured references
The final answer still needs stronger references. Instructions ask for citations, but prose instructions alone are not enough.
Use structured output for the final answer. For example, require the model to return a list of sections, each with paragraphs and source objects:
```python
from pydantic import BaseModel

class SourceRef(BaseModel):
    video_id: str
    timestamp: str
    quote: str

class Paragraph(BaseModel):
    text: str
    sources: list[SourceRef]

class Section(BaseModel):
    title: str
    paragraphs: list[Paragraph]
```
Then render that structure to Markdown. This makes it much harder for the agent to omit references silently.
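A renderer for that structure only needs attribute access, so a sketch can stay independent of Pydantic; it works on the `Section`, `Paragraph`, and `SourceRef` models above, or on any objects with the same fields:

```python
def render_markdown(sections) -> str:
    """Render Section objects to Markdown, one bullet per source reference."""
    lines = []
    for section in sections:
        lines.append(f"## {section.title}")
        for para in section.paragraphs:
            lines.append(para.text)
            for src in para.sources:
                # Every paragraph emits its sources, so a missing reference
                # is immediately visible in the rendered answer.
                lines.append(f'- {src.video_id} @ {src.timestamp}: "{src.quote}"')
    return "\n\n".join(lines)
```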
### Tests
This walkthrough does not add tests. Add focused tests around:
- timestamp formatting,
- cached transcript parsing,
- podcast YAML filtering,
- skip-list behavior,
- Elasticsearch search response shaping,
- summarization prompt construction,
- run-context serialization for Temporal.
Tests around Temporal itself can be thinner at first. The highest risk is in the transformation code and the serialization adapter.
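For example, a timestamp-formatting test is cheap to write and catches off-by-one bugs early. `format_timestamp` here is a hypothetical name for whatever helper the ingestion code actually uses:

```python
def format_timestamp(seconds: float) -> str:
    """Hypothetical helper: seconds offset -> "HH:MM:SS" string."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"

def test_format_timestamp():
    assert format_timestamp(0) == "00:00:00"
    assert format_timestamp(61.5) == "00:01:01"
    assert format_timestamp(3725) == "01:02:05"
```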
### Production configuration
The workshop uses local defaults:
```python
Elasticsearch("http://localhost:9200")
Client.connect("localhost:7233")
```
Move these into environment variables for deployment:
```shell
ELASTICSEARCH_ADDRESS=...
TEMPORAL_ADDRESS=...
OPENAI_API_KEY=...
```
The ingestion code already reads `ELASTICSEARCH_ADDRESS` in `ElasticsearchActivities`. The agent code should follow the same pattern.
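A small helper keeps that pattern uniform across ingestion and agent code. This is a sketch assuming the environment variable names listed above:

```python
import os

def setting(name: str, default: str) -> str:
    """Read one configuration value, falling back to a local-dev default."""
    return os.environ.get(name, default)

ELASTICSEARCH_ADDRESS = setting("ELASTICSEARCH_ADDRESS", "http://localhost:9200")
TEMPORAL_ADDRESS = setting("TEMPORAL_ADDRESS", "localhost:7233")
```

The defaults mean the local workshop setup keeps working with no environment configured at all.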
### Deployment
This workshop runs locally. A production deployment would need:
- a managed or self-hosted Elasticsearch cluster,
- a Temporal server or Temporal Cloud namespace,
- separate workers for ingestion and agent workflows,
- secret management for proxy and OpenAI credentials,
- monitoring for failed activities and long retries.
Do the local version first. Deployment is much easier when the activity boundaries are already clean.