Build a Production-Ready YouTube AI Agent with Temporal

December 16, 2025, 01:00 Europe/Berlin
Alexey Grigorev

We build a deep research agent over the DataTalks.Club podcast archive. The workshop starts by downloading and indexing YouTube transcripts, then turns that ingestion code into a durable Temporal workflow. After the data is searchable, we build a Pydantic AI research agent, add a summarization sub-agent for long transcripts, and wrap the agent run in Temporal too.

Links

The main resources for this workshop:

The system you will build

The final system looks like this:

flowchart LR
    YT["YouTube transcripts"]
    CACHE["Cached transcript files<br/>GitHub fallback"]
    FLOW["flow/<br/>Temporal ingestion"]
    ES["Elasticsearch<br/>podcasts index"]
    AGENT["agent/<br/>Pydantic AI research agent"]
    SUM["Summarization sub-agent"]
    TW["Temporal workflow<br/>agent run"]
    OPENAI["OpenAI"]

    YT -->|fetch_subtitles| FLOW
    CACHE -->|fetch_transcript_cached| FLOW
    FLOW -->|index_video| ES
    AGENT -->|search_videos| ES
    AGENT -->|summarize| SUM
    AGENT -->|model call| OPENAI
    SUM -->|model call| OPENAI
    TW -->|durable execution| AGENT

The ingestion side has the parts that usually fail in production: network calls, YouTube blocking cloud IPs, proxies, Elasticsearch writes, and long loops over many videos. Temporal gives that side retries, observability, and durable execution. The agent side uses the indexed data to answer questions from the podcast archive and then uses Temporal again so long agent runs can survive failures.
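To make the failure modes concrete, here is a minimal plain-Python sketch of the ingestion loop with hand-rolled retries and a cache fallback. It is not the workshop's code and does not use the Temporal SDK. It only illustrates the kind of logic Temporal's retry policies and durable execution are meant to take over. The function names `fetch_subtitles`, `fetch_transcript_cached`, and `index_video` come from the diagram, but their signatures, the `TranscriptUnavailable` exception, `with_retries`, `ingest`, and the in-memory `index` are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, List


class TranscriptUnavailable(Exception):
    """Hypothetical error: a source could not provide the transcript."""


def with_retries(fn: Callable[[str], str], video_id: str,
                 attempts: int = 3, base_delay: float = 0.0) -> str:
    """Call fn(video_id), retrying with exponential backoff on failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn(video_id)
        except TranscriptUnavailable:
            if attempt == attempts - 1:
                raise  # out of attempts, let the caller decide what to do
            time.sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")


def ingest(video_ids: List[str],
           fetch_subtitles: Callable[[str], str],
           fetch_transcript_cached: Callable[[str], str],
           index_video: Callable[[str, str], None]) -> List[str]:
    """Index every video; return the ids that could not be fetched at all."""
    failed: List[str] = []
    for video_id in video_ids:
        try:
            text = with_retries(fetch_subtitles, video_id)
        except TranscriptUnavailable:
            try:
                # Live fetch kept failing, so fall back to the cached copy.
                text = fetch_transcript_cached(video_id)
            except TranscriptUnavailable:
                failed.append(video_id)
                continue
        index_video(video_id, text)
    return failed


# Stand-in implementations so the sketch runs without network access.
index: Dict[str, str] = {}   # stands in for the Elasticsearch index
live_calls = {"a": 0}
cache = {"b": "cached transcript b"}


def fetch_subtitles(video_id: str) -> str:
    # "a" fails once, then succeeds; every other id is always blocked.
    if video_id == "a":
        live_calls["a"] += 1
        if live_calls["a"] >= 2:
            return "live transcript a"
    raise TranscriptUnavailable(video_id)


def fetch_transcript_cached(video_id: str) -> str:
    if video_id not in cache:
        raise TranscriptUnavailable(video_id)
    return cache[video_id]


def index_video(video_id: str, text: str) -> None:
    index[video_id] = text


failed = ingest(["a", "b", "c"], fetch_subtitles,
                fetch_transcript_cached, index_video)
```

A loop like this loses its position if the process crashes halfway through a long list of videos, and the retry state lives only in memory. Those are the gaps that durable execution is meant to close.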