Build a Production-Ready YouTube AI Agent with Temporal
Build a durable data ingestion pipeline, handle IP blocking with proxies, index transcripts into Elasticsearch, and design a multi-stage research agent with Temporal orchestration.
Timestamps
Welcome and overview of building a production-ready YouTube AI agent with Temporal orchestration
Setting up the development environment, comparing GitHub Codespaces and local setup options
Beginning the data ingestion workflow: planning and initial setup
Processing and formatting YouTube transcripts and subtitles for indexing
Setting up Elasticsearch with Docker for search infrastructure
Configuring Elasticsearch indices, mappings, and stop words for better search quality
Building the video iteration logic and implementing progress tracking
Understanding the IP blocking problem when scraping YouTube at scale
Implementing residential proxies to handle IP blocking and rate limiting
Introduction to Temporal for building durable, fault-tolerant workflows
Refactoring ingestion logic into Temporal activities for reliability
Defining the complete ingestion workflow with Temporal workflow definitions
Building the Temporal worker to execute workflows and activities
Starting Part 2: Building the research agent using PydanticAI framework
Setting up agent instructions, prompts, and model configuration
Adding a secondary summarization agent to handle long contexts effectively
Migrating the research agent to use Temporal for durability and reliability
Executing the complete system: durable ingestion + research agent
Reviewing results, key takeaways, and next steps for production deployment
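To give a flavor of the transcript-processing chapters above, here is a minimal sketch of merging raw subtitle cues into chunks sized for indexing. The function name, the `(start_seconds, text)` cue shape, and the field names are illustrative assumptions, not the exact code from the video:

```python
def merge_cues(cues, max_chars=1000):
    """Merge raw subtitle cues into indexable chunks.

    cues: list of (start_seconds, text) pairs, as produced by a
    hypothetical transcript fetcher (names are illustrative).
    Returns a list of dicts ready to index into a search engine.
    """
    chunks, buf, chunk_start = [], [], None
    for start, text in cues:
        text = " ".join(text.split())  # collapse whitespace and newlines
        if not text:
            continue
        if chunk_start is None:
            chunk_start = start  # remember where this chunk begins
        buf.append(text)
        if sum(len(t) + 1 for t in buf) >= max_chars:
            chunks.append({"start": chunk_start, "text": " ".join(buf)})
            buf, chunk_start = [], None
    if buf:  # flush the trailing partial chunk
        chunks.append({"start": chunk_start, "text": " ".join(buf)})
    return chunks
```

Keeping the start timestamp on each chunk lets the agent cite the exact moment in a video when it answers a question.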
What You'll Learn
- Building a durable data ingestion pipeline
- Handling IP blocking and retries with proxies
- Indexing long-form text into Elasticsearch
- Designing a multi-stage research agent with tool use and summarization
- Orchestrating workflows with Temporal
- Handling retries, state, and recovery in production
- Working with long contexts effectively
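As a sketch of the indexing topic above, an Elasticsearch index body for transcript chunks might look like the following. The analyzer, filter choices, and field names are assumptions for illustration; the video's exact mapping may differ:

```python
# Assumed index settings/mappings for transcript chunks; field names
# are illustrative, not taken from the video.
TRANSCRIPT_INDEX = {
    "settings": {
        "analysis": {
            "analyzer": {
                "transcript_text": {
                    "type": "custom",
                    "tokenizer": "standard",
                    # Lowercasing plus English stop-word removal, in the
                    # spirit of the stop-words chapter.
                    "filter": ["lowercase", "stop"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "video_id": {"type": "keyword"},  # exact-match filtering
            "start": {"type": "float"},       # seconds into the video
            "text": {"type": "text", "analyzer": "transcript_text"},
        }
    },
}
```

A body like this is what you would pass to the client's index-creation call; `keyword` on `video_id` keeps it filterable without analysis, while `text` fields go through the custom analyzer.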
Expected Outcome
A production-oriented deep research agent that can answer questions using years of podcast transcripts, backed by a fault-tolerant ingestion workflow.
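The fault-tolerance theme can be sketched with a plain-Python backoff loop that rotates proxies on an IP block. In the video's final design, Temporal's built-in retry policies replace hand-rolled loops like this; the names here are illustrative assumptions:

```python
import random
import time


class BlockedError(Exception):
    """Raised when the upstream service rejects our IP (illustrative)."""


def fetch_with_retries(fetch, proxies, max_attempts=5, base_delay=1.0):
    """Call fetch(proxy) with exponential backoff, rotating proxies.

    `fetch` is any callable that raises BlockedError on an IP block;
    `proxies` is a list of proxy URLs (residential proxies in the video).
    """
    last_err = None
    for attempt in range(max_attempts):
        proxy = proxies[attempt % len(proxies)]  # rotate through the pool
        try:
            return fetch(proxy)
        except BlockedError as err:
            last_err = err
            # Exponential backoff with jitter to avoid hammering YouTube
            # from every worker at the same instant.
            time.sleep(base_delay * (2 ** attempt) + random.random() * 0.1)
    raise last_err
```

The point of moving this into a Temporal activity is that the retry schedule, the attempt counter, and the partially completed work all survive a worker crash instead of living in local variables.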