Overview and setup

This workshop starts with a classic RAG pipeline and turns it into an agentic search system. By the end, you will have an LLM agent that searches a real documentation corpus, reads the snippets, opens the documents it needs, and synthesizes an answer - all on its own.

Prerequisites

You need Python and an OpenAI-compatible API key. That's it. If you have used the OpenAI Python client before, you are ready.

Accounts and keys:

  • An OpenAI account with an API key (or an OpenAI-compatible provider like Groq)
  • A GitHub account (for Codespaces, or to clone the repo locally)

Local tools:

  • Python 3.12 or newer
  • uv for dependency management
  • Jupyter Notebook

Environment setup

You can run everything in GitHub Codespaces or on your own machine.

To use Codespaces, open the workshop repo, go to Settings, then Secrets and variables, then Codespaces, and add OPENAI_API_KEY as a repository secret. Then create a codespace on main.

For a local setup, create a project folder and initialize it:

mkdir agentic-rag
cd agentic-rag
uv init

Add the dependencies:

uv add jupyter openai minsearch gitsource

Start Jupyter:

uv run jupyter notebook

Create a new notebook and initialize the OpenAI client; it reads OPENAI_API_KEY from the environment:

from openai import OpenAI

openai_client = OpenAI()
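Instantiating the client does not contact the API, so it will succeed even with a missing or wrong key. To confirm the setup end to end, send a small test request. A minimal sketch, assuming a model name of gpt-4o-mini (substitute any model your provider offers):

```python
import os

def api_key_present(var: str = "OPENAI_API_KEY") -> bool:
    """Check that the key is set before creating the client."""
    return bool(os.getenv(var))

if api_key_present():
    from openai import OpenAI

    client = OpenAI()
    # A one-message round trip verifies auth and connectivity.
    # The model name is an assumption - use any model available to you.
    reply = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Say OK"}],
    )
    print(reply.choices[0].message.content)
else:
    print("OPENAI_API_KEY is not set - export it before starting Jupyter")
```

If the key is missing, failing with a clear message here is much easier to debug than an authentication error buried in a later notebook cell.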

If you use Groq or another OpenAI-compatible provider, point the client at the right base URL:

import os
from openai import OpenAI

openai_client = OpenAI(
    api_key=os.getenv("GROQ_API_KEY"),
    base_url="https://api.groq.com/openai/v1",
)

Set your API key before starting Jupyter. On Codespaces it comes from the repository secret. Locally, export it in the shell:

export OPENAI_API_KEY="sk-..."

Workshop outline

The workshop has four parts:

  1. Classic RAG - build a search-then-generate pipeline over the Evidently AI documentation and see where it breaks down.
  2. From RAG to an agent - turn the search function into a tool, give it to an LLM, and let the LLM decide when and how to search. Here we use the toyaikit library.
  3. Agentic search - add a second tool (get_file) so the agent can search for snippets and then open the full document. This is the pattern that mirrors how humans actually read docs.
  4. PydanticAI - migrate the same agent to a production framework with multi-turn conversations and usage tracking.

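Parts 2 and 3 revolve around one idea: the search function becomes a tool the LLM can call, and a second tool opens full files. A minimal sketch of that tool pair over an in-memory corpus (the function names search and get_file come from the outline; the schemas follow the OpenAI function-calling convention; the toy corpus and its file names are made up for illustration):

```python
# Toy corpus standing in for the Evidently AI docs (contents are made up).
DOCS = {
    "docs/metrics.md": "Evidently metrics measure data drift and model quality.",
    "docs/reports.md": "Reports combine several metrics into a single HTML view.",
}

def search(query: str) -> list[dict]:
    """Return filename/snippet pairs whose text mentions any query word."""
    words = query.lower().split()
    return [
        {"filename": name, "snippet": text[:80]}
        for name, text in DOCS.items()
        if any(w in text.lower() for w in words)
    ]

def get_file(filename: str) -> str:
    """Open the full document the agent picked from the search results."""
    return DOCS.get(filename, f"File not found: {filename}")

# Function-calling schemas the agent receives, one per tool.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "search",
            "description": "Search the documentation and return matching snippets.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_file",
            "description": "Return the full text of one documentation file.",
            "parameters": {
                "type": "object",
                "properties": {"filename": {"type": "string"}},
                "required": ["filename"],
            },
        },
    },
]

hits = search("drift")
print(hits)
print(get_file(hits[0]["filename"]))
```

The agent loop passes TOOLS to the LLM, executes whichever tool the model requests, and feeds the result back, which is exactly the search-then-open pattern Part 3 builds over the real documentation.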
The use case: Evidently AI documentation

Our knowledge base is the Evidently AI documentation - a real set of Markdown files that keeps changing as the library evolves. LLMs have a knowledge cutoff, but library documentation does not stand still. Plugging fresh docs into an LLM is one of the most common and useful applications of RAG.

Continue with Part 1: Building a classic RAG system to build the baseline.
