Where to go from here

We built a search engine from scratch, progressing from keyword matching to semantic embeddings. The concepts here scale to real systems. This page covers the tools and techniques that make that scaling possible.

The minsearch library

The TextSearch class we built became minsearch, a production-ready text search library. It adds convenience features on top of the same TF-IDF approach: appendable indices, multiple filter fields, and a cleaner API. Install it with:

uv add minsearch

If you want a search engine for a small-to-medium dataset without setting up a database, minsearch is the practical version of what we built here.

Inverted indexes

Our implementation computes similarity against every document, which works for a few thousand FAQ entries but does not scale to millions. An inverted index maps each word to the list of documents that contain it, so the search engine only looks at documents that have at least one matching term. This is how every major text search engine works under the hood.
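The core idea fits in a few lines. A toy sketch (scoring and ranking omitted, only candidate selection shown):

```python
from collections import defaultdict

docs = ["the cat sat", "the dog barked", "cat and dog"]

# Inverted index: map each word to the set of document ids containing it
inverted = defaultdict(set)
for doc_id, text in enumerate(docs):
    for word in text.split():
        inverted[word].add(doc_id)

def candidates(query):
    # Union of posting lists: only documents sharing at least one query term
    result = set()
    for word in query.split():
        result |= inverted.get(word, set())
    return result

candidates("cat dog")  # → {0, 1, 2}
```

Only the candidate set is then scored, so a query touches a handful of posting lists instead of every document in the collection.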

Vector search at scale

For vector embeddings, comparing a query against every document vector is also linear in the number of documents. Two techniques make vector search fast:

  • LSH (Locality-Sensitive Hashing) uses random projections to group similar vectors into the same bucket. The search only checks vectors in the same bucket as the query, skipping the rest.
  • Product quantization compresses vectors into shorter codes, trading a small amount of accuracy for much faster distance computation.
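The random-projection variant of LSH can be sketched in plain Python. This is a toy illustration with made-up vectors, not a production implementation: each random hyperplane contributes one bit to a signature, and vectors with the same signature land in the same bucket.

```python
import random

def signature(vec, planes):
    # One bit per hyperplane: which side of the plane the vector falls on
    return tuple(int(sum(v * p for v, p in zip(vec, plane)) >= 0) for plane in planes)

random.seed(42)
dim, n_planes = 8, 6
planes = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]

# Index: bucket every document vector by its signature
doc_vectors = {
    "doc_a": [0.9, 0.8, 0.1, 0.0, 0.2, 0.1, 0.0, 0.1],
    "doc_b": [1.0, 0.9, 0.0, 0.1, 0.1, 0.2, 0.1, 0.0],
    "doc_c": [-0.8, 0.1, 0.9, -0.7, 0.0, -0.5, 0.6, -0.2],
}
buckets = {}
for name, vec in doc_vectors.items():
    buckets.setdefault(signature(vec, planes), []).append(name)

# Query: only compute exact distances for vectors in the matching bucket
query = [0.95, 0.85, 0.05, 0.05, 0.15, 0.15, 0.05, 0.05]
candidates = buckets.get(signature(query, planes), [])
```

Real systems use many hash tables and more planes to trade recall against speed; the principle, though, is exactly this bucketing.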

Tools and databases

For real projects, use established tools instead of building from scratch: full-text engines such as Elasticsearch and OpenSearch are built on inverted indexes, while vector databases such as Qdrant and Weaviate implement approximate nearest-neighbor search.

Each of these handles the indexing, storage, and retrieval concerns that we skipped for clarity.

Follow-up: Agentic RAG

The search engine we built retrieves FAQ entries. The natural next step is feeding those results into a language model to generate answers. The From RAG to Agents workshop picks up where this one leaves off: it starts with classic RAG over the same FAQ data and then evolves it into an agentic search workflow where the LLM decides what to search for and whether to open a full document.
