We built a search engine from scratch, progressing from keyword matching to semantic embeddings. The concepts here scale to real systems. This page covers the tools and techniques that make that scaling possible.

The minsearch library

The TextSearch class we built became minsearch, a production-ready text search library. It adds appendable indices, multiple filter fields, and a cleaner API on top of the same TF-IDF approach.

Install it with:

uv add minsearch

For search over a small-to-medium dataset, minsearch is the practical version of what we built here.

Inverted indexes

Our implementation computes similarity against every document. That works for a few thousand FAQ entries, but it does not scale to millions. An inverted index maps each word to the list of documents that contain it, so the search engine only scores documents that share at least one term with the query. Lucene, and therefore Elasticsearch, is built on this structure, as are most major text search engines.
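To make the idea concrete, here is a minimal sketch of an inverted index in plain Python. The function names (tokenize, build_inverted_index, candidates) and the sample documents are made up for illustration and are not part of minsearch or of the TextSearch class. The tokenizer is deliberately naive.

```python
from collections import defaultdict


def tokenize(text):
    # Naive tokenizer: lowercase and split on whitespace.
    # A real one would also strip punctuation and drop stop words.
    return text.lower().split()


def build_inverted_index(docs):
    # Map each token to the set of document ids that contain it.
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def candidates(index, query):
    # Union of the posting sets for every query token.
    # Only these documents need to be scored.
    result = set()
    for token in tokenize(query):
        result |= index.get(token, set())
    return result


docs = [
    "how do i install the course software",
    "can i join the course after it starts",
    "where are the lecture videos",
]
index = build_inverted_index(docs)

print(sorted(candidates(index, "install software")))  # [0]
print(sorted(candidates(index, "course videos")))     # [0, 1, 2]
```

The scoring step (TF-IDF, for example) would then run only over the returned candidate ids instead of the whole collection.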

Vector search at scale

For vector embeddings, comparing a query against every document vector is also linear in the number of documents.

Two approximate techniques make vector search faster:

LSH (Locality-Sensitive Hashing) hashes vectors so that similar ones tend to land in the same bucket, for example by using random projections. The search only checks vectors in the same bucket as the query.
Product quantization splits each vector into sub-vectors and replaces each one with the ID of its nearest centroid, producing a much shorter code. It trades a small amount of accuracy for lower memory use and much faster distance computation.
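The LSH idea can be sketched with random hyperplanes, one common variant suited to cosine similarity. This is an illustrative toy, not how FAISS or any specific library implements it. The helper names (make_hyperplanes, signature, build_buckets) and the three sample vectors are assumptions made up for the example.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_planes, dim, seed=0):
    # Each hyperplane is a random Gaussian vector through the origin.
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_planes)]


def signature(vector, planes):
    # One bit per hyperplane: which side of the plane the vector falls on.
    # Vectors pointing in similar directions tend to share most bits.
    return tuple(
        int(sum(p * v for p, v in zip(plane, vector)) >= 0)
        for plane in planes
    )


def build_buckets(vectors, planes):
    # Group vector ids by signature; each distinct signature is a bucket.
    buckets = defaultdict(list)
    for i, vec in enumerate(vectors):
        buckets[signature(vec, planes)].append(i)
    return buckets


vectors = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.1],
    [-1.0, 0.0, 0.8],
]
planes = make_hyperplanes(num_planes=4, dim=3)
buckets = build_buckets(vectors, planes)

# A query pointing in the same direction as vectors[0] hashes to the
# same bucket, so only that bucket's members need an exact comparison.
query_vec = [2 * x for x in vectors[0]]
candidate_ids = buckets.get(signature(query_vec, planes), [])
print(candidate_ids)  # contains 0; other ids depend on the random planes
```

More hyperplanes give smaller buckets and faster lookups, but raise the chance that a true neighbor lands in a different bucket. Practical systems usually build several hash tables to recover those misses.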

Tools and databases

For real projects, use established tools instead of building from scratch:

Elasticsearch (built on Lucene) for text search with inverted indexes
FAISS for fast vector similarity search
Qdrant, Weaviate, or Chroma as dedicated vector databases

Each of these handles the indexing, storage, and retrieval concerns that we skipped for clarity.

Follow-up: Agentic RAG

The search engine we built retrieves FAQ entries. The natural next step is to feed those results into a language model, which can then generate an answer based on the retrieved entries.

The From RAG to Agents workshop picks up where this one leaves off. It starts with classic RAG over the same FAQ data. Then it evolves into an agentic workflow. The LLM decides what to search for and whether to open a full document.
