Where to go from here
We built a search engine from scratch, progressing from keyword matching to semantic embeddings. The concepts here scale to real systems. This page covers the tools and techniques that make that scaling possible.
The minsearch library
The TextSearch class we built became minsearch, a production-ready text search library. It adds convenience features on top of the same TF-IDF approach: appendable indexes, multiple filter fields, and a cleaner API. Install it with:
uv add minsearch
If you want a search engine for a small-to-medium dataset without setting up a database, minsearch is the practical version of what we built here.
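The API follows the same fit-then-search pattern as our TextSearch class. Here is a minimal sketch; the documents and field names are illustrative, so check the minsearch README for the current interface:

```python
from minsearch import Index

docs = [
    {"question": "How do I install Kafka?",
     "text": "Use pip install kafka-python.",
     "course": "data-engineering"},
    {"question": "Can I join the course late?",
     "text": "Yes, you can enroll at any time.",
     "course": "data-engineering"},
]

# Text fields are indexed with TF-IDF; keyword fields are exact-match filters.
index = Index(
    text_fields=["question", "text"],
    keyword_fields=["course"],
)
index.fit(docs)

results = index.search(
    query="installing kafka",
    filter_dict={"course": "data-engineering"},  # keep only matching keyword values
    boost_dict={"question": 3.0},                # weight 'question' matches higher
    num_results=5,
)
```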
Inverted indexes
Our implementation computes similarity against every document, which works for a few thousand FAQ entries but does not scale to millions. An inverted index maps each word to the list of documents that contain it, so the search engine only looks at documents that have at least one matching term. This is how every major text search engine works under the hood.
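Here is a toy inverted index, assuming whitespace tokenization and no scoring; real engines add stemming, stop words, and relevance ranking on top:

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def candidate_docs(index, query):
    """Only documents sharing at least one query term get scored."""
    ids = set()
    for term in query.lower().split():
        ids |= index.get(term, set())
    return ids

docs = ["how to install kafka", "joining the course late", "kafka consumer groups"]
index = build_inverted_index(docs)
print(candidate_docs(index, "kafka install"))  # {0, 2}
```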
Vector search at scale
For vector embeddings, comparing a query against every document vector is also linear in the number of documents. Two techniques make vector search fast:
- LSH (Locality-Sensitive Hashing) uses random projections to group similar vectors into the same bucket. The search only checks vectors in the same bucket as the query, skipping the rest (see the sketch after this list).
- Product quantization compresses vectors into shorter codes, trading a small amount of accuracy for much faster distance computation.
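Below is a minimal random-projection LSH sketch with made-up dimensions. Each random hyperplane contributes one bit of the bucket key; production systems use several hash tables so that near neighbors that land just outside one bucket are still found:

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_planes = 128, 16

# One random hyperplane per bit: the bit records which side the vector falls on.
planes = rng.normal(size=(n_planes, dim))

def lsh_bucket(vec):
    bits = (planes @ vec) > 0
    return bits.tobytes()  # hashable bucket key

# Index time: group document vectors by bucket.
doc_vecs = rng.normal(size=(1000, dim))
buckets = {}
for i, v in enumerate(doc_vecs):
    buckets.setdefault(lsh_bucket(v), []).append(i)

# Query time: scan only the matching bucket, not all 1000 vectors.
query = doc_vecs[42] + 0.01 * rng.normal(size=dim)  # near-duplicate of doc 42
candidates = buckets.get(lsh_bucket(query), [])     # very likely contains 42
```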
Tools and databases
For real projects, use established tools instead of building from scratch:
- Elasticsearch (built on Lucene) for text search with inverted indexes
- FAISS for fast vector similarity search (sketched after this list)
- Qdrant, Weaviate, or Chroma as dedicated vector databases
Each of these handles the indexing, storage, and retrieval concerns that we skipped for clarity.
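As a taste of FAISS, the sketch below builds an exact (brute-force) index with placeholder dimensions and random data; at scale you would swap in one of FAISS's approximate index types, which combine the inverted-list and quantization ideas above:

```python
import faiss  # pip install faiss-cpu
import numpy as np

dim = 384  # e.g. the output size of a small sentence-embedding model
doc_vecs = np.random.rand(10_000, dim).astype("float32")

index = faiss.IndexFlatL2(dim)  # exact L2 search; use e.g. IndexIVFPQ for ANN
index.add(doc_vecs)

query = np.random.rand(1, dim).astype("float32")
distances, ids = index.search(query, 5)  # 5 nearest document vectors
```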
Follow-up: Agentic RAG
The search engine we built retrieves FAQ entries. The natural next step is feeding those results into a language model to generate answers. The From RAG to Agents workshop picks up where this one leaves off: it starts with classic RAG over the same FAQ data and then evolves it into an agentic search workflow where the LLM decides what to search for and whether to open a full document.