Ask natural-language questions about D&D 5th Edition rules. Semantic search retrieves the most relevant passages from the Dungeon Master's Guide, Player's Handbook, and Monster Manual, then Claude AI synthesizes a cited answer grounded exclusively in the source text.
RAG Pipeline
- Embed question — the question is encoded into a 384-dim vector by `all-MiniLM-L6-v2` running locally; no API call, ~50 ms
- Semantic search — cosine-similarity search via `sqlite-vec` across ~8,400 pre-indexed chunks from the DMG, PHB & MM; returns the top-5 passages
- Generate answer — the retrieved passages are sent to Claude Haiku with a strict prompt: answer only from the provided context and cite page numbers inline
- Stream + cite — response streams token-by-token via SSE; source citations show exactly which book and page each answer draws from
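The retrieval step above can be sketched in plain Python. Toy 3-dim vectors stand in for the real 384-dim MiniLM embeddings, and `top_k` is a hypothetical helper; in the actual app, `sqlite-vec` performs this ranking inside SQLite:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, chunks, k=5):
    # Rank pre-indexed chunks by similarity to the query embedding and
    # return the k best matches (the job sqlite-vec does at scale).
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(query_vec, c["embedding"]),
        reverse=True,
    )
    return ranked[:k]

# Toy "embeddings"; real chunks carry 384-dim vectors plus their citation data.
chunks = [
    {"source": "PHB", "page": 192, "embedding": [0.9, 0.1, 0.0]},
    {"source": "DMG", "page": 82,  "embedding": [0.0, 1.0, 0.2]},
    {"source": "MM",  "page": 316, "embedding": [0.8, 0.2, 0.1]},
]
results = top_k([1.0, 0.0, 0.0], chunks, k=2)
print([(c["source"], c["page"]) for c in results])  # [('PHB', 192), ('MM', 316)]
```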
Tech Stack
- sentence-transformers — `all-MiniLM-L6-v2` for local embeddings; 80 MB, 384-dim, no API key needed
- sqlite-vec — vector store as a single `.db` file; zero infrastructure, survives restarts
- Claude Haiku 3.5 — fast and cheap (~$0.001/query), 200K context window, streaming via the Anthropic SDK
- pypdf — page-aware PDF extraction; each page treated as a chunk unit
- Flask — Blueprint with `/api/status`, `/api/chat` (SSE), and `/api/anthropic-status` proxy endpoints; a background thread handles indexing on startup
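What makes `sqlite-vec` a single-file store is that vectors are persisted as packed little-endian float32 BLOBs inside an ordinary SQLite database. A stdlib-only sketch of that storage round-trip, omitting the extension's `vec0` virtual table and similarity queries (table and helper names are illustrative):

```python
import sqlite3
import struct

def serialize_f32(vec):
    # Pack floats into the little-endian float32 BLOB layout sqlite-vec uses.
    return struct.pack(f"<{len(vec)}f", *vec)

def deserialize_f32(blob):
    # Each float32 occupies 4 bytes.
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))

db = sqlite3.connect(":memory:")  # the real app points at a persistent .db file
db.execute("CREATE TABLE embeddings (chunk_id INTEGER PRIMARY KEY, vec BLOB)")
db.execute(
    "INSERT INTO embeddings VALUES (?, ?)",
    (1, serialize_f32([0.25, -1.5, 3.0])),  # values chosen to be float32-exact
)

blob = db.execute("SELECT vec FROM embeddings WHERE chunk_id = 1").fetchone()[0]
print(deserialize_f32(blob))  # [0.25, -1.5, 3.0]
```

Because everything lives in one file, the index survives restarts with no separate vector-database process to manage.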
Knowledge Base
- DMG — Dungeon Master's Guide (5e): world-building, encounter design, magic items, monster creation
- PHB — Player's Handbook (5e): character classes, spells, combat rules, equipment
- MM — Monster Manual (5e): monster stat blocks, lore, creature abilities, and encounter guidance
Combined: ~8,400 chunks across all three books. Each chunk stores its source label and page number for citations.
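A minimal sketch of the chunk table implied here: each row carries its source label and page number so a citation can be reconstructed at answer time (schema, sample text, and helper names are illustrative, not the project's actual code):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE chunks (
        id     INTEGER PRIMARY KEY,
        source TEXT NOT NULL,    -- 'DMG', 'PHB', or 'MM'
        page   INTEGER NOT NULL, -- page in the source book (pypdf chunk unit)
        text   TEXT NOT NULL
    )
""")
db.executemany(
    "INSERT INTO chunks (source, page, text) VALUES (?, ?, ?)",
    [
        ("PHB", 192, "placeholder passage text"),
        ("DMG", 82,  "placeholder passage text"),
    ],
)

def citation(chunk_id):
    # Format the inline citation shown alongside a retrieved passage.
    source, page = db.execute(
        "SELECT source, page FROM chunks WHERE id = ?", (chunk_id,)
    ).fetchone()
    return f"{source} p.{page}"

print(citation(1))  # PHB p.192
```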
Design Decisions
- Local embeddings over API — eliminates a second API dependency; ~50ms on CPU with no per-query cost
- sqlite-vec over ChromaDB/FAISS — ~8,400 vectors fit comfortably in a single SQLite file; no separate process or serialization complexity
- Top-5 chunks — ~7,500 chars of context per query; well within Haiku's 200K token window while keeping cost and latency low
- Haiku over Sonnet — for well-structured RAG prompts where the answer is in the context, Haiku performs comparably at 10× lower cost
- Backend proxy for API status — fetches `status.anthropic.com/api/v2/status.json` server-side to avoid CORS; the synthesized RAG Status badge turns green only when both the index and the Claude API are fully operational
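The badge synthesis described above might look like the following sketch; the statuspage.io-style payload shape served by `status.anthropic.com` and all helper names and return values are assumptions, not the project's actual code:

```python
import json

def claude_operational(status_json: str) -> bool:
    # statuspage.io payloads report an "indicator" field; the value "none"
    # means all systems operational (field layout assumed here).
    payload = json.loads(status_json)
    return payload.get("status", {}).get("indicator") == "none"

def rag_status(index_ready: bool, claude_ok: bool) -> str:
    # The combined badge goes green ("ok") only when BOTH the local
    # vector index and the Claude API check out.
    return "ok" if index_ready and claude_ok else "degraded"

sample = '{"status": {"indicator": "none", "description": "All Systems Operational"}}'
print(rag_status(index_ready=True, claude_ok=claude_operational(sample)))  # ok
print(rag_status(index_ready=False, claude_ok=True))                       # degraded
```

Fetching the JSON server-side keeps the browser out of the CORS picture; the frontend only ever polls the app's own endpoints.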