Simple RAG is best for documents under 50 pages and fast prototyping.

Enhanced RAG is best for large documents, multi-document workloads, and caching.

| Feature | Simple RAG | Enhanced RAG | Enhanced + Cache |
|---|---|---|---|
| Document Size | < 50 pages | Unlimited | Unlimited |
| Setup Complexity | ✓ Easy (1 API) | Moderate (3 APIs) | Moderate (3 APIs + Redis) |
| Multi-document | ✗ No | ✓ Yes | ✓ Yes |
| Precision | 85% | 95% | 95% |
| Cost/1K queries | $15 | $17.50 | $3.50 🎯 |
| Latency | 2-3s | 2.5-3.5s | <10ms (cached) |
| Repeated Queries | Full cost each time | Full cost each time | ✓ Free from cache |
| Evaluation Support | ✓ Yes | ✓ Yes | ✓ Yes |
| Best for | Prototypes | Production | High-volume production |
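
The cost row can be read as simple arithmetic. If cached answers cost nothing and every uncached query costs the Enhanced RAG rate of $17.50 per 1K, then $3.50 per 1K corresponds to an 80% cache hit rate. That hit rate is inferred from the two figures; the table does not state it. The sketch below shows the calculation alongside a minimal exact-match cache. The names `CachedRAG` and `effective_cost_per_1k` are hypothetical, and a plain dictionary stands in for Redis.

```python
import hashlib
from typing import Callable, Dict


def effective_cost_per_1k(full_cost_per_1k: float, hit_rate: float) -> float:
    """Blended cost per 1K queries, assuming cache hits cost nothing."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return full_cost_per_1k * (1.0 - hit_rate)


class CachedRAG:
    """Exact-match answer cache in front of a RAG pipeline (hypothetical wrapper)."""

    def __init__(self, answer_fn: Callable[[str], str]) -> None:
        self.answer_fn = answer_fn        # the expensive RAG call
        self.store: Dict[str, str] = {}   # in-memory stand-in for Redis
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(query: str) -> str:
        # Normalise case and whitespace so trivially different queries share a key.
        normalised = " ".join(query.lower().split())
        return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

    def ask(self, query: str) -> str:
        key = self._key(query)
        if key in self.store:
            self.hits += 1
            return self.store[key]
        # Cache miss: pay for the full pipeline once, then remember the answer.
        self.misses += 1
        answer = self.answer_fn(query)
        self.store[key] = answer
        return answer

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Calling `effective_cost_per_1k(17.50, rag.hit_rate)` after a run gives the blended cost for the observed traffic. An exact-match key only helps when users repeat the same question, so the savings depend entirely on how repetitive the real query stream is.
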

Automated ground-truth generation and a 4-metric evaluation system.

`qa_generator.py` → run on test dataset → `rag_evaluator.py`

The four metrics:

- Answer is supported by the retrieved context
- Answer directly addresses the question
- Retrieval quality (the right chunks are retrieved)
- Answer matches the ground-truth answer

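
To make the four checks concrete, here is a minimal sketch that approximates each one with token overlap. It is not how `rag_evaluator.py` scores answers, since that script's internals are not shown here; production evaluators typically use an LLM judge or embeddings. The metric keys (`groundedness`, `relevance`, `retrieval`, `correctness`) and the `chunk_threshold` parameter are hypothetical names chosen for illustration.

```python
import re
from typing import Dict, List, Set


def _tokens(text: str) -> Set[str]:
    """Lowercased word tokens; a crude stand-in for semantic comparison."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def _coverage(target: str, source: str) -> float:
    """Fraction of target tokens that also appear in source."""
    target_tokens = _tokens(target)
    if not target_tokens:
        return 0.0
    return len(target_tokens & _tokens(source)) / len(target_tokens)


def score_example(
    question: str,
    answer: str,
    contexts: List[str],
    ground_truth: str,
    chunk_threshold: float = 0.8,  # assumed cut-off for calling a chunk "right"
) -> Dict[str, float]:
    """Score one Q&A example on four lexical-overlap proxies, each in [0, 1]."""
    joined_context = " ".join(contexts)

    # Retrieval quality: share of retrieved chunks that cover the ground truth.
    right_chunks = [c for c in contexts if _coverage(ground_truth, c) >= chunk_threshold]
    retrieval = len(right_chunks) / len(contexts) if contexts else 0.0

    # Match with ground truth: token-level F1 between answer and ground truth.
    precision = _coverage(answer, ground_truth)
    recall = _coverage(ground_truth, answer)
    f1 = 0.0 if precision + recall == 0 else 2 * precision * recall / (precision + recall)

    return {
        "groundedness": _coverage(answer, joined_context),  # answer supported by context
        "relevance": _coverage(question, answer),           # answer addresses the question
        "retrieval": retrieval,
        "correctness": f1,
    }
```

Lexical overlap rewards shared wording rather than shared meaning, so these scores are only useful as a cheap smoke test before wiring in a stronger judge.
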
1. Generate 50-100 Q&A pairs
2. Test both systems
3. Apply 4-metric scoring
4. Iterate on weak areas
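
The loop above can be sketched as a small comparison harness: run each system over the same Q&A pairs, average every metric, and surface the weakest one as the next thing to work on. The function names (`evaluate`, `compare`, `weakest_metric`) and the scorer signature are assumptions made for this sketch, not the interface of `rag_evaluator.py`.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

QAPair = Tuple[str, str]                               # (question, ground-truth answer)
System = Callable[[str], str]                          # question -> answer
Scorer = Callable[[str, str, str], Dict[str, float]]   # (question, answer, truth) -> scores


def evaluate(system: System, qa_pairs: List[QAPair], scorer: Scorer) -> Dict[str, float]:
    """Mean score per metric for one system over the whole test set."""
    if not qa_pairs:
        raise ValueError("qa_pairs must not be empty")
    rows = [scorer(question, system(question), truth) for question, truth in qa_pairs]
    return {metric: mean(row[metric] for row in rows) for metric in rows[0]}


def compare(
    systems: Dict[str, System], qa_pairs: List[QAPair], scorer: Scorer
) -> Dict[str, Dict[str, float]]:
    """Evaluate every system on the same Q&A pairs so the scores are comparable."""
    return {name: evaluate(fn, qa_pairs, scorer) for name, fn in systems.items()}


def weakest_metric(report: Dict[str, float]) -> str:
    """Name of the lowest-scoring metric: the area to iterate on next."""
    return min(report, key=lambda metric: report[metric])
```

Keeping the Q&A set fixed between iterations matters here: if the questions change along with the system, a score difference no longer says anything about the change being tested.
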