Quick Start

This quick start runs a local development eval and shows where RAGhelm’s release-control artifacts fit. Local output is useful for development mechanics, but it is not production release evidence.

1. Install dependencies

uv sync --extra dev

For the dashboard demo:

cd packages/dashboard
pnpm install
cd ../..

2. Run a local eval smoke test

uv run python -m raghelm eval --suite quick

This runs 10 examples from the golden dataset in local mode and:

validates the dataset schema
runs deterministic local retrieval/generation paths
computes retrieval metrics: Recall@5, MRR, NDCG@5
scores generation faithfulness, relevance, completeness, and overall quality
saves a JSON result under data/eval_results/

Local mode may use mocks or offline heuristics. Do not use local eval output for production claims, README badges, or public proof.
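For orientation, the three retrieval metrics named above have standard definitions that fit in a few lines. The sketch below assumes binary relevance and one ranked list of document IDs per example. The function names are hypothetical and this is not RAGhelm's implementation, which may differ in relevance grading or edge-case handling.

```python
import math


def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of the relevant documents that appear in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    relevant = set(relevant_ids)
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(examples):
    """MRR over a list of (ranked_ids, relevant_ids) pairs."""
    if not examples:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in examples) / len(examples)


def ndcg_at_k(ranked_ids, relevant_ids, k=5):
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    relevant = set(relevant_ids)
    dcg = sum(
        1.0 / math.log2(position + 1)
        for position, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(position + 1) for position in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0
```

A score of 1.0 on all three, as in the example output later in this guide, means every relevant document was retrieved in the top 5 and ranked ahead of all irrelevant ones.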

3. Query the canonical Python pipeline

The canonical AI runtime is currently a Python FastAPI service. Its /query endpoint returns the shared QueryResult contract defined in raghelm/contracts.py, and production eval uses the same ReferenceQueryPipeline implementation.

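# Terminal 1: start the API server (it keeps running in the foreground)
uv run uvicorn raghelm.api.server:app --port 3000 --reload

# Terminal 2: send a query once the server is up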
curl -X POST http://localhost:3000/query \
  -H 'Content-Type: application/json' \
  -d '{"query": "What release evidence is required before this RAG change can ship?", "top_k": 5}'
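The same request can be issued from Python using only the standard library. This sketch assumes the server started above is listening on localhost:3000. The helper names are hypothetical, and the response is returned as a plain dictionary because the fields of QueryResult are not listed in this guide.

```python
import json
import urllib.request


def build_query_request(query, top_k=5, base_url="http://localhost:3000"):
    """Build a POST /query request equivalent to the curl command."""
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query, top_k=5, base_url="http://localhost:3000", timeout=30):
    """Send the request and decode the JSON response body into a dict."""
    request = build_query_request(query, top_k, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `run_query("What release evidence is required before this RAG change can ship?")` should return the decoded response. Without a server it raises `urllib.error.URLError`.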

Local API queries default to explicit stub generation (RAGHELM_GENERATION_MODE=stub) and may return no retrieved context if Pinecone is not configured. That output demonstrates mechanics only; it is not production proof. To require hosted generation, set RAGHELM_GENERATION_MODE=provider and provide OPENAI_API_KEY. To require production pipeline behavior, set RAGHELM_PIPELINE_MODE=production and supply the required Pinecone configuration. If that configuration is missing, the pipeline fails closed and returns an error.
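The mode switches above follow a fail-closed pattern: when a stricter mode is requested and its prerequisites are missing, the runtime refuses to proceed. A minimal sketch of that pattern follows. RAGHELM_GENERATION_MODE, RAGHELM_PIPELINE_MODE, and OPENAI_API_KEY come from the text above. The Pinecone variable names, the default pipeline mode, the function name, and the error type are assumptions for illustration and may not match RAGhelm's code.

```python
import os

# Assumed names: this guide does not list the required Pinecone settings.
ASSUMED_PINECONE_VARS = ("PINECONE_API_KEY", "PINECONE_INDEX")


class ConfigError(RuntimeError):
    """Raised when a requested mode is missing its required settings."""


def check_runtime_config(env=None):
    """Return the effective modes, or raise instead of silently falling back."""
    env = os.environ if env is None else env
    generation_mode = env.get("RAGHELM_GENERATION_MODE", "stub")
    # The default value "local" is an assumption; only "production" is documented.
    pipeline_mode = env.get("RAGHELM_PIPELINE_MODE", "local")

    missing = []
    if generation_mode == "provider" and not env.get("OPENAI_API_KEY"):
        missing.append("OPENAI_API_KEY")
    if pipeline_mode == "production":
        missing.extend(name for name in ASSUMED_PINECONE_VARS if not env.get(name))

    if missing:
        # Fail closed: refuse to run rather than degrade to stub behavior.
        raise ConfigError("missing required configuration: " + ", ".join(missing))
    return {"generation_mode": generation_mode, "pipeline_mode": pipeline_mode}
```

The design point is that a misconfigured production request surfaces as an explicit error, so stub output can never be mistaken for production behavior.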

4. Understand the output

Example result shape:

{
  "suite": "quick",
  "total_examples": 10,
  "metrics": {
    "recall@5": 1.0,
    "mrr": 1.0,
    "ndcg@5": 1.0
  },
  "generation_scores": {
    "faithfulness": 2.48,
    "relevance": 5.0,
    "completeness": 5.0,
    "overall": 4.16
  }
}

Treat this result as an input to eval evidence, not as a release decision. Production release decisions require a ReadinessScorecard linked to a valid RAGRunManifest.
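A result with this shape can be inspected with a few lines of Python. The sketch below parses the example above and reports the weakest generation dimension. The helper name is hypothetical. In this example, overall (4.16) equals the arithmetic mean of faithfulness, relevance, and completeness, but whether RAGhelm always derives it that way is an assumption that this guide does not confirm.

```python
import json

EXAMPLE_RESULT = """
{
  "suite": "quick",
  "total_examples": 10,
  "metrics": {"recall@5": 1.0, "mrr": 1.0, "ndcg@5": 1.0},
  "generation_scores": {
    "faithfulness": 2.48,
    "relevance": 5.0,
    "completeness": 5.0,
    "overall": 4.16
  }
}
"""


def weakest_generation_score(result):
    """Return (name, value) of the lowest generation score, ignoring 'overall'."""
    scores = {k: v for k, v in result["generation_scores"].items() if k != "overall"}
    name = min(scores, key=scores.get)
    return name, scores[name]


result = json.loads(EXAMPLE_RESULT)
name, value = weakest_generation_score(result)
print(f"weakest dimension: {name} = {value}")

# In this example, 'overall' matches the mean of the three component scores.
components = [v for k, v in result["generation_scores"].items() if k != "overall"]
mean_score = sum(components) / len(components)
print(f"mean of components: {mean_score:.2f}")
```

Here perfect retrieval metrics sit alongside a low faithfulness score, which is a reminder to read the generation scores separately from the retrieval metrics.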

5. Run the local release-control dashboard demo

cd packages/dashboard
pnpm dev
# Open http://localhost:5173

The demo dashboard shows a local ReadinessScorecard and linked RAGRunManifest. It demonstrates the release-control object model from ADR-002.
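To make the linkage concrete, here is a deliberately small sketch of a scorecard that counts toward a release decision only when it points at a manifest that exists and is valid. Only the names ReadinessScorecard and RAGRunManifest come from this guide. Every field, the validity rule, and the gate function are hypothetical stand-ins, and ADR-002 remains the authority on the real object model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RAGRunManifest:
    run_id: str        # hypothetical field
    mode: str          # hypothetical field: "local" or "production"
    dataset_hash: str  # hypothetical field

    def is_valid(self) -> bool:
        # Hypothetical rule: a manifest needs both an ID and a dataset hash.
        return bool(self.run_id) and bool(self.dataset_hash)


@dataclass(frozen=True)
class ReadinessScorecard:
    manifest_run_id: str  # hypothetical link to the manifest
    passed: bool          # hypothetical summary verdict


def release_allowed(scorecard, manifests):
    """Gate a release on a scorecard linked to a valid production manifest.

    `manifests` maps run_id to RAGRunManifest.
    """
    manifest = manifests.get(scorecard.manifest_run_id)
    if manifest is None or not manifest.is_valid():
        return False  # no valid linked manifest, so no release decision
    if manifest.mode != "production":
        return False  # local output is not production release evidence
    return scorecard.passed
```

Under these assumptions, a passing scorecard linked to a local-mode manifest is still rejected, which mirrors the rule that local output demonstrates mechanics only.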

6. Try ingestion safely

Use dry-run mode before writing vectors to the Pinecone reference backend:

uv run python -m raghelm ingest ./knowledge --dry-run --namespace default

For the Pinecone-backed reference runtime, ADR-004 governs embedding dimensions and ADR-005 governs the namespace strategy of the Pinecone reference backend.
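One way to enforce the dry-run-first habit in scripts is to wrap the CLI so that a real run starts only after a dry run exits cleanly. The sketch below builds on the command shown above. The wrapper functions are hypothetical, and the sketch assumes that omitting --dry-run performs the real write, which this guide implies but does not state.

```python
import subprocess


def ingest_command(path, namespace="default", dry_run=True):
    """Build the ingest command line; dry-run is the default."""
    cmd = ["uv", "run", "python", "-m", "raghelm", "ingest", path]
    if dry_run:
        cmd.append("--dry-run")
    cmd += ["--namespace", namespace]
    return cmd


def ingest_with_dry_run_first(path, namespace="default", run=subprocess.run):
    """Run a dry run, then the real ingest only if the dry run exited with 0.

    Returns True only when both runs succeed. `run` is injectable for testing.
    """
    rehearsal = run(ingest_command(path, namespace, dry_run=True))
    if rehearsal.returncode != 0:
        return False  # stop before any vectors are written
    real = run(ingest_command(path, namespace, dry_run=False))
    return real.returncode == 0
```

Calling `ingest_with_dry_run_first("./knowledge")` from the repository root would run both steps in sequence. Check the CLI reference before relying on the non-dry-run form.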

Next steps

Read the System Overview
Review the Architecture Decision Records
Learn about the golden dataset
Read the metrics reference
See Running Evaluations for production mode
See the CLI reference for ingestion and eval options