CLI Commands

The raghelm CLI currently exposes two commands: eval, which runs evaluations, and ingest, which loads documents into the Pinecone reference backend. CLI output can feed release-control evidence, but output from local or demo runs must not be presented as production proof.

Contract-freeze command namespace

The investable-MVP workflow freezes these command names so parallel workstreams do not invent competing entry points:

Command	Status	Purpose
`uv run python -m raghelm eval`	Implemented	Run local or production-capable evals.
`uv run python -m raghelm ingest`	Implemented	Ingest or dry-run source documents into the reference backend.
`uv run python -m raghelm query`	Reserved	Run one query through the canonical `QueryPipeline` and emit `QueryResult`.
`uv run python -m raghelm release check`	Reserved	Emit `ReadinessScorecard` + `RAGRunManifest` and enforce `ReleaseGate`.

Reserved commands must not be implemented with provider-specific or lane-local response shapes. They should emit or consume the shared contracts in raghelm/contracts.py.

Canonical query runtime

The query runtime is currently implemented as the Python FastAPI /query endpoint; the query CLI command is reserved but not yet implemented:

uv run uvicorn raghelm.api.server:app --port 3000 --reload
curl -X POST http://localhost:3000/query \
  -H 'Content-Type: application/json' \
  -d '{"query": "What changed in this release?", "top_k": 5}'

/query returns the shared QueryResult contract, including routing, cost, latency, retrieved_chunks, and TargetAnswer. Stub generation is the explicit default (RAGHELM_GENERATION_MODE=stub). Provider-backed generation is enabled through configuration (RAGHELM_GENERATION_MODE=provider plus OPENAI_API_KEY) and fails closed when required credentials are missing. Production eval uses the same ReferenceQueryPipeline implementation.
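The fail-closed behavior described above can be illustrated with a minimal sketch. It is not raghelm's actual code: `resolve_generation_mode` and `MissingCredentialsError` are hypothetical names, and rejecting any mode other than `stub` or `provider` is an assumption. Only the two environment variable names come from this page.

```python
import os


class MissingCredentialsError(RuntimeError):
    """Raised when provider mode is requested without required credentials."""


def resolve_generation_mode(env=None):
    """Return "stub" or "provider" (hypothetical helper, not raghelm's API)."""
    env = os.environ if env is None else env
    # Stub is the documented default when the variable is unset.
    mode = env.get("RAGHELM_GENERATION_MODE", "stub")
    if mode not in ("stub", "provider"):
        # Assumption: unknown modes are rejected rather than ignored.
        raise ValueError(f"unsupported RAGHELM_GENERATION_MODE: {mode!r}")
    if mode == "provider" and not env.get("OPENAI_API_KEY"):
        # Fail closed: do not silently fall back to stub output.
        raise MissingCredentialsError("provider mode requires OPENAI_API_KEY")
    return mode
```

The point of the design is that a misconfigured provider run raises an error instead of quietly producing stub answers that could be mistaken for real ones.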

`eval`

Run the evaluation suite against the golden dataset.

uv run python -m raghelm eval [OPTIONS]

Flag	Values	Purpose
`--suite`	`quick`, `full`	`quick` runs 10 examples; `full` runs the entire dataset.
`--mode`	`local`, `production`	Choose local deterministic/dev behavior or hosted Pinecone/LLM behavior.
`--judge-mode`	`offline_heuristic`, `production`	Choose local heuristic scoring or production LLM-as-judge scoring.
`--smoke`	boolean flag	Validate production configuration without running the full suite.

Examples:

# Local deterministic/dev eval
uv run python -m raghelm eval --suite quick --mode local --judge-mode offline_heuristic

# Production eval path, requires production credentials
uv run python -m raghelm eval --suite full --mode production --judge-mode production

# Production smoke test
uv run python -m raghelm eval --mode production --judge-mode production --smoke

Production release decisions should be represented by scorecards and manifests, not raw local eval output.
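When eval runs are driven from a script or CI job, it can help to validate flag values before spawning the process. The sketch below assembles the argument list from the flags in the table above. `build_eval_command` and `run_eval` are hypothetical helper names, not part of raghelm. Because this page does not state default values for the flags, `--mode` and `--judge-mode` are required here.

```python
import subprocess

# Allowed values copied from the flag table for `eval`.
SUITES = ("quick", "full")
MODES = ("local", "production")
JUDGE_MODES = ("offline_heuristic", "production")


def build_eval_command(*, mode, judge_mode, suite=None, smoke=False):
    """Return the argv list for one `raghelm eval` run (hypothetical helper)."""
    if mode not in MODES:
        raise ValueError(f"--mode must be one of {MODES}, got {mode!r}")
    if judge_mode not in JUDGE_MODES:
        raise ValueError(f"--judge-mode must be one of {JUDGE_MODES}, got {judge_mode!r}")
    if suite is not None and suite not in SUITES:
        raise ValueError(f"--suite must be one of {SUITES}, got {suite!r}")

    cmd = ["uv", "run", "python", "-m", "raghelm", "eval"]
    if suite is not None:
        cmd += ["--suite", suite]
    cmd += ["--mode", mode, "--judge-mode", judge_mode]
    if smoke:
        cmd.append("--smoke")
    return cmd


def run_eval(**kwargs):
    """Run the eval; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_eval_command(**kwargs), check=True)
```

For example, `build_eval_command(mode="production", judge_mode="production", smoke=True)` reproduces the smoke-test invocation shown above.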

`ingest`

Ingest .md and .txt files into the Pinecone reference backend, or simulate ingestion in dry-run mode.

uv run python -m raghelm ingest [OPTIONS] SOURCE

Argument	Description
`SOURCE`	Directory or file containing `.md` / `.txt` source documents.

Flag	Default	Description
`--namespace`, `-n`	`default`	Target Pinecone namespace inside the selected compatible index.
`--chunk-size`	`512`	Maximum characters per chunk.
`--dry-run`	`false`	Simulate without calling embedding or vector DB APIs.

Examples:

# Safe preview
uv run python -m raghelm ingest ./docs --dry-run --namespace default

# Pinecone-backed reference runtime ingestion
uv run python -m raghelm ingest /path/to/knowledge-base \
  --namespace semantic-512-small \
  --chunk-size 512
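For a rough sense of what a given `--chunk-size` implies before running an ingest, the sketch below splits text into fixed windows of at most `chunk_size` characters and counts the chunks per file. This is a simplification for illustration only: this page states just that `--chunk-size` is the maximum number of characters per chunk, and raghelm's real chunker may split on sentence or heading boundaries. `chunk_text` and `preview_source` are hypothetical helpers, and recursing into subdirectories is an assumption.

```python
from pathlib import Path

# File types named in the ingest description.
SUPPORTED_SUFFIXES = {".md", ".txt"}


def chunk_text(text, chunk_size=512):
    """Split text into consecutive pieces of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def preview_source(source, chunk_size=512):
    """Map each .md/.txt file under source to its estimated chunk count."""
    root = Path(source)
    if root.is_file():
        files = [root]
    else:
        # Assumption: subdirectories are included.
        files = sorted(p for p in root.rglob("*") if p.is_file())
    return {
        str(path): len(chunk_text(path.read_text(encoding="utf-8"), chunk_size))
        for path in files
        if path.suffix.lower() in SUPPORTED_SUFFIXES
    }
```

Treat the counts as an estimate only; the `--dry-run` flag remains the way to see what the CLI itself would do.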

Evidence and safety expectations

The ingestion path supports the Pinecone compatibility profile from ADR-005:

deterministic vector IDs
chunk checksums in metadata
explicit backend mode: cloud, local, auto, or dry-run
namespace-level stats where available
dry-run mode for safe simulation
production fail-closed behavior when required config is missing
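To make the first two items concrete, here is one way deterministic vector IDs and chunk checksums could be derived. This is a sketch under stated assumptions, not the scheme defined in ADR-005: the hash inputs, the 32-character ID length, the metadata field names, and the helper names `vector_id`, `chunk_checksum`, and `build_record` are all hypothetical.

```python
import hashlib


def chunk_checksum(text):
    """SHA-256 hex digest of the chunk text, used to detect content changes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def vector_id(namespace, source_path, chunk_index):
    """Derive a stable ID from the chunk's location (assumed scheme)."""
    # A unit-separator character keeps the fields unambiguous when joined.
    key = f"{namespace}\x1f{source_path}\x1f{chunk_index}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:32]


def build_record(namespace, source_path, chunk_index, text):
    """Assemble the ID and metadata for one chunk (field names are assumptions)."""
    return {
        "id": vector_id(namespace, source_path, chunk_index),
        "metadata": {
            "source": source_path,
            "chunk_index": chunk_index,
            "checksum": chunk_checksum(text),
        },
    }
```

Because the ID depends only on where the chunk lives, re-ingesting the same file overwrites existing vectors instead of duplicating them, while a changed checksum signals that the chunk's content has changed.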

Audit logs are written to data/ingestion_audit.json after ingestion runs.