Scorer API

Generation scoring estimates answer quality for eval runs. Scores feed into release evidence but do not constitute a release decision on their own.

`score_generation`

from raghelm.eval.scorer import score_generation

scores = score_generation(
    question="What happens at 0 STR?",
    answer="The character is DEAD.",
    context="When a character reaches 0 STR, they are DEAD.",
    expected_answer="The character is DEAD.",
)

Example shape:

{"faithfulness": 5.0, "relevance": 5.0, "completeness": 5.0, "overall": 5.0}
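A caller may want to confirm that a result has this shape before recording it as evidence. The following is a minimal sketch, not part of RAGhelm: `check_score_shape` and `EXPECTED_KEYS` are hypothetical names, and the only assumption taken from the example above is that the four keys are present with numeric values. No score range is assumed.

```python
from numbers import Real
from typing import Mapping

# Keys taken from the example shape above; the constant name is hypothetical.
EXPECTED_KEYS = {"faithfulness", "relevance", "completeness", "overall"}


def check_score_shape(scores: Mapping[str, object]) -> None:
    """Raise ValueError if a score dict lacks an expected key or holds a non-numeric value."""
    missing = EXPECTED_KEYS - set(scores)
    if missing:
        raise ValueError(f"missing score keys: {sorted(missing)}")
    for key in sorted(EXPECTED_KEYS):
        value = scores[key]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            raise ValueError(f"score {key!r} is not numeric: {value!r}")


# Usage with the example shape shown above.
check_score_shape(
    {"faithfulness": 5.0, "relevance": 5.0, "completeness": 5.0, "overall": 5.0}
)
```

A check like this only guards the structure of the result. It says nothing about whether the scores themselves are trustworthy.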

Local and development scoring may use deterministic heuristics. Production judge mode should fail closed: when required LLM provider configuration is missing, it should report an error rather than produce scores.
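The fail-closed rule above can be illustrated with a short sketch. This is not RAGhelm's implementation: `select_scoring_mode`, `JudgeConfigError`, the mode strings, and the setting names `JUDGE_PROVIDER` and `JUDGE_API_KEY` are all hypothetical, chosen only to show the pattern of refusing to run instead of silently falling back to heuristics.

```python
from typing import Mapping


class JudgeConfigError(RuntimeError):
    """Raised when production judge mode lacks required configuration (hypothetical)."""


# Hypothetical setting names, for illustration only.
REQUIRED_JUDGE_SETTINGS = ("JUDGE_PROVIDER", "JUDGE_API_KEY")


def select_scoring_mode(mode: str, settings: Mapping[str, str]) -> str:
    """Return the scorer to use, failing closed in production."""
    if mode != "production":
        # Local/dev runs may use deterministic heuristics.
        return "heuristic"
    missing = [name for name in REQUIRED_JUDGE_SETTINGS if not settings.get(name)]
    if missing:
        # Fail closed: no fallback to heuristics in production.
        raise JudgeConfigError(f"missing judge configuration: {', '.join(missing)}")
    return "llm_judge"
```

The design point is that a missing key in production stops the run with an explicit error, so heuristic scores cannot be mistaken for judge scores in release evidence.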