Scorer API
score_generation
from raghelm.eval.scorer import score_generation

scores = score_generation(
    question="What happens at 0 STR?",
    answer="The character is DEAD.",
    context="When a character reaches 0 STR, they are DEAD.",
    expected_answer="The character is DEAD.",
)
# {"faithfulness": 5.0, "relevance": 5.0, "completeness": 5.0, "overall": 5.0}
The mock implementation scores answers with keyword-matching heuristics. In production, score_generation calls an LLM-as-judge instead.