
📈 Evaluation & Scoring

Comprehensive Scoring Strategies

Multiple metrics for accurate evaluation

🎯 Scoring Strategies

Exact Match

from llm_evaluation_framework.evaluation.scoring_strategies import ExactMatchScorer

scorer = ExactMatchScorer()
score = scorer.score(prediction, reference)
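To make the idea concrete, here is a minimal standalone sketch of what an exact-match check typically computes. It does not use `ExactMatchScorer`; the function name `exact_match` and the `normalize` flag are hypothetical, and the framework's own scorer may normalize text differently or not at all.

```python
def exact_match(prediction: str, reference: str, normalize: bool = True) -> float:
    """Return 1.0 if the two strings match, otherwise 0.0.

    Illustrative only: `normalize` is an assumed option, not a documented
    parameter of the framework.
    """
    if normalize:
        # Lowercase and collapse whitespace so trivial differences are ignored.
        prediction = " ".join(prediction.lower().split())
        reference = " ".join(reference.lower().split())
    return 1.0 if prediction == reference else 0.0


score_same = exact_match("Paris ", "paris")
score_diff = exact_match("Paris", "London")
```

Exact match is strict by design: it suits short, closed-form answers (labels, numbers, entity names) and gives no partial credit for paraphrases.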

Semantic Similarity

from llm_evaluation_framework.evaluation.scoring_strategies import SemanticSimilarityScorer

scorer = SemanticSimilarityScorer()
score = scorer.score(prediction, reference)
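Semantic similarity scorers usually compare vector representations of the two texts with cosine similarity. The sketch below illustrates only that comparison step, using word-count vectors as a stand-in for a real embedding model. It is an assumption about the general technique, not a description of how `SemanticSimilarityScorer` is implemented, and the function name `cosine_similarity` is hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(prediction: str, reference: str) -> float:
    """Cosine similarity between bag-of-words count vectors of two texts.

    A real semantic scorer would replace the Counter vectors with dense
    embeddings; the cosine formula itself stays the same.
    """
    a = Counter(prediction.lower().split())
    b = Counter(reference.lower().split())
    # Dot product over the tokens the two texts share.
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty text has no direction to compare.
    return dot / (norm_a * norm_b)


similarity = cosine_similarity("the cat sat", "the cat ran")
```

Word-count vectors only capture lexical overlap, so "car" and "automobile" would score 0 here. Capturing that kind of paraphrase is the reason to use embeddings in practice.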

BLEU Score

from llm_evaluation_framework.evaluation.scoring_strategies import BLEUScorer

scorer = BLEUScorer()
score = scorer.score(prediction, reference)
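For reference, BLEU combines clipped n-gram precisions with a brevity penalty. The following is a simplified, single-reference, unsmoothed sketch of that calculation. The names `ngram_counts` and `simple_bleu` are hypothetical, whitespace tokenization is assumed, and `BLEUScorer` may tokenize, smooth, or weight n-grams differently.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(prediction: str, reference: str, max_n: int = 4) -> float:
    """Sentence-level BLEU against one reference, without smoothing."""
    pred = prediction.split()
    ref = reference.split()
    if not pred:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        pred_counts = ngram_counts(pred, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(pred_counts.values())
        # Clip each n-gram's count by how often it occurs in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in pred_counts.items())
        if total == 0 or overlap == 0:
            return 0.0  # Unsmoothed BLEU is zero if any n-gram order has no match.
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize predictions shorter than the reference.
    bp = 1.0 if len(pred) > len(ref) else math.exp(1 - len(ref) / len(pred))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


bleu_identical = simple_bleu("the cat sat on the mat", "the cat sat on the mat")
```

Because this version is unsmoothed, any prediction shorter than `max_n` tokens scores 0. Production implementations usually apply a smoothing method to avoid that on short sentences.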

📊 Metrics

| Metric | Use Case | Range |
|---|---|---|
| Accuracy | Overall performance | 0-1 |
| Precision | False positives | 0-1 |
| Recall | False negatives | 0-1 |
| F1 Score | Balanced metric | 0-1 |
| Cost | Economic efficiency | $0+ |
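The four classification metrics above follow from the counts of true and false positives and negatives. Here is a minimal sketch for binary labels. The function name `classification_metrics` and the returned dictionary keys are hypothetical and not part of the framework's API.

```python
def classification_metrics(predictions, references):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(predictions, references))
    tp = sum(1 for p, r in pairs if p and r)          # predicted positive, is positive
    fp = sum(1 for p, r in pairs if p and not r)      # predicted positive, is negative
    fn = sum(1 for p, r in pairs if not p and r)      # predicted negative, is positive
    tn = sum(1 for p, r in pairs if not p and not r)  # predicted negative, is negative

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    # Precision drops as false positives rise; recall drops as false negatives rise.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


metrics = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

In this example there are 2 true positives, 1 false positive, 1 false negative, and 1 true negative, so accuracy is 0.6 and precision, recall, and F1 are each 2/3.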
