
Feature/trust score metric 17823921494002512507 #2625

Open
Danish2op wants to merge 5 commits into confident-ai:main from Danish2op:feature/trust-score-metric-17823921494002512507

Conversation

@Danish2op

Description

This PR addresses issue #2586 by implementing a new TrustScoreMetric. This metric evaluates the trustworthiness of an LLM's response based on the sources used during RAG retrieval, categorized by customizable trust tiers.

DeepEval currently evaluates LLM outputs on dimensions like faithfulness, relevance, and hallucination. However, two responses can score identically on faithfulness but have completely different trust profiles depending on where they sourced their information (e.g., SEC filings vs. unverified blog posts). This new orthogonal dimension ensures users can accurately measure output trust based on their retrieval sources.

What was built

A new TrustScoreMetric class that:

  1. Accepts source_tiers and threshold: Takes a dictionary mapping source identifiers/keywords to tier numbers (T1=most trusted, T5=least trusted), and a success threshold float (default 0.7).
  2. Follows the Standard DeepEval Interface: Implements both measure and a_measure on an LLMTestCase, exactly like the other base metrics, and supports the standard BaseMetric properties.
  3. Implements Accurate Scoring Logic (a minimal sketch follows this list):
    • Inspects test_case.retrieval_context.
    • Uses case-insensitive substring matching to map context chunks to user-provided source keys.
    • Maps match tiers to scores: T1=1.0, T2=0.8, T3=0.6, T4=0.4, T5=0.2. Unmatched sources receive a default neutral score of 0.5.
    • Computes the average of all chunk scores as the final trust score.
    • Produces a detailed human-readable reason string explaining which sources were found and their tiers.
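
The scoring path can be summarized in a short, hedged sketch. Helper names like _score_chunk and _trust_score are illustrative rather than the PR's actual internals, and the fallback for an empty retrieval_context is an assumption (the PR only notes that the test suite covers that case):

TIER_SCORES = {1: 1.0, 2: 0.8, 3: 0.6, 4: 0.4, 5: 0.2}
UNMATCHED_SCORE = 0.5  # neutral default for chunks matching no source key

def _score_chunk(chunk: str, source_tiers: dict) -> float:
    # Case-insensitive substring match against each user-provided source key.
    lowered = chunk.lower()
    for source, tier in source_tiers.items():
        if source.lower() in lowered:
            return TIER_SCORES[tier]
    return UNMATCHED_SCORE

def _trust_score(retrieval_context: list, source_tiers: dict) -> float:
    # The final trust score is the average of the per-chunk scores.
    scores = [_score_chunk(chunk, source_tiers) for chunk in retrieval_context]
    return sum(scores) / len(scores) if scores else 0.0  # empty-context behavior assumed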

Changes Made

  • Added the deepeval/metrics/trust_score directory with __init__.py and trust_score.py.
  • Exported the new metric from deepeval/metrics/__init__.py.
  • Added a full test suite, tests/test_trust_score_metric.py, validating various scenarios: high/low trust, mixed and unmatched sources, threshold pass/fail, and empty retrieval contexts (one illustrative case is sketched below).
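
To illustrate the kind of scenario the suite covers, here is a hypothetical test for the neutral-score rule; the actual assertions in tests/test_trust_score_metric.py may differ:

from deepeval.metrics import TrustScoreMetric
from deepeval.test_case import LLMTestCase

def test_unmatched_source_scores_neutral():
    # A chunk that matches no source key should fall back to the neutral 0.5.
    metric = TrustScoreMetric(source_tiers={"SEC Filings": 1}, threshold=0.7)
    test_case = LLMTestCase(
        input="What is Apple's revenue?",
        actual_output="Apple's revenue is 394 billion.",
        retrieval_context=["A random forum comment about Apple's revenue."],
    )
    metric.measure(test_case)
    assert metric.score == 0.5      # unmatched chunks default to 0.5
    assert metric.success is False  # 0.5 falls below the 0.7 threshold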

How to use

from deepeval.metrics import TrustScoreMetric
from deepeval.test_case import LLMTestCase

# Map sources to tiers (Tier 1 is most trusted, Tier 5 is least)
source_tiers = {
    "SEC Filings": 1,
    "Verified Blog": 2,
    "Unverified Post": 4
}

metric = TrustScoreMetric(source_tiers=source_tiers, threshold=0.7)

test_case = LLMTestCase(
    input="What is Apple's revenue?",
    actual_output="Apple's revenue is 394 billion.",
    retrieval_context=["According to SEC filings, Apple's revenue is 394 billion."]
)

metric.measure(test_case)
print(metric.score)   # 1.0
print(metric.reason)  # Explains the specific tier mapped for the chunk
print(metric.success) # True
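
Because the final score is an average, mixed contexts land between tiers. As a hypothetical extension of the example above (reusing the same source_tiers and metric), two chunks matching T1 and T4 average to (1.0 + 0.4) / 2 = 0.7; the PR does not say whether a score exactly at the threshold passes, so success is not asserted here:

mixed_case = LLMTestCase(
    input="What is Apple's revenue?",
    actual_output="Apple's revenue is 394 billion.",
    retrieval_context=[
        "According to SEC filings, Apple's revenue is 394 billion.",  # T1 -> 1.0
        "An unverified post claims revenue doubled this year.",       # T4 -> 0.4
    ],
)

metric.measure(mixed_case)
print(metric.score)  # (1.0 + 0.4) / 2 = 0.7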
Testing

poetry run ruff check and poetry run black both run cleanly on the changed files.
poetry run pytest tests/test_trust_score_metric.py runs with a 100% pass rate.

This commit introduces a new `TrustScoreMetric` which evaluates the trustworthiness
of an LLM response based on the sources used during RAG retrieval. The metric
takes a dictionary of source strings mapped to tier values (T1-T5), and scores
the sources appropriately. It exports the new metric in the `deepeval/metrics/__init__.py`
file and provides comprehensive test cases for varying trust tiers, thresholds,
and edge cases.

vercel Bot commented Apr 22, 2026

@Danish2op is attempting to deploy a commit to the Confident AI Team on Vercel.

A member of the Team first needs to authorize it.
