@nagkumar91
Member

Fixes #43842

  • QAEvaluator now forwards is_reasoning_model to its LLM-based sub-evaluators (Groundedness/Relevance/Coherence/Fluency/Similarity), preventing max_tokens from being sent to reasoning models.
  • Added unit tests to verify propagation.

Tests:

  • python -m pytest sdk/evaluation/azure-ai-evaluation/tests/unittests/test_qa_evaluator.py -q

Copilot AI review requested due to automatic review settings January 13, 2026 18:14
@nagkumar91 nagkumar91 requested a review from a team as a code owner January 13, 2026 18:15
@github-actions github-actions bot added the Evaluation Issues related to the client library for Azure AI Evaluation label Jan 13, 2026
Contributor

Copilot AI left a comment

Pull request overview

This pull request addresses issue #43842 by having QAEvaluator forward the is_reasoning_model parameter to its LLM-based sub-evaluators. This prevents max_tokens from being sent to reasoning models (such as o1 and o3), which do not support that parameter.
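
To illustrate why the flag matters, here is a minimal sketch of the kind of parameter handling it controls. The helper name build_completion_params and the parameter values are hypothetical and not taken from the SDK; the sketch only assumes, as stated above, that max_tokens must not be sent to reasoning models.

```python
from typing import Any, Dict


def build_completion_params(base: Dict[str, Any], is_reasoning_model: bool) -> Dict[str, Any]:
    """Hypothetical helper: return request parameters suited to the model type."""
    params = dict(base)  # copy so the caller's dict is not mutated
    if is_reasoning_model:
        # Assumption from the PR overview: reasoning models do not accept
        # max_tokens, so it is removed before the request is sent.
        params.pop("max_tokens", None)
    return params


# Illustrative values only.
base_params = {"max_tokens": 800, "temperature": 0.0}
reasoning_params = build_completion_params(base_params, is_reasoning_model=True)
standard_params = build_completion_params(base_params, is_reasoning_model=False)
```

If the flag never reaches a sub-evaluator, that sub-evaluator behaves as in the standard_params case and still sends max_tokens, which is the failure described in the linked issue.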

Changes:

  • Modified QAEvaluator to extract is_reasoning_model from kwargs and pass it to all LLM-based sub-evaluators (Groundedness, Relevance, Coherence, Fluency, Similarity)
  • Added comprehensive unit tests to verify the parameter is correctly propagated to sub-evaluators
  • Added an unrelated change to set a default value for image_tag in AzureOpenAIPythonGrader
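
The forwarding described in the first two items can be sketched roughly as follows. StubSubEvaluator and SketchQAEvaluator are stand-ins written for this illustration, not the SDK classes, and reading the flag with kwargs.get and a default of False is an assumption about how the extraction is done.

```python
from typing import Any, Dict


class StubSubEvaluator:
    """Stand-in for an LLM-based sub-evaluator (not the real SDK class)."""

    def __init__(self, model_config: Dict[str, Any], *, is_reasoning_model: bool = False):
        self.model_config = model_config
        self.is_reasoning_model = is_reasoning_model


class SketchQAEvaluator:
    """Illustrative composite evaluator that forwards the flag to every sub-evaluator."""

    SUB_EVALUATOR_NAMES = ("groundedness", "relevance", "coherence", "fluency", "similarity")

    def __init__(self, model_config: Dict[str, Any], **kwargs: Any):
        # Assumption: the flag arrives via kwargs and defaults to False.
        is_reasoning_model = kwargs.get("is_reasoning_model", False)
        self.sub_evaluators = {
            name: StubSubEvaluator(model_config, is_reasoning_model=is_reasoning_model)
            for name in self.SUB_EVALUATOR_NAMES
        }


# A propagation check in the spirit of the added unit tests.
qa = SketchQAEvaluator({"azure_deployment": "example"}, is_reasoning_model=True)
flags = [sub.is_reasoning_model for sub in qa.sub_evaluators.values()]
default_qa = SketchQAEvaluator({"azure_deployment": "example"})
```

A test along these lines asserts that every sub-evaluator sees the flag when it is set, and that none does when it is omitted.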

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_qa/_qa.py | Extracts is_reasoning_model from kwargs and forwards it to all LLM-based sub-evaluators |
| sdk/evaluation/azure-ai-evaluation/tests/unittests/test_qa_evaluator.py | Adds unit tests verifying is_reasoning_model propagation to sub-evaluators |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_aoai/python_grader.py | Unrelated change: adds default value "2025-05-08" for image_tag parameter |
| sdk/evaluation/azure-ai-evaluation/tests/unittests/test_aoai_python_grader.py | Unrelated change: adds test for image_tag default value |

  pass_threshold: float,
  source: str,
- image_tag: Optional[str] = None,
+ image_tag: Optional[str] = "2025-05-08",

Copilot AI Jan 13, 2026

This change appears unrelated to the PR's main purpose of passing is_reasoning_model through QAEvaluator; the new default value for image_tag should ideally go in a separate PR. The default "2025-05-08" is also a date in the past (the current date is January 2026), so it may be outdated, or it may follow a version/tag naming convention that is not explained here.

github-actions bot commented Jan 13, 2026

API Change Check

APIView identified API level changes in this PR and created the following API reviews

azure-ai-evaluation

@nagkumar91 nagkumar91 force-pushed the qa-evaluator-reasoning-model-fix branch from 8b62a14 to d3048ce Compare January 13, 2026 22:17
Development

Successfully merging this pull request may close these issues.

QAEvaluator does not pass through is_reasoning_model