feat(language_model): add strict_mode option for structured output #50

Open
lelouar wants to merge 1 commit into SynaLinks:main from lelouar:feat/strict-mode-structured-output

Conversation


lelouar commented Apr 23, 2026

Summary

Adds an opt-out for the hardcoded "strict": true field that LanguageModel currently injects into response_format for structured-output calls.

Problem

Some OpenAI-compatible servers reject unknown fields in response_format via Pydantic's extra_forbidden validator. Notably, NVIDIA's TensorRT-LLM (trtllm-serve) returns:

BadRequestError: [{'type': 'extra_forbidden',
 'loc': ('body', 'response_format', 'strict'),
 'msg': 'Extra inputs are not permitted'}]

litellm.drop_params = True does not strip this field, so the request fails whether strict is True or False; the key itself is forbidden.

Today, users hitting such a server have no way to use LanguageModel with structured output.
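
For illustration, the payload that triggers the rejection looks roughly like this (a sketch of the assumed shape based on the OpenAI json_schema response format, not copied from the synalinks source):

# Sketch of the response_format payload as these servers see it (assumed
# shape; synalinks may build it slightly differently). trtllm-serve
# validates this object with a Pydantic model that forbids extra keys,
# so the top-level "strict" key is rejected.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "output",
        "schema": {
            "type": "object",
            "properties": {"answer": {"type": "string"}},
        },
    },
    "strict": True,  # the key the extra_forbidden validator rejects
}

With strict_mode=False, the same payload goes out without the "strict" key.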

Change

Add a strict_mode constructor argument to LanguageModel:

  • Default: True — preserves current behavior, no breaking change.
  • False — omits the strict field entirely from the response_format payload.

# Default (unchanged)
lm = synalinks.LanguageModel(model="openai/gpt-4o-mini")

# For TensorRT-LLM and similar strict-field-rejecting servers
lm = synalinks.LanguageModel(
    model="hosted_vllm/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4",
    api_base="http://my-trtllm-server/v1",
    strict_mode=False,
)

Also wired through get_config() / from_config() for serialization roundtrip.
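
A minimal sketch of that roundtrip (assuming the usual get_config() / from_config() contract and that the config key is named strict_mode):

import synalinks

lm = synalinks.LanguageModel(model="openai/gpt-4o-mini", strict_mode=False)
config = lm.get_config()  # now carries strict_mode
restored = synalinks.LanguageModel.from_config(config)
# The restored instance should omit the strict field, same as the original.
assert restored.get_config()["strict_mode"] is False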

Scope

  • Scope is limited to the strict field. No rename/alias for other servers that might use a different parameter name — that can be a separate change if needed.
  • All 5 provider branches that set strict are covered uniformly.
  • No env vars introduced; configuration stays in the constructor like all other LanguageModel options.

Tests

Added 5 unit tests in synalinks/src/language_models/language_model_test.py:

  • test_strict_mode_default_true_hosted_vllm — verifies default still emits strict: true
  • test_strict_mode_false_omits_strict_hosted_vllm
  • test_strict_mode_false_omits_strict_openai (covers the inside-json_schema variant)
  • test_strict_mode_false_omits_strict_ollama
  • test_strict_mode_roundtrip_via_get_config

All 10 tests in the file pass (5 existing + 5 new).
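
For reference, a sketch of what one of the omission tests might look like (the helper name and payload-building path are assumptions; the real tests in language_model_test.py may differ):

def test_strict_mode_false_omits_strict_hosted_vllm(self):
    lm = synalinks.LanguageModel(
        model="hosted_vllm/my-model",
        api_base="http://localhost:8000/v1",
        strict_mode=False,
    )
    # Hypothetical helper: builds the provider payload without a network call.
    response_format = lm._build_response_format(schema={"type": "object"})
    self.assertNotIn("strict", response_format)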

Test plan

  • uv run pytest synalinks/src/language_models/language_model_test.py -v passes (10/10)
  • Default behavior preserved (existing tests unchanged)
  • Verified against a live TensorRT-LLM server (trtllm-serve with Nemotron-Nano) where strict_mode=False unblocks structured output calls

🤖 Generated with Claude Code

Commit message:

Some OpenAI-compatible servers (e.g. TensorRT-LLM's `trtllm-serve`)
reject unknown fields in `response_format` via Pydantic's
`extra_forbidden` validator. When synalinks sends the hardcoded
`"strict": true` field, these servers return:

    BadRequestError: [{'type': 'extra_forbidden',
     'loc': ('body', 'response_format', 'strict'), ...}]

Add a `strict_mode` constructor argument (default True, preserves
current behavior) that, when False, omits the `strict` field entirely
from the payload sent to the provider. Covers all 5 provider branches
that currently set strict: ollama/mistral, openai/azure, gemini, xai,
hosted_vllm.

Usage:

    # Default: strict enforced (unchanged)
    lm = LanguageModel(model="openai/gpt-4o-mini")

    # For servers that don't accept the `strict` field
    lm = LanguageModel(
        model="hosted_vllm/my-model",
        api_base="http://trtllm-server/v1",
        strict_mode=False,
    )

Added 5 unit tests covering default=True, strict_mode=False for
hosted_vllm/openai/ollama, and get_config/from_config roundtrip.
YoanSallami (Contributor) commented

Are you sure that non-strict mode is not just JSON mode, which doesn't guarantee the format?

lelouar (Author) commented Apr 25, 2026

It seems to depend on the model + LLM server + parameters. I can dig into this more to draw a clearer picture. I've only tried vLLM and trtllm so far, but I can run more tests next week.
