Skip to content

OpenAI-compatible endpoints that don't strictly honor response_format crash all LLM analyzers (fenced/prose JSON) #69

@bmd1905

Description

@bmd1905

Summary

When SKILLSPECTOR_PROVIDER=openai points at an OpenAI-compatible endpoint (via OPENAI_BASE_URL) that does not strictly honor response_format: {type: json_object}, every LLM semantic analyzer fails and the scan aborts with a JSON decode error. Static/pattern analysis still runs, but all LLM-backed findings are lost.

The OpenAI-compatible surface is documented as supported ("any OpenAI-compatible URL", Ollama/vLLM/llama.cpp/"managed inference gateways"), but many such endpoints either ignore response_format or proxy a model that emits Markdown-fenced JSON or a prose report instead of a bare JSON object. langchain's with_structured_output(...) (json-mode) then receives non-JSON and raises.

Impact

  • Hard failure, not graceful degradation: the process exits non-zero (Error: 1 validation error for LLMAnalysisResult / Invalid JSON or Error: Expecting value: line 1 column 1 (char 0)).
  • Affects every analyzer that inherits LLMAnalyzerBase (per-file analyzers + the meta-analyzer), since they all rely on _structured_llm = self._llm.with_structured_output(self.response_schema).
  • Silent quality loss in the best case: users on a compatible-but-non-strict endpoint get pattern-only results while believing LLM analysis ran.

Reproduction

  1. Configure an OpenAI-compatible endpoint that does not enforce strict JSON mode:
    export SKILLSPECTOR_PROVIDER=openai
    export OPENAI_BASE_URL=https://<openai-compatible-endpoint>/v1
    export OPENAI_API_KEY=<key>
    export SKILLSPECTOR_MODEL=<model-served-there>
    skillspector scan ./some-skill/
  2. Observe the run abort during the semantic stage.

Probing such an endpoint directly shows the root cause — response_format: json_object is ignored and the body is fenced:

{"choices":[{"message":{"content":"```json\n{\"status\": \"ok\"}\n```"}}]}

Some models go further and return a full Markdown report ("# Security Analysis Report ... No findings") instead of JSON, so even fence-stripping a single block is not always enough.

Root cause

src/skillspector/llm_analyzer_base.py:

self._structured_llm = (
    self._llm.with_structured_output(self.response_schema) if self.response_schema else None
)

This delegates parsing to the provider/langchain json-mode path, which assumes the endpoint returns a clean JSON object. OpenAI-compatible endpoints that don't enforce that contract break it.

Suggested fix

Make structured parsing tolerant of OpenAI-compatible endpoints that don't strictly honor response_format:

  1. Inject an explicit, schema-derived "respond with a single JSON object only, no prose, no code fences" instruction into the prompt (so behavior is driven by the prompt, not solely by response_format).
  2. Before validating, strip a wrapping ```json ... ``` fence and, as a fallback, slice the outermost {...}/[...] when prose surrounds the payload.
  3. Then model_validate the recovered JSON against the existing Pydantic schema.

This keeps strict behavior for endpoints that already return clean JSON while recovering the large class of compatible-but-lenient endpoints. A working local patch wrapping with_structured_output with a fence-tolerant validator (plus a prompt-level JSON directive) resolved the failure across all analyzers in my testing; happy to open a PR.

Environment

  • SkillSpector v2.1.4
  • Provider: openai against a non-OpenAI, OpenAI-compatible endpoint (OPENAI_BASE_URL)
  • Failure reproduced on both per-file analyzers and the meta-analyzer

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions