feat(linter): add format_type_mismatch rule to detect invalid format usage #2338

Open
Vaibhav701161 wants to merge 2 commits into sourcemeta:main from Vaibhav701161:feat/linter-format-type-mismatch

@Vaibhav701161
Contributor

Summary

This PR introduces a new lint rule: format_type_mismatch.

The rule detects schemas where the format keyword is used alongside a non-string type.

{
  "type": "integer",
  "format": "email"
}

In JSON Schema, the format attributes defined by the specification (such as email) apply to string instances only. Combining one of them with a non-string type has no effect and is typically an authoring mistake.

This rule highlights such cases to improve schema correctness and clarity.

The rule emits a diagnostic when:

  • format exists
  • type exists
  • type is a single string value
  • type is not "string"

The diagnostic points to the format keyword location.

This rule is not auto-fixable, because modifying either type or format would require assumptions about the author’s intent.


Implementation

The rule follows the existing alterschema linter architecture:

  • implemented as a header-only rule in
    src/extension/alterschema/linter/format_type_mismatch.h
  • registered in alterschema.cc
  • added to the SOURCES list in the alterschema CMake configuration

The rule checks:

  • schema node is an object
  • format is defined and is a string
  • type is defined and is a string
  • type != "string"

If all of these conditions are met, the rule returns:

APPLIES_TO_KEYWORDS("format")
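
As a rough illustration of the checks listed above, here is a minimal standalone sketch of the predicate. It does not use the actual sourcemeta JSON or rule types: `SchemaNode` and `format_type_mismatch` are hypothetical stand-ins that model only the two keywords the rule inspects, so the real header will look different.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// Hypothetical, simplified stand-in for a schema object (an assumption for
// this sketch, not the sourcemeta JSON type). Only "type" and "format" are
// modeled.
struct SchemaNode {
  // "type" may be absent, a single type name, or an array of type names
  std::optional<std::variant<std::string, std::vector<std::string>>> type;
  // "format" may be absent or a string
  std::optional<std::string> format;
};

// Returns true when the conditions described in this PR hold:
// "format" exists, "type" exists, "type" is a single string, and that
// string is not "string"
bool format_type_mismatch(const SchemaNode &schema) {
  if (!schema.format.has_value() || !schema.type.has_value()) {
    return false;
  }

  // An array-valued "type" yields a null pointer here, so it is skipped
  const auto *single = std::get_if<std::string>(&schema.type.value());
  return single != nullptr && *single != "string";
}
```

With this model, `{"type": "integer", "format": "email"}` is flagged, while a schema whose type is `"string"`, is an array, or is missing is left alone.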

Design Notes

This rule complements the existing non_applicable_type_specific_keywords rule.

While the existing rule detects keywords that are not applicable to a given type, this rule provides a more explicit and focused diagnostic specifically for misuse of the format keyword.

This improves clarity for users by directly pointing out incorrect format usage rather than reporting it as a generic type incompatibility.

The rule intentionally skips cases where type is an array (e.g., ["string", "integer"]), since format may still apply when the instance is a string.

The rule applies across all supported drafts where both type and format are valid keywords.


Tests

Tests were added across all supported dialects using the existing lint testing utilities:

  • Draft 3, Draft 4, Draft 6, Draft 7
  • Draft 2019-09
  • Draft 2020-12

The following scenarios are covered:

  • Rule fires when format is used with non-string types

  • Rule does not fire when type is "string"

  • Rule does not fire when format is missing

  • Rule does not fire when type is missing

  • Rule does not fire when type is an array including "string"

  • Rule fires inside nested subschemas (properties, items, etc.)

  • Rule correctly handles $ref:

    • pointing to affected subschemas
    • pointing inside affected subschemas

Tests are structured consistently with existing linter rule tests and account for interactions with other rules where applicable.
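
To make the nested-subschema scenario concrete, the following sketch walks a toy schema tree and collects a JSON Pointer to each offending format keyword. The `SchemaNode` model, `collect_mismatches`, and `lint` are hypothetical names invented for this illustration. The real tests rely on the existing lint testing utilities, which are not reproduced here.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified schema model (not the sourcemeta JSON type).
// Only "type", "format" and "properties" are represented, and "type" is
// limited to its single-string form.
struct SchemaNode {
  std::string key; // property name under the parent; unused for the root
  std::optional<std::string> type;
  std::optional<std::string> format;
  std::vector<SchemaNode> properties;
};

// Appends a pointer to every "format" keyword that sits next to a single
// non-"string" type, descending into "properties".
// Note: property names are not escaped ("~" and "/"), which a real JSON
// Pointer implementation would need to do.
void collect_mismatches(const SchemaNode &schema, const std::string &pointer,
                        std::vector<std::string> &out) {
  if (schema.format.has_value() && schema.type.has_value() &&
      schema.type.value() != "string") {
    out.push_back(pointer + "/format");
  }

  for (const auto &child : schema.properties) {
    collect_mismatches(child, pointer + "/properties/" + child.key, out);
  }
}

// Convenience wrapper that starts at the root with an empty pointer
std::vector<std::string> lint(const SchemaNode &root) {
  std::vector<std::string> out;
  collect_mismatches(root, "", out);
  return out;
}
```

For a root object with an `age` property declared as `{"type": "integer", "format": "email"}`, this sketch reports `/properties/age/format`, mirroring the statement that the diagnostic points to the format keyword location.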


Related Work

Refs: #1975

@cubic-dev-ai cubic-dev-ai bot left a comment

No issues found across 9 files

@augmentcode

augmentcode bot commented Apr 8, 2026

🤖 Augment PR Summary

Summary: Adds a new AlterSchema linter rule, format_type_mismatch, to catch schemas that specify a non-string type while also using format.

Changes:

  • Implemented a new header-only linter rule (src/extension/alterschema/linter/format_type_mismatch.h) that flags format usage when type is a single string value other than "string".
  • Registered the rule in the AlterSchema linter bundle (src/extension/alterschema/alterschema.cc).
  • Added the new header to the AlterSchema CMake sources list (src/extension/alterschema/CMakeLists.txt).
  • Added test coverage across multiple dialects (Draft 3/4/6/7, 2019-09, 2020-12), including nested subschemas and $ref scenarios.

Technical Notes: The rule is non-mutating (no auto-fix) and reports the issue at the keyword locations returned via APPLIES_TO_KEYWORDS.

@augmentcode augmentcode bot left a comment

Review completed. 2 suggestions posted.

- apply clang-format to all modified files
- fix test expectations to align with existing rule interactions
- ensure full CI pipeline passes locally

Signed-off-by: Vaibhav mittal <[email protected]>
@Vaibhav701161
Contributor Author

@jviotti, kindly review

@jviotti
Member

jviotti commented Apr 10, 2026

Nice one! Though note that we recently moved the alterschema module down to Blaze: https://github.com/sourcemeta/blaze/tree/main/src/alterschema. Do you mind sending it over there instead?
