
Conversation

@qlfyd123 commented Jan 26, 2026

https://hibernate.atlassian.net/browse/HHH-20092

Added a descriptor to DdlTypeRegistry to map json to nvarchar.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license
and can be relicensed under the terms of the LGPL v2.1 license in the future at the maintainers' discretion.
For more information on licensing, please check here.



Please make sure that the following tasks are completed:
Tasks specific to HHH-20092 (Bug):

  • Add test reproducing the bug
  • Add entries as relevant to migration-guide.adoc OR check there are no breaking changes

@mbellade (Member) left a comment


Thanks @qlfyd123, left a couple comments on how to approach this.

Comment on lines 1278 to 1283
@Override
public boolean equivalentTypes(int typeCode1, int typeCode2) {
return typeCode1 == Types.NVARCHAR && typeCode2 == SqlTypes.JSON
|| typeCode1 == SqlTypes.JSON && typeCode2 == Types.NVARCHAR
|| super.equivalentTypes( typeCode1, typeCode2 );
}
@mbellade (Member)

This is better, though I believe this issue might not be specific to SQL Server. Could you please check whether it happens with other DBs? The "problem" might be that we always default to non-nationalized varchar columns for SqlTypes.JSON, whereas we should respect the hibernate.use_nationalized_character_data config property.
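For reference, a minimal sketch of enabling that setting programmatically (class name is hypothetical; the property can equally be set in hibernate.properties):

    import org.hibernate.cfg.AvailableSettings;
    import org.hibernate.cfg.Configuration;

    public class NationalizedJsonExample {
        public static void main(String[] args) {
            // "hibernate.use_nationalized_character_data": when enabled, character
            // column types should render as their nationalized variants (nvarchar,
            // nclob) on dialects that distinguish them.
            Configuration configuration = new Configuration()
                    .setProperty( AvailableSettings.USE_NATIONALIZED_CHARACTER_DATA, "true" );
            // build the SessionFactory from `configuration` as usual
        }
    }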

@mbellade (Member)

It seems to me that it's better to change org.hibernate.tool.schema.internal.ColumnDefinitions#hasMatchingType to use column.getType().getJdbcType().getDdlTypeCode() instead of column.getSqlTypeCode( metadata ), since we're explicitly trying to check the DDL type code as reported by the database here.
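A rough sketch of that suggestion (a hypothetical fragment, not the actual method body; ColumnInformation#getTypeCode is assumed to be the accessor for the type code reported by the database):

    // Compare the DDL type code Hibernate would emit against the type code
    // reported by the database, instead of the logical SQL type code.
    final int ddlTypeCode = column.getType().getJdbcType().getDdlTypeCode();
    final boolean typesMatch = dialect.equivalentTypes( ddlTypeCode, columnInformation.getTypeCode() );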

@qlfyd123 (Author) commented Jan 28, 2026

I have investigated how JSON types are handled across various databases and found a specific issue with SQL Server regarding the use_nationalized_character_data configuration.

Here is the current behavior observed for different databases:

Database     DDL type emitted by Hibernate (SqlTypes code)
MariaDB      json (SqlTypes#JSON)
MySQL        json (SqlTypes#JSON)
Oracle       json (SqlTypes#JSON)
PostgreSQL   jsonb (SqlTypes#JSON)
SQL Server   varchar(max) (SqlTypes#LONG32VARCHAR)

For most databases (MySQL, PostgreSQL, etc.), the dialect explicitly registers a descriptor for SqlTypes.JSON (e.g., ddlTypeRegistry.addDescriptor(new DdlTypeImpl(JSON, "json", this))), so they consistently use the native JSON type.

However, SQLServerDialect does not explicitly register a descriptor for SqlTypes.JSON. As a result, it falls back to the default registration in MetadataBuildingProcess.java:

   // MetadataBuildingProcess.java
    jdbcTypeRegistry.addDescriptorIfAbsent( JsonAsStringJdbcType.VARCHAR_INSTANCE );

The JsonAsStringJdbcType.VARCHAR_INSTANCE is defined as:

    public static final JsonAsStringJdbcType VARCHAR_INSTANCE = new JsonAsStringJdbcType( SqlTypes.LONG32VARCHAR, null );

And its constructor sets nationalized to false:

protected JsonAsStringJdbcType(int ddlTypeCode, EmbeddableMappingType embeddableMappingType) {
    super( embeddableMappingType );
    this.ddlTypeCode = ddlTypeCode;
    // evaluates to false for LONG32VARCHAR
    this.nationalized = ddlTypeCode == SqlTypes.LONG32NVARCHAR || ddlTypeCode == SqlTypes.NCLOB;
}

This means that for SQL Server, SqlTypes.JSON is mapped to a type that uses LONG32VARCHAR (mapped to varchar(max)) and has nationalized = false by default.

I've tried using the column.getType().getJdbcType().getDdlTypeCode() method instead, but it returned the same result: type code 4001 (LONG32VARCHAR).

I think when users apply @JdbcTypeCode(SqlTypes.JSON), Hibernate uses this registered instance directly, bypassing the adjustment logic that would normally check the use_nationalized_character_data setting. Consequently, the setting is ignored, and varchar(max) is always used instead of nvarchar(max).
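For illustration, a minimal mapping that hits this path (entity and property names are hypothetical):

    import jakarta.persistence.Entity;
    import jakarta.persistence.Id;
    import org.hibernate.annotations.JdbcTypeCode;
    import org.hibernate.type.SqlTypes;

    // Hypothetical mapping: the explicit JDBC type code makes Hibernate resolve
    // JsonAsStringJdbcType.VARCHAR_INSTANCE directly on SQL Server, so the column
    // DDL comes out as varchar(max) even with
    // hibernate.use_nationalized_character_data enabled.
    @Entity
    public class JsonDocument {
        @Id
        Long id;

        @JdbcTypeCode( SqlTypes.JSON )
        String payload;
    }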

@qlfyd123 (Author) commented
@mbellade
I’ve re-examined the issue and identified the root cause: the Dialect was missing a descriptor registration, causing Hibernate to always interpret SQL Server’s JSON as a VARCHAR type. Consequently, I have removed the previous logic and added the appropriate descriptor to the SQL Server Dialect.
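Following the registration pattern quoted earlier for other dialects, such a descriptor might look roughly like this (a sketch of the idea; the exact DDL name and arguments in the actual commit may differ):

    // Inside the SQL Server dialect's DDL type registrations (hypothetical):
    ddlTypeRegistry.addDescriptor( new DdlTypeImpl( SqlTypes.JSON, "nvarchar(max)", this ) );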

@qlfyd123 requested a review from mbellade on January 29, 2026 02:06
typeContributions.contributeJdbcType( SQLServerCastingXmlJdbcType.INSTANCE );
typeContributions.contributeJdbcType( UUIDJdbcType.INSTANCE );
typeContributions.contributeJdbcTypeConstructor( SQLServerCastingXmlArrayJdbcTypeConstructor.INSTANCE );
typeContributions.contributeJdbcType( JsonAsStringJdbcType.VARCHAR_INSTANCE );
@mbellade (Member)

Isn't this the default behavior? If the dialect doesn't explicitly register a JSON JDBC type, this should be added in org.hibernate.boot.model.process.spi.MetadataBuildingProcess#handleTypes.

@qlfyd123 (Author)

@mbellade
I apologize, I think I made a few mistakes earlier. After re-checking the logic, I found that in the InferredBasicValueResolver#from method:

// explicitJavaType is null
// explicitJdbcType is not null
// reflectedJtd is not null

Due to these conditions, only the following code block is actually executed:

else if ( explicitJdbcType != null ) {
    // we also have an explicit JdbcType
    jdbcMapping = basicTypeRegistry.resolve( reflectedJtd, explicitJdbcType );
}

The problem seems to be that in this specific code block, the resolveIndicatedType method implemented in JsonAsStringJdbcType is not being executed. Consequently, the value of JsonAsStringJdbcType.VARCHAR_INSTANCE, which was registered during the MetadataBuildingProcess, is being used as-is.
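A hypothetical sketch of what letting the explicit JdbcType pass through the adjustment might look like in that branch (names follow the snippets above; stdIndicators stands for the JdbcTypeIndicators in scope, and exact signatures are assumptions):

    else if ( explicitJdbcType != null ) {
        // we also have an explicit JdbcType; hypothetically let adjustable types
        // react to indicators (e.g. nationalization) before resolving the mapping
        JdbcType jdbcType = explicitJdbcType;
        if ( jdbcType instanceof AdjustableJdbcType ) {
            jdbcType = ( (AdjustableJdbcType) jdbcType ).resolveIndicatedType( stdIndicators, reflectedJtd );
        }
        jdbcMapping = basicTypeRegistry.resolve( reflectedJtd, jdbcType );
    }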

@qlfyd123 (Author) commented Feb 3, 2026

Hi @mbellade.
I introduced logic to apply AdjustableJdbcType adaptation even when an explicit JdbcType is provided in InferredBasicValueResolver.
The primary goal was to ensure that JDBC types correctly adapt to metadata like @Nationalized or the global use_nationalized_character_data property.
However, this change causes a regression in LengthTest#testSqlType:

  • Cause: LongVarcharJdbcType (used for LONG32VARCHAR) implements AdjustableJdbcType. When resolveIndicatedType is called without an explicit length (using the default Length.LONG), it downgrades the type code from LONG32VARCHAR to the standard LONGVARCHAR.
  • Result: the H2 dialect maps LONGVARCHAR to VARCHAR(32600) instead of CLOB, causing a "Value too long" exception when inserting large data. Before this change, the explicit LONG32VARCHAR was preserved, leading to a correct CLOB mapping.

I would appreciate guidance on the correct design direction:

  • Is it the intended behavior to apply AdjustableJdbcType logic to explicitly defined JDBC types?
  • If we should NOT apply adaptation logic to explicit types, how should we handle JSON on SQL Server? Should we treat it like other DBs and use a single fixed type (e.g., always NVARCHAR(MAX)) regardless of the nationalized attribute, effectively ignoring the attribute for explicit JSON types (see the sketch below)?
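As a concrete illustration of that second option: it could be as small as contributing the nationalized JSON-as-string instance in the dialect, assuming JsonAsStringJdbcType exposes an NVARCHAR_INSTANCE counterpart to VARCHAR_INSTANCE (an assumption; sketched here for discussion, not proposed as the fix):

    // Hypothetical: always use the nationalized JSON-as-string type on SQL
    // Server, ignoring the nationalized attribute for explicit JSON types.
    typeContributions.contributeJdbcType( JsonAsStringJdbcType.NVARCHAR_INSTANCE );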

I'm happy to adjust the implementation based on your feedback. For now, I've pushed the changes to facilitate discussion despite the test failure.

