Allow avro_schema_url property alongside partitioning#27490
Draft
denodo-research-labs wants to merge 1 commit into prestodb:master
… property alongside partitioning.
Contributor
Reviewer's Guide
Adjusts Hive AVRO table validation so avro_schema_url only restricts bucketing, adds a new product test suite for partitioned Avro tables using external schemas, and updates existing tests and error messages accordingly.
Sequence diagram for CREATE TABLE AVRO with avro_schema_url and partitioning/bucketing
sequenceDiagram
actor User
participant PrestoCoordinator
participant HiveMetadata
participant HiveMetastore
User->>PrestoCoordinator: CREATE TABLE ... format=AVRO, partitioned_by, avro_schema_url
PrestoCoordinator->>HiveMetadata: prepareTable(session, tableMetadata)
HiveMetadata->>HiveMetadata: getPartitionedBy(properties)
HiveMetadata->>HiveMetadata: getBucketProperty(properties)
HiveMetadata->>HiveMetadata: getAvroSchemaUrl(properties)
alt bucketProperty present AND avro_schema_url not null
HiveMetadata-->>PrestoCoordinator: throw PrestoException(NOT_SUPPORTED, bucketing not supported)
PrestoCoordinator-->>User: error Bucketing columns not supported when Avro schema url is set
else only partitioned_by present with avro_schema_url
HiveMetadata->>HiveMetastore: createTable(table)
HiveMetastore-->>PrestoCoordinator: success
PrestoCoordinator-->>User: table created successfully
end
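The branch in the sequence diagram above can be sketched as a small standalone Java check. This is a minimal illustration under stated assumptions, not the actual HiveMetadata code: the class AvroSchemaUrlCheck, the validate method, the NotSupportedException stand-in for PrestoException(NOT_SUPPORTED, ...), and the example schema URL are all hypothetical. Only the error message text is taken from the diagram.

```java
import java.util.List;
import java.util.Optional;

public class AvroSchemaUrlCheck {
    /** Hypothetical stand-in for PrestoException(NOT_SUPPORTED, ...); not the real Presto class. */
    static class NotSupportedException extends RuntimeException {
        NotSupportedException(String message) {
            super(message);
        }
    }

    /**
     * Sketch of the relaxed validation: an Avro schema URL is rejected only
     * when bucketing is requested. Partition columns are deliberately ignored.
     */
    static void validate(List<String> partitionedBy, Optional<List<String>> bucketedBy, String avroSchemaUrl) {
        if (avroSchemaUrl != null && bucketedBy.isPresent()) {
            throw new NotSupportedException("Bucketing columns not supported when Avro schema url is set");
        }
        // partitionedBy is intentionally not checked here.
    }

    public static void main(String[] args) {
        String url = "hdfs://example/schema.avsc"; // hypothetical URL for illustration

        // Partitioned table with a schema URL: passes the check.
        validate(List.of("ds"), Optional.empty(), url);
        System.out.println("partitioned + schema url: accepted");

        // Bucketed table with a schema URL: still rejected.
        try {
            validate(List.of(), Optional.of(List.of("id")), url);
        }
        catch (NotSupportedException e) {
            System.out.println("bucketed + schema url: " + e.getMessage());
        }
    }
}
```

In this sketch the partition column list is passed in but never consulted, which mirrors the "else" branch of the diagram where table creation proceeds to the metastore.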
Updated class diagram for HiveMetadata validation and Avro partitioned tests
classDiagram
class HiveMetadata {
- Table prepareTable(ConnectorSession session, ConnectorTableMetadata tableMetadata)
- List~String~ getPartitionedBy(Map~String,Object~ properties)
- Optional~HiveBucketProperty~ getBucketProperty(Map~String,Object~ properties)
- String getAvroSchemaUrl(Map~String,Object~ properties)
}
class HiveBucketProperty {
}
class TestAvroPartitioned {
+ void testCreatePartitionedAvroTableWithSchemaUrl()
+ void testBucketedAvroTableWithSchemaUrlFails()
}
TestAvroPartitioned ..> HiveMetadata : uses
HiveMetadata o--> HiveBucketProperty
steveburnett (Contributor) approved these changes on Apr 2, 2026 and left a comment:
LGTM! (docs)
Pull branch, local doc build, looks good. Thanks!
Description
This PR enables the creation of Hive tables in AVRO format using the avro_schema_url property in conjunction with partitioning.
Previously, providing an external Avro schema URL blocked the use of partitioning. This change updates the validation logic to allow partitioned_by columns to coexist with an external schema URL.
Example:
Motivation and Context
The prepareTable method in HiveMetadata threw a NOT_SUPPORTED error if either bucketing or partitioning was present with an Avro schema URL.
Impact
If a user attempts to create a partitioned Hive table using an external Avro schema URL, the operation fails with a PrestoException. This PR fixes the validation logic in HiveMetadata to allow partitioning while still restricting bucketing.
Calling CREATE TABLE with both partitioned_by and avro_schema_url currently throws:
com.facebook.presto.common.type.PrestoException: Bucketing/Partitioning columns not supported when Avro schema url is set
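To make the difference between the old and new restriction concrete, the two conditions can be written as plain boolean predicates. This is a hedged sketch based only on the description in this PR: the class AvroSchemaUrlRule and the methods rejectedBefore and rejectedAfter are hypothetical names, and the real check in HiveMetadata works on table properties rather than booleans.

```java
public class AvroSchemaUrlRule {
    /** Old rule as described: a schema URL conflicts with bucketing or partitioning. */
    static boolean rejectedBefore(boolean hasSchemaUrl, boolean partitioned, boolean bucketed) {
        return hasSchemaUrl && (partitioned || bucketed);
    }

    /** New rule as described: a schema URL conflicts with bucketing only. */
    static boolean rejectedAfter(boolean hasSchemaUrl, boolean partitioned, boolean bucketed) {
        return hasSchemaUrl && bucketed;
    }

    public static void main(String[] args) {
        boolean[] flags = {false, true};
        // Enumerate the four partitioned/bucketed combinations with a schema URL set.
        for (boolean partitioned : flags) {
            for (boolean bucketed : flags) {
                System.out.printf(
                        "partitioned=%b bucketed=%b rejectedBefore=%b rejectedAfter=%b%n",
                        partitioned,
                        bucketed,
                        rejectedBefore(true, partitioned, bucketed),
                        rejectedAfter(true, partitioned, bucketed));
            }
        }
    }
}
```

The only combination whose outcome changes is a table that is partitioned but not bucketed: it is rejected under the old predicate and accepted under the new one.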
This change will allow users to create partitioned Hive tables using an external Avro schema URL. The logic in HiveMetadata will be updated to specifically target bucketing for the restriction, enabling partitioning support.
Before:
After:
Test Plan
Verified the fix by:
… avro_schema_url and confirming it no longer throws NOT_SUPPORTED.
… avro_schema_url still correctly throws the expected PrestoException.
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.