[Bug]: 3.0-dev lacks LOAD DATA and parquet compatibility fixes #24188

@LeftHandCold

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Branch Name

3.0-dev

Commit ID

d1e78d8

Other Environment Information

- Hardware parameters: N/A (branch-compare audit)
- OS type: Darwin
- Others:
  - Compared `base/main` vs `base/3.0-dev`
  - Audit window: 2025-10-01 .. 2026-04-24

Actual Behavior

`3.0-dev` lacks several user-visible ingestion compatibility fixes that are already on `main`. The main gaps are: the JSON cast fix for LOAD DATA, the change that disables the incorrect parallel decompression path, AUTO_INCREMENT zero handling in ingestion paths, session-timezone-aware TIMESTAMP parsing during LOAD DATA, support for parquet basic and simple logical types, and the parquet DECIMAL loading fix for `FixedLenByteArray`.

Expected Behavior

3.0-dev should match main for user-visible LOAD DATA and parquet semantics.

Steps to Reproduce

1. Compare `base/main` and `base/3.0-dev` for PRs `#23825`, `#23620`, `#23588`, `#22929`, `#22714`, `#23567`.
2. Verify that the relevant markers are missing on `3.0-dev`:
   - `git grep -nE 'json to JSON|unsupported cast from json to JSON' base/3.0-dev -- pkg/sql/plan/function/func_cast.go` -> no match
   - `git cat-file -e base/3.0-dev:pkg/sql/colexec/external/external_timestamp_timezone_test.go` -> missing
   - `git cat-file -e base/3.0-dev:test/distributed/cases/load_data/binary_decimal_conversion.sql` -> missing
3. Exercise representative cases on `3.0-dev`, for example:
   - `LOAD DATA` with JSON-typed columns
   - compressed input paths that previously used the wrong parallel decompression path
   - parquet files carrying decimal/date/datetime/timestamp logical types
   - parquet DECIMAL encoded as `FixedLenByteArray`
   - `LOAD DATA` for TIMESTAMP values under a non-default session timezone
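
The marker checks in step 2 can be bundled into one small script. This is a sketch only: the helper names (`path_on_ref`, `pattern_on_ref`, `report`, `audit_markers`) are made up for illustration, and it assumes it is run inside a clone where the `base/3.0-dev` ref has been fetched. The paths and the pattern are copied from step 2.

```shell
#!/usr/bin/env bash
# Sketch: report whether the fix markers from step 2 exist on a given ref.
# Helper names are hypothetical; run from inside the repository clone.

# Succeeds if PATH ($2) exists as an object on REF ($1).
path_on_ref() {
  git cat-file -e "$1:$2" 2>/dev/null
}

# Succeeds if the extended regex PATTERN ($2) matches in PATH ($3) on REF ($1).
pattern_on_ref() {
  git grep -qE "$2" "$1" -- "$3" 2>/dev/null
}

# Prints "present" or "missing" for LABEL ($1) given an exit STATUS ($2).
report() {
  if [ "$2" -eq 0 ]; then echo "present  $1"; else echo "missing  $1"; fi
}

audit_markers() {
  local ref="$1"
  pattern_on_ref "$ref" 'json to JSON|unsupported cast from json to JSON' \
    pkg/sql/plan/function/func_cast.go
  report "json cast marker in func_cast.go" $?
  path_on_ref "$ref" pkg/sql/colexec/external/external_timestamp_timezone_test.go
  report "external_timestamp_timezone_test.go" $?
  path_on_ref "$ref" test/distributed/cases/load_data/binary_decimal_conversion.sql
  report "binary_decimal_conversion.sql" $?
}

# Only run the audit when a ref is passed on the command line.
if [ "$#" -gt 0 ]; then audit_markers "$1"; fi
```

Saved under any file name and invoked with `base/3.0-dev` as its only argument, it prints one `present` or `missing` line per marker. Running it against `base/main` as well gives a quick side-by-side comparison.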

Additional information

Main-only reference commits:

  • 01330e70f8 fix load data report 'unsupported cast from json to JSON' (#23825)
  • 1ac80028b6 disable wrong parallel decompression for LOAD DATA (#23620)
  • 3dce54ae78 Fix AUTO_INCREMENT zero handling and tests (#23588)
  • 72791e25a1 Fix: LOAD DATA should use session timezone for TIMESTAMP parsing main (#22929)
  • d1c1369521 supported all parquet basic data type and simple logic data type. (#22714)
  • 5638e46e5a Fix Parquet DECIMAL type loading from FixedLenByteArray (#23567)
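
A rough way to see which of these commits have reached a branch is to search its log for the PR numbers. The sketch below uses hypothetical helper names (`pr_on_ref`, `audit_prs`) and assumes that a commit keeps the `(#NNNN)` suffix shown in the titles above. A backport merged under a different PR number would be reported as missing, so treat the output as a hint rather than proof.

```shell
#!/usr/bin/env bash
# Sketch: look for the reference PR numbers in a branch's commit messages.
# Helper names are hypothetical; run from inside the repository clone.

# PR numbers copied from the reference-commit list in this issue.
prs="23825 23620 23588 22929 22714 23567"

# Succeeds if some commit message reachable from REF ($1) contains "(#PR)" ($2).
pr_on_ref() {
  [ -n "$(git log -1 --oneline --fixed-strings --grep="(#$2)" "$1" 2>/dev/null)" ]
}

audit_prs() {
  local ref="$1" pr
  for pr in $prs; do
    if pr_on_ref "$ref" "$pr"; then
      echo "present  #$pr"
    else
      echo "missing  #$pr"
    fi
  done
}

# Only run the audit when a ref is passed on the command line.
if [ "$#" -gt 0 ]; then audit_prs "$1"; fi
```

Invoked with `base/3.0-dev` as its only argument, it prints one line per PR. A cherry-pick that preserves the original title is detected even though its hash differs from the `main` commit.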

Core SQL type-system parity is tracked separately in #24184.

Labels

kind/bug
