feat(ffi): expose clustering domain metadata read path #2573

Open
chiinlquah wants to merge 2 commits into delta-io:main from chiinlquah:chiinlquah/clustering-domain-ffi

@chiinlquah
Collaborator

Add `get_clustering_domain_metadata`, an FFI symbol that returns the JSON configuration of the table's `delta.clustering` domain (or NULL when absent). It mirrors `get_domain_metadata` but routes through `Snapshot::get_domain_metadata_internal`, bypassing the user-facing `delta.*`-prefix guard.

Motivation: writers produce a `delta.clustering` DomainMetadata when creating a clustered table, but the existing FFI `get_domain_metadata` rejects reads on `delta.*`-prefixed domains. As a result, there is no FFI-exposed way to read back what an FFI-exposed writer wrote, and external engines that compose their own resolved clustering descriptors can write the domain but never recover it.

The internal Rust path (`Snapshot::get_physical_clustering_columns`) already does the right thing for in-process use; this change gives external engines the same access.

Tests:

- `test_get_clustering_domain_metadata` -- clustered table; verifies the new symbol succeeds while `get_domain_metadata` still rejects the read.
- `test_get_clustering_domain_metadata_absent` -- unclustered table; verifies the NULL return.

`cargo test -p delta_kernel_ffi --lib`: 142/142 pass.
`cargo clippy -p delta_kernel_ffi --all-targets -- -D warnings`: clean.
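
For reviewers less familiar with the prefix guard, the relationship between the two read paths can be sketched with a small, self-contained model. This is not the kernel's code: the `Snapshot` struct here is a hypothetical stand-in backed by a `HashMap`, and the error string and JSON payload are illustrative assumptions. Only the method names and the `delta.clustering` domain name are taken from this PR.

```rust
use std::collections::HashMap;

/// Domain name as given in this PR.
const CLUSTERING_DOMAIN_NAME: &str = "delta.clustering";

/// Hypothetical stand-in for a snapshot: domain name -> JSON configuration.
struct Snapshot {
    domains: HashMap<String, String>,
}

impl Snapshot {
    /// User-facing read: refuses any domain under the `delta.` prefix.
    /// (The error text is illustrative, not the kernel's message.)
    fn get_domain_metadata(&self, domain: &str) -> Result<Option<&str>, String> {
        if domain.starts_with("delta.") {
            return Err(format!("cannot read system domain '{domain}'"));
        }
        Ok(self.get_domain_metadata_internal(domain))
    }

    /// Internal read: same lookup, without the prefix guard.
    fn get_domain_metadata_internal(&self, domain: &str) -> Option<&str> {
        self.domains.get(domain).map(String::as_str)
    }

    /// Dedicated read for the clustering domain; `None` models the NULL return.
    fn get_clustering_domain_metadata(&self) -> Option<&str> {
        self.get_domain_metadata_internal(CLUSTERING_DOMAIN_NAME)
    }
}

/// Snapshot of a clustered table. The payload is a made-up example value.
fn clustered_snapshot() -> Snapshot {
    let mut domains = HashMap::new();
    domains.insert(
        CLUSTERING_DOMAIN_NAME.to_string(),
        r#"{"clusteringColumns":[["id"]]}"#.to_string(),
    );
    Snapshot { domains }
}

/// Snapshot of a table with no domain metadata at all.
fn unclustered_snapshot() -> Snapshot {
    Snapshot { domains: HashMap::new() }
}

fn main() {
    let snap = clustered_snapshot();
    // The guarded path still rejects the system domain.
    println!("{:?}", snap.get_domain_metadata(CLUSTERING_DOMAIN_NAME));
    // The dedicated path returns the stored configuration.
    println!("{:?}", snap.get_clustering_domain_metadata());
    // An unclustered table yields None.
    println!("{:?}", unclustered_snapshot().get_clustering_domain_metadata());
}
```

The point of the sketch is that the new symbol does not relax the generic guard: arbitrary `delta.*` reads stay rejected, and only the one named domain gets a dedicated read path.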

@codecov
codecov Bot commented May 16, 2026

Codecov Report

❌ Patch coverage is 99.41520% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 88.75%. Comparing base (1ca53fc) to head (b628509).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ffi/src/domain_metadata.rs | 99.37% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2573      +/-   ##
==========================================
+ Coverage   88.66%   88.75%   +0.08%     
==========================================
  Files         181      182       +1     
  Lines       61488    62184     +696     
  Branches    61488    62184     +696     
==========================================
+ Hits        54520    55190     +670     
+ Misses       4874     4854      -20     
- Partials     2094     2140      +46     

- Gate clustering domain read on the ClusteredTable writer feature, matching
  Snapshot::get_physical_clustering_columns semantics.
- Centralize the constant: kernel exposes CLUSTERING_DOMAIN_NAME via the
  internal-api feature and a new Snapshot::get_clustering_domain_metadata
  method; FFI delegates to it.
- Tighten the public doc comment, drop the verbatim error-string quote, and
  note physical-vs-logical column-name semantics.
- Restructure tests as an rstest with cases for multi-column, nested path,
  tombstoned, latest-write-wins, and stale-domain-without-feature.
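
The cases named in the list above (tombstoned, latest-write-wins, stale-domain-without-feature) can be illustrated with a simplified replay model. This is a sketch under stated assumptions, not the kernel's implementation: `DomainMetadataAction`, `resolve_domain`, and `clustering_domain` are hypothetical names, the feature check is reduced to a boolean, and returning `None` (rather than an error) for a stale domain without the feature is an assumption.

```rust
/// Hypothetical, simplified domain-metadata action as seen during log replay.
#[derive(Clone, Debug)]
struct DomainMetadataAction {
    domain: String,
    configuration: String,
    removed: bool,
}

/// Convenience constructor for a `delta.clustering` action.
fn clustering_action(configuration: &str, removed: bool) -> DomainMetadataAction {
    DomainMetadataAction {
        domain: "delta.clustering".to_string(),
        configuration: configuration.to_string(),
        removed,
    }
}

/// Resolve a domain from actions ordered oldest to newest.
/// The newest action for the domain wins; a tombstone (removed = true)
/// means the domain is absent.
fn resolve_domain<'a>(actions: &'a [DomainMetadataAction], domain: &str) -> Option<&'a str> {
    actions
        .iter()
        .rev()
        .find(|a| a.domain == domain)
        .filter(|a| !a.removed)
        .map(|a| a.configuration.as_str())
}

/// Gated read: the clustering domain is only reported when the table
/// has the clustered-table writer feature enabled (modelled as a bool).
fn clustering_domain<'a>(
    actions: &'a [DomainMetadataAction],
    clustered_table_feature_enabled: bool,
) -> Option<&'a str> {
    if !clustered_table_feature_enabled {
        // Assumption: a stale domain without the feature is treated as absent.
        return None;
    }
    resolve_domain(actions, "delta.clustering")
}

fn main() {
    let log = vec![clustering_action("v1", false), clustering_action("v2", false)];
    println!("{:?}", clustering_domain(&log, true)); // latest write wins
    println!("{:?}", clustering_domain(&log, false)); // gated off by the feature check
}
```

Keeping the feature check in front of the lookup means a leftover `delta.clustering` entry on a table that no longer declares the feature is never surfaced to the engine, which matches the gating described in the first bullet.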