Skip to content

[c++] Add MAP/ROW types, unify complex types under Value handle#611

Open
fresh-borzoni wants to merge 1 commit into
apache:mainfrom
fresh-borzoni:feat/cpp-nested-map-row
Open

[c++] Add MAP/ROW types, unify complex types under Value handle#611
fresh-borzoni wants to merge 1 commit into
apache:mainfrom
fresh-borzoni:feat/cpp-nested-map-row

Conversation

@fresh-borzoni

@fresh-borzoni fresh-borzoni commented Jun 7, 2026

Copy link
Copy Markdown
Member

closes #604
closes #605

Adds MAP and ROW column support to the C++ client, bringing it in line with the Rust core and Python.

The substantive change is on the read side. ARRAY reads went through a per-type accessor family hung off each result type, and extending that to MAP and ROW would have meant a per-type cross-product across three result containers (scan, lookup, prefix lookup). Instead, any complex column is now read through a single recursive Value handle: navigate with At / KeyAt / ValueAt / Field, read leaves with the Get* methods. That replaced the old ArrayView/MapView/RowValue types everywhere and nests to any depth.

For schema and writes, you declare nested columns with DataType::Map / DataType::Row and build values with the existing writers - no Arrow in user code. Schema::FromArrow stays as an escape hatch for callers that already have an arrow::Schema, internally the native types are lowered to Arrow when the table is created.
The zero-copy scan path and the Arrow bulk path are left as-is.

  • examples are left as a follow-up

@fresh-borzoni fresh-borzoni marked this pull request as draft June 7, 2026 16:04
@fresh-borzoni fresh-borzoni force-pushed the feat/cpp-nested-map-row branch 4 times, most recently from c385226 to ca881ca Compare June 7, 2026 23:33
@fresh-borzoni fresh-borzoni force-pushed the feat/cpp-nested-map-row branch from ca881ca to c816666 Compare June 8, 2026 01:31
@fresh-borzoni fresh-borzoni marked this pull request as ready for review June 8, 2026 01:43
@fresh-borzoni

Copy link
Copy Markdown
Member Author

@leekeiabstraction leekeiabstraction left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TY for the PR. Reviewing this in parts; will continue the rest tomorrow. Left a clarification question.

for (const auto& col : descriptor.schema.columns) {
if (detail::is_compound(col.data_type)) {
arrow_schema = detail::columns_to_arrow_schema(descriptor.schema.columns);
break;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to understand why we break here, what if multiple columns are of row/map type?

@fresh-borzoni fresh-borzoni Jun 9, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's just a detection loop here, "is any datatype is complex/nested here in schema?", once the answer is -yes, we can short-circuit.

@leekeiabstraction leekeiabstraction left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR. Left some comments / questions.

ASSERT_OK(adm.DropTable(table_path, false));
}

TEST_F(KvTableTest, LookupWithValueHandle) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a lot of the checks in LookupWithValueHandle and LookupComplexTypesMatrix are similar. Is it worth refactoring to minimise code to maintain?

ASSERT_OK(adm.DropTable(table_path, false));
}

TEST_F(LogTableTest, ArrayWriterOverflowDetection) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need MapWriterOverflowDetection test case as well?

Comment on lines +246 to +251
arrow::field("m_str_int", arrow::map(arrow::utf8(), arrow::int32())),
arrow::field("m_str_row", arrow::map(arrow::utf8(), row_seq_label)),
arrow::field("m_str_map",
arrow::map(arrow::utf8(), arrow::map(arrow::utf8(), arrow::int32()))),
arrow::field("m_str_arr", arrow::map(arrow::utf8(), arrow::list(arrow::int32()))),
arrow::field("arr_map", arrow::list(arrow::map(arrow::utf8(), arrow::int32()))),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have coverage for small and tiny ints?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[C++] Row data type support [c++] Map data type support

2 participants