
[SPARK-57418][DOCS] Add singleVariantColumn option to CSV, JSON, and XML data source options tables #56469

Open
brijrajk wants to merge 1 commit into
apache:master from
brijrajk:SPARK-57418-singlevariantcolumn-docs

What changes were proposed in this pull request?

Added the missing singleVariantColumn option to the Data Source Options tables in:

  • docs/sql-data-sources-csv.md
  • docs/sql-data-sources-json.md
  • docs/sql-data-sources-xml.md

The option was introduced in Spark 4.1.0 (SPARK-51298 for CSV, also supported for JSON and XML) but was never documented in the reference tables. It is defined as a shared constant in DataSourceOptions.scala and consumed by CSVOptions, JSONOptions, and XmlOptions.

Why are the changes needed?

Users currently cannot discover singleVariantColumn from the official data source options reference. The option ingests each CSV/JSON/XML record as a single VariantType column instead of parsing it into individual fields, which is a key use case for the Variant type introduced in Spark 4.0.
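For reviewers who have not used the option, here is a minimal sketch of how a reader might set it. It is illustrative only and not part of this patch: the helper name `read_as_single_variant`, the default column name `v`, and the paths are hypothetical, and `spark` is assumed to be an existing SparkSession. Only the standard `DataFrameReader` calls `format`, `option`, and `load` are used.

```python
from typing import Any

# Option key as named in this PR; the formats are those listed in the description.
SINGLE_VARIANT_COLUMN = "singleVariantColumn"
SUPPORTED_FORMATS = ("csv", "json", "xml")


def read_as_single_variant(
    spark: Any,
    fmt: str,
    path: str,
    column_name: str = "v",  # hypothetical default, chosen for illustration
    **extra_options: Any,
) -> Any:
    """Load `path` with the singleVariantColumn option set to `column_name`.

    `spark` is assumed to be a SparkSession (or anything exposing `.read`).
    Format-specific options (for example a row tag for XML) can be passed
    through `extra_options`; whether they are needed depends on the format.
    """
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    reader = spark.read.format(fmt).option(SINGLE_VARIANT_COLUMN, column_name)
    for key, value in extra_options.items():
        reader = reader.option(key, value)
    return reader.load(path)
```

Under these assumptions, a call such as `read_as_single_variant(spark, "json", "events.json", "payload")` would be expected to return a DataFrame whose only column, `payload`, is of Variant type; the exact schema should be checked against the Spark version in use.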

Does this PR introduce any user-facing change?

No. Documentation only.

How was this patch tested?

No code change; documentation only. Verified that the option is defined in DataSourceOptions.scala (line 78) and referenced in CSVOptions.scala (line 338), JSONOptions.scala (line 215), and XmlOptions.scala (line 194).

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude (Anthropic)
