
BUG: Fix HDFStore.put with StringDtype columns and compression (#64180) #64569

Open
alubbock wants to merge 1 commit into pandas-dev:main from alubbock:fix/hdfstore-stringdtype-compression

Conversation

@alubbock
Contributor

In GenericFixed.write_array, when compression filters are set, Atom.from_dtype(value.dtype) is called to determine the HDF5 atom type. StringDtype is an ExtensionDtype and lacks the .base attribute that Atom.from_dtype expects, so the call raises AttributeError.

Route BaseStringArray to the VLArray path before the Atom.from_dtype logic runs, and pass self._filters through to create_vlarray so that compression is still applied.
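
To illustrate the reordering, here is a minimal pure-Python sketch of the branching. It is not the pandas or PyTables code: every class and function name below is a hypothetical stand-in, the "before" ordering is a simplified reading of the description above, and the returned tuples only label which storage path would be taken and which filters would be passed along.

```python
# Simplified model of the dispatch order. All names are hypothetical stand-ins.

class StringDtypeStandIn:
    """Models an extension dtype: deliberately has no `.base` attribute."""


class NumpyDtypeStandIn:
    """Models a NumPy-style dtype, which exposes `.base`."""

    def __init__(self, name):
        self.name = name
        self.base = self


class StringArrayStandIn:
    """Plays the role of a string extension array."""

    dtype = StringDtypeStandIn()


class NumericArrayStandIn:
    """Plays the role of a plain numeric array."""

    dtype = NumpyDtypeStandIn("float64")


def atom_from_dtype(dtype):
    # Mimics the failure mode: reads `.base`, which the string dtype lacks.
    return f"Atom({dtype.base.name})"


def write_array_before(value, filters):
    # Assumed old order: the filters branch runs first and hits atom_from_dtype.
    if filters is not None:
        return ("carray", atom_from_dtype(value.dtype), filters)
    if isinstance(value, StringArrayStandIn):
        return ("vlarray", "ObjectAtom", None)
    return ("array", None, None)


def write_array_after(value, filters):
    # New order: string arrays are routed to the VLArray path first,
    # and the filters are forwarded so compression still applies.
    if isinstance(value, StringArrayStandIn):
        return ("vlarray", "ObjectAtom", filters)
    if filters is not None:
        return ("carray", atom_from_dtype(value.dtype), filters)
    return ("array", None, None)
```

In this model, write_array_before(StringArrayStandIn(), "zlib") raises AttributeError, while write_array_after takes the VLArray path and keeps the filters; numeric arrays behave the same in both versions.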

AI-assisted (Claude Code), but essentially it's just a reorganisation of the branching logic. A test has been added, and I've verified that the fix works with the example in #64180.



Development

Successfully merging this pull request may close these issues.

BUG: to_hdf on dataframe with string column failing with compression

1 participant