PERF: fix slow repr for Series/DataFrame with third-party array-like objects #64638
Open
jbrockmendel wants to merge 1 commit into pandas-dev:main
…objects

Replace the duck-type is_sequence() check in pprint_thing() with an explicit isinstance allowlist. Previously, any object with __iter__ and __len__ (e.g. xarray DataArray) would be recursively iterated element-by-element via _pprint_seq, causing ~20,000 expensive repr calls for a single DataArray. Now only types that _pprint_seq was designed to handle are iterated; everything else uses str() directly.

Closes pandas-dev#61809

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
32000d5 to 4db0b6c
Summary
- Replace the duck-type `is_sequence()` check in `pprint_thing()` with an explicit `isinstance` allowlist of types that `_pprint_seq` was designed to handle.
- Previously, any object with `__iter__` and `__len__` (e.g. xarray `DataArray`) would be recursively iterated element-by-element, causing ~20,000 expensive repr calls for a single large DataArray stored in an object-dtype column.
- Everything else now goes through `str()`, using the object's own repr.

Closes #61809
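The difference between the two checks can be illustrated with a small standalone sketch. This is not the pandas implementation: `looks_like_sequence`, `ALLOWED_SEQ_TYPES`, `format_duck`, `format_allowlist`, and `ExpensiveArray` are hypothetical names invented for illustration, and the allowlist contents are an assumption rather than the set of types the PR actually uses.

```python
# Hypothetical allowlist; the types chosen in the actual PR are not shown here.
ALLOWED_SEQ_TYPES = (list, tuple, set, frozenset, range)


def looks_like_sequence(obj) -> bool:
    """Duck-type check: iterable, has a length, and is not a string."""
    try:
        iter(obj)
        len(obj)
    except TypeError:
        return False
    return not isinstance(obj, (str, bytes))


class _Element:
    """Element whose repr is counted, standing in for an expensive repr."""

    def __init__(self, owner, i):
        self.owner = owner
        self.i = i

    def __repr__(self):
        self.owner.element_reprs += 1
        return f"e{self.i}"


class ExpensiveArray:
    """Stand-in for a third-party array-like (iterable with a length)."""

    def __init__(self, n):
        self._n = n
        self.element_reprs = 0

    def __len__(self):
        return self._n

    def __iter__(self):
        return (_Element(self, i) for i in range(self._n))

    def __repr__(self):
        return f"<ExpensiveArray n={self._n}>"


def format_duck(obj) -> str:
    """Recurse into anything that merely looks like a sequence."""
    if looks_like_sequence(obj):
        return "[" + ", ".join(format_duck(x) for x in obj) + "]"
    return str(obj)


def format_allowlist(obj) -> str:
    """Recurse only into explicitly allowed types; otherwise use str()."""
    if isinstance(obj, ALLOWED_SEQ_TYPES):
        return "[" + ", ".join(format_allowlist(x) for x in obj) + "]"
    return str(obj)


arr = ExpensiveArray(1000)
format_duck(arr)
duck_calls = arr.element_reprs       # one repr call per element

arr.element_reprs = 0
allow_text = format_allowlist(arr)   # uses the array's own repr
allow_calls = arr.element_reprs      # no per-element repr calls
```

With the duck-type check, formatting cost scales with the number of elements; with the allowlist, an unknown array-like costs a single `str()` call, while plain lists and tuples are still expanded as before.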
Test plan
- `test_printing.py`, `test_format.py`, `test_formats.py`, `test_repr.py`, `test_groupby.py`, `test_sorting.py`, `test_to_string.py` all pass

🤖 Generated with Claude Code