Skip to content

[SPARK-57504][CORE] Avoid a potential NullPointerException in IndexShuffleBlockResolver.getChecksums when the checksum file cannot be opened#56564

Open
LuciferYang wants to merge 1 commit into
apache:masterfrom
LuciferYang:core-getChecksums-npe-fix
Open

[SPARK-57504][CORE] Avoid a potential NullPointerException in IndexShuffleBlockResolver.getChecksums when the checksum file cannot be opened#56564
LuciferYang wants to merge 1 commit into
apache:masterfrom
LuciferYang:core-getChecksums-npe-fix

Conversation

@LuciferYang

@LuciferYang LuciferYang commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

IndexShuffleBlockResolver.getChecksums opens the checksum file with var in: DataInputStream = null inside a try and closes it in finally { in.close() }. The method already catches IOException/EOFException to return null, so it is meant to degrade gracefully when the file cannot be read. But the finally is not null-safe: if the failure originates in the stream constructor (before in is assigned), it dereferences null and throws a NullPointerException that masks the original error.

The two sibling methods in the same file already handle constructor failure correctly: checkIndexAndDataFile constructs the stream in its own try/catch, and getMergedBlockData uses Utils.tryWithResource. This change makes getChecksums consistent by using Utils.tryWithResource, which evaluates the constructor before its own try, so a constructor IOException reaches the existing catch and the method returns null as intended.

Why are the changes needed?

getChecksums has a latent NullPointerException: the non-null-safe finally { in.close() } would mask the real error if opening the checksum file fails at construction time, which is inconsistent with the rest of the file. No common code path is known to trigger it today; this is a robustness and consistency improvement that removes the latent NPE and aligns the method with its siblings.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added a unit test in IndexShuffleBlockResolverSuite that forces the stream constructor to fail (a checksum file whose exists() returns true but which cannot be opened) and asserts getChecksums returns null instead of throwing. The test fails on the old code (NPE) and passes with the fix.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.8)

…uffleBlockResolver.getChecksums when the checksum file cannot be opened

### What changes were proposed in this pull request?

`IndexShuffleBlockResolver.getChecksums` opens the checksum file with
`var in: DataInputStream = null` inside a `try` and closes it in
`finally { in.close() }`. The method already catches `IOException`/`EOFException`
to return `null`, so it is meant to degrade gracefully when the file cannot be
read. But the `finally` is not null-safe: if the failure originates in the stream
constructor (before `in` is assigned), it dereferences `null` and throws a
`NullPointerException` that masks the original error.

The two sibling methods in the same file already handle constructor failure
correctly: `checkIndexAndDataFile` constructs the stream in its own `try/catch`,
and `getMergedBlockData` uses `Utils.tryWithResource`. This change makes
`getChecksums` consistent by using `Utils.tryWithResource`, which evaluates the
constructor before its own `try`, so a constructor `IOException` reaches the
existing `catch` and the method returns `null` as intended.

### Why are the changes needed?

`getChecksums` has a latent `NullPointerException`: the non-null-safe
`finally { in.close() }` would mask the real error if opening the checksum file
fails at construction time, which is inconsistent with the rest of the file. No
common code path is known to trigger it today; this is a robustness and
consistency improvement that removes the latent NPE and aligns the method with
its siblings.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added a unit test in `IndexShuffleBlockResolverSuite` that forces the stream
constructor to fail (a checksum file whose `exists()` returns true but which
cannot be opened) and asserts `getChecksums` returns `null` instead of throwing.
The test fails on the old code (NPE) and passes with the fix.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant