fix(cbz-reader): CBZ reading regression and optimize ZIP parsing strategy by balazs-szucs · Pull Request #2593 · booklore-app/booklore

balazs-szucs · 2026-02-03T13:15:30Z

🚀 Pull Request

📝 Description

Closes: #2589

Fixed a regression in Booklore 1.18.5 where some CBZ files could not be opened due to strict parsing of Unicode Extra Fields in the updated commons-compress library.

🛠️ Changes Implemented

CbxReaderService: If standard ZIP parsing (with Unicode Extra Fields) fails, the system now automatically retries reading the archive without Unicode Extra Fields. This ensures compatibility with archives containing malformed extra fields.
Optimized Encoding Iteration: Refactored the scanning logic to iterate through all parameter combinations (Fast/Slow path × Unicode Enabled/Disabled) per encoding before moving to the next encoding. This is significantly more efficient than the previous strategy of trying all encodings with one setting and then restarting.
CachedArchiveMetadata: to store the successful useUnicodeExtraFields setting along with the encoding, ensuring that subsequent file access (like streaming pages) uses the proven settings immediately.

🧪 Testing Strategy

📸 Visual Changes (if applicable)

⚠️ Required Pre-Submission Checklist

Please Read - This Checklist is Mandatory

Important Notice: We've experienced several production bugs recently due to incomplete pre-submission checks. To maintain code quality and prevent issues from reaching production, we're enforcing stricter adherence to this checklist.

All checkboxes below must be completed before requesting review. PRs that haven't completed these requirements will be sent back for completion.

Mandatory Requirements (please check ALL boxes):

Code adheres to project style guidelines and conventions
Branch synchronized with latest develop branch (please resolve any merge conflicts)
🚨 CRITICAL: Automated unit tests added/updated to cover changes (MANDATORY for ALL Spring Boot backend and Angular frontend changes - this is non-negotiable)
🚨 CRITICAL: All tests pass locally (run ./gradlew test for Spring Boot backend, and ng test for Angular frontend - NO EXCEPTIONS)
🚨 CRITICAL: Manual testing completed in local development environment (verify your changes work AND no existing functionality is broken - test related features thoroughly)
Flyway migration versioning follows correct sequence (if database schema was modified)
Documentation PR submitted to booklore-docs (required for features or enhancements that introduce user-facing or visual changes)

Why This Matters:

Recent production incidents have been traced back to:

Incomplete testing coverage (especially backend)
Merge conflicts not resolved before merge
Missing documentation for new features

Backend changes without tests will not be accepted. By completing this checklist thoroughly, you're helping maintain the quality and stability of Booklore for all users.

Note to Reviewers: Please verify the checklist is complete before beginning your review. If items are unchecked, kindly ask the contributor to complete them first.

💬 Additional Context (optional)

…nd fallback mechanisms Signed-off-by: Balázs Szücs <[email protected]>

Copilot

Pull request overview

Fixes a CBZ reading regression introduced by stricter Unicode Extra Field parsing in commons-compress by adding a fallback ZIP parsing mode (disabling Unicode Extra Fields), optimizing the ZIP strategy iteration order, and caching the successful ZIP settings for subsequent reads/streams.

Changes:

Add ZIP fallback logic to retry without Unicode Extra Fields when standard parsing fails.
Refactor ZIP strategy iteration to try (Fast/Slow × Unicode on/off) per encoding for better efficiency.
Extend cached archive metadata to remember the successful useUnicodeExtraFields setting; add a unit test covering the fallback path.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File	Description
booklore-api/src/main/java/org/booklore/service/reader/CbxReaderService.java	Adds Unicode Extra Fields fallback, strategy iteration refactor, and caches `useUnicodeExtraFields` for streaming.
booklore-api/src/test/java/org/booklore/service/reader/CbxReaderServiceTest.java	Adds a test validating the CBZ fallback behavior when Unicode Extra Fields parsing fails.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-06T20:14:21Z

booklore-api/src/main/java/org/booklore/service/reader/CbxReaderService.java

+        } catch (Exception e) {
+             log.trace("Stream strategy failed ({}, Fast={}, Unicode={}): {}", charset, useFastPath, useUnicode, e.getMessage());


tryStreamEntry catches all Exception, including IOException from reading the ZIP entry or writing to outputStream (e.g., client disconnect). Swallowing those and continuing can lead to misleading errors (eventually “Unable to find entry…”) and potential repeated/partial writes. Consider only catching the ZIP parsing/charset-related exceptions you expect here, and let IOException propagate (or rethrow it).

Suggested change

} catch (Exception e) {

log.trace("Stream strategy failed ({}, Fast={}, Unicode={}): {}", charset, useFastPath, useUnicode, e.getMessage());

} catch (IOException e) {

// I/O problems (e.g., client disconnect, read/write failures) should abort the operation

throw e;

} catch (Exception e) {

// Non-I/O problems (e.g., charset/ZIP parsing issues) mean this strategy failed; try the next one

log.trace("Stream strategy failed ({}, Fast={}, Unicode={}): {}", charset, useFastPath, useUnicode, e.getMessage());

Copilot · 2026-02-06T20:14:22Z

booklore-api/src/main/java/org/booklore/service/reader/CbxReaderService.java

+                    }
+
+        for (Charset charset : ENCODINGS_TO_TRY) {
+            if (charset.equals(encoding)) continue; // Skip failed cached encoding if we want, or retry it with different flags? 


The inline comment on skipping the cached encoding (“Skip failed cached encoding if we want, or retry it with different flags?”) reads like an unresolved design note. Please either remove it or replace it with a concrete TODO linked to an issue so it doesn’t linger in production code.

Suggested change

if (charset.equals(encoding)) continue; // Skip failed cached encoding if we want, or retry it with different flags?

if (charset.equals(encoding)) continue; // Skip cached encoding when trying alternative charsets

Copilot · 2026-02-06T20:14:22Z

booklore-api/src/main/java/org/booklore/service/reader/CbxReaderService.java

+
        throw new IOException("Unable to read ZIP archive with any supported encoding");
    }


The final throw new IOException("Unable to read ZIP archive with any supported encoding") drops the underlying/root exception that caused all strategies to fail. This makes production failures hard to diagnose (especially since per-strategy logs are trace). Consider keeping the last thrown exception and attaching it as the cause of this IOException (or including its type/message).

Copilot · 2026-02-06T20:14:22Z

booklore-api/src/main/java/org/booklore/service/reader/CbxReaderService.java

+        if (encoding != null) {
+            if (tryStreamEntry(cbxPath, entryName, outputStream, encoding, true, useUnicode)) return;
+            if (tryStreamEntry(cbxPath, entryName, outputStream, encoding, false, useUnicode)) return;
+                    }


Line 290 has a visibly over-indented closing brace (}), which looks accidental and hurts readability (and can trip style checks if enforced). Please fix the indentation to match the surrounding block.

Suggested change

}

}

feat(cbx-reader): enhance ZIP archive handling with Unicode support a…

43218bc

…nd fallback mechanisms Signed-off-by: Balázs Szücs <[email protected]>

balazs-szucs changed the title ~~fix(unzip): CBZ reading regression and optimize ZIP parsing strategy~~ fix(cbz-reader): CBZ reading regression and optimize ZIP parsing strategy Feb 3, 2026

balazs-szucs added 2 commits February 5, 2026 22:23

Merge branch 'develop' into cbx-unicode

82e74a2

Merge branch 'develop' into cbx-unicode

0080d16

acx10 requested a review from Copilot February 6, 2026 20:10

Copilot started reviewing on behalf of acx10 February 6, 2026 20:10 View session

Copilot AI reviewed Feb 6, 2026

View reviewed changes

balazs-szucs added 2 commits February 7, 2026 09:05

Merge branch 'develop' into cbx-unicode

beffcb2

Merge branch 'develop' into cbx-unicode

34c8dd5

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(cbz-reader): CBZ reading regression and optimize ZIP parsing strategy#2593

fix(cbz-reader): CBZ reading regression and optimize ZIP parsing strategy#2593
balazs-szucs wants to merge 5 commits intobooklore-app:developfrom
balazs-szucs:cbx-unicode

balazs-szucs commented Feb 3, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Feb 6, 2026

Uh oh!

Copilot AI Feb 6, 2026

Uh oh!

Copilot AI Feb 6, 2026

Uh oh!

Copilot AI Feb 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		} catch (Exception e) {
		log.trace("Stream strategy failed ({}, Fast={}, Unicode={}): {}", charset, useFastPath, useUnicode, e.getMessage());

	if (charset.equals(encoding)) continue; // Skip failed cached encoding if we want, or retry it with different flags?
	if (charset.equals(encoding)) continue; // Skip cached encoding when trying alternative charsets


		throw new IOException("Unable to read ZIP archive with any supported encoding");
		}

Uh oh!

Conversation

balazs-szucs commented Feb 3, 2026

🚀 Pull Request

📝 Description

🛠️ Changes Implemented

🧪 Testing Strategy

📸 Visual Changes (if applicable)

⚠️ Required Pre-Submission Checklist

Please Read - This Checklist is Mandatory

Mandatory Requirements (please check ALL boxes):

Why This Matters:

💬 Additional Context (optional)

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant