fix: preserve CRLF line endings in TextFileHandler.read()#438
fix: preserve CRLF line endings in TextFileHandler.read()#438cogitoworks wants to merge 3 commits into
Conversation
|
CodeAnt AI is reviewing your PR. Thanks for using CodeAnt! 🎉We're free for open-source projects. if you're enjoying it, help us grow by sharing. Share on X · |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughText read logic now detects original line-ending style and trailing-newline presence, preserves CRLF/CR vs LF when returning slices, and propagates EOF metadata; a new test suite verifies preservation across reads, edits, and roundtrip write flows. Changes
Sequence Diagram(s)(Skipped — changes do not introduce a multi-component sequential flow warranting a diagram.) Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| const buffer = Buffer.alloc(8192); | ||
| const { bytesRead } = await fd.read(buffer, 0, 8192, 0); | ||
| const sample = buffer.toString('utf-8', 0, bytesRead); | ||
| return detectLineEnding(sample); |
There was a problem hiding this comment.
Suggestion: Line-ending detection only inspects the first 8192 bytes, so files whose first newline appears after that window will be misclassified (falling back to platform default). In those cases, CRLF files can still be returned as LF and then written back with the wrong endings. Read until a newline is found (or EOF) instead of a fixed-size sample before deciding the style. [logic error]
Severity Level: Major ⚠️
- ❌ read_file returns LF for some CRLF text files.
- ⚠️ write_file persists LF, losing original CRLF formatting.Steps of Reproduction ✅
1. Start the DesktopCommander MCP server so the `read_file` tool is available (tool
definition for `"read_file"` is registered in `src/server.ts:281-346`, and tool dispatch
routes `"read_file"` calls to `handleReadFile` at `src/server.ts:1370-1373`).
2. On a non-Windows host (so `detectLineEnding` defaults to `'\n'` when no newline is
found, per `src/utils/lineEndingHandler.ts:10-24`), create a text file under an allowed
directory whose contents use CRLF (`'\r\n'`) line endings but whose first newline does not
occur until after 8,192 bytes (e.g., >8KB of characters followed by the first `'\r\n'`).
3. From an MCP client, call the `read_file` tool on that path; the server dispatches to
`handleReadFile` (`src/handlers/filesystem-handlers.ts:72-198`), which calls
`readFile(parsed.path, options)` (`src/tools/filesystem.ts:534-542`), which in turn calls
`readFileFromDisk` for local files (`src/tools/filesystem.ts:392-525`). Inside
`readFileFromDisk`, the code obtains a handler via `getFileHandler(validPath)`
(`src/tools/filesystem.ts:468-470`, `src/utils/files/factory.ts:76-103`) and calls
`handler.read(validPath, { offset, length, ... })`, reaching `TextFileHandler.read`
(`src/utils/files/text.ts:54-71`).
4. In `TextFileHandler.read`, `detectFileLineEnding(filePath)` runs first
(`src/utils/files/text.ts:59-63`), implemented at `src/utils/files/text.ts:149-155` where
only the first 8,192 bytes are read into `sample` and passed to
`detectLineEnding(sample)`. Because the first `'\r\n'` occurs after byte 8,192, `sample`
contains no `'\r'` or `'\n'`, so `detectLineEnding` falls back to the platform default
(`'\n'` on non-Windows, `src/utils/lineEndingHandler.ts:10-24`), misclassifying this CRLF
file as LF-only. The subsequent readline-based reading (`readFileWithSmartPositioning`
delegates to `readFromStartWithReadline` / `readFromEstimatedPosition`,
`src/utils/files/text.ts:233-268, 366-399, 405-471`) strips original `'\r\n'` endings and
rejoins lines with `'\n'`, and because `lineEnding !== '\n'` is false, the normalization
branch is skipped (`src/utils/files/text.ts:64-69`). The `handleReadFile` happy-path
response (`src/handlers/filesystem-handlers.ts:167-183`) therefore returns text whose line
endings are LF-only even though the on-disk file uses CRLF. If the client then writes this
content back via the `write_file` tool, `handleWriteFile`
(`src/handlers/filesystem-handlers.ts:266-305`) calls `writeFile`
(`src/tools/filesystem.ts:592-615`), which delegates to `TextFileHandler.write`
(`src/utils/files/text.ts:74-79`) and persists the LF-only string, permanently converting
the file from CRLF to LF due to the initial mis-detected line-ending style.Fix in Cursor | Fix in VSCode Claude
(Use Cmd/Ctrl + Click for best experience)
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** src/utils/files/text.ts
**Line:** 152:155
**Comment:**
*Logic Error: Line-ending detection only inspects the first 8192 bytes, so files whose first newline appears after that window will be misclassified (falling back to platform default). In those cases, CRLF files can still be returned as LF and then written back with the wrong endings. Read until a newline is found (or EOF) instead of a fixed-size sample before deciding the style.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix|
CodeAnt AI finished reviewing your PR. |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (3)
test/test-crlf-preservation.js (2)
1-9: Nit: header says "on Windows" but the suite is cross-platform.The CRLF preservation behavior is verified the same way regardless of host OS (the fixtures explicitly write
\r\n). The header comment is slightly misleading — consider dropping "on Windows" so future readers don't think it's Windows-only or that it gets skipped on Linux/macOS CI.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/test-crlf-preservation.js` around lines 1 - 9, The header comment in test/test-crlf-preservation.js misleadingly says "on Windows" even though the test is cross-platform; update the top comment to remove "on Windows" (e.g., change "Test: read_file and edit_block preserve CRLF line endings on Windows" to "Test: read_file and edit_block preserve CRLF line endings" or otherwise indicate cross-platform behavior) so readers won't assume the test is Windows-only; keep the rest of the explanatory text about the bug and fixtures intact.
181-200: Roundtrip test should assert byte-level equivalence, not just CRLF presence.The current assertions (
includes('\r\n'),includes('Line two MODIFIED'), no bare\n) all pass even though the file's trailing\r\nis dropped on the readline path (see comment ontext.tslines 59-72). For a regression test of issue#97, please assert the exact expected bytes:♻️ Stronger assertion
const rawAfter = await fs.readFile(CRLF_FILE, 'utf8'); + const expected = 'Line one\r\nLine two MODIFIED\r\nLine three\r\nLine four\r\nLine five\r\n'; + assert.strictEqual(rawAfter, expected, 'Roundtrip must preserve CRLF byte-for-byte, including trailing newline'); - assert.ok( - rawAfter.includes('\r\n'), - 'File must still contain CRLF after roundtrip' - ); - assert.ok( - rawAfter.includes('Line two MODIFIED'), - 'Modification should be present' - );If the stricter check fails today, that's a real preservation gap users would hit — the test should surface it rather than mask it.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/test-crlf-preservation.js` around lines 181 - 200, The test must assert byte-level equality instead of string inclusion: stop reading the file with 'utf8' (rawAfter) and instead read raw bytes (Buffer) via fs.readFile(CRLF_FILE) and compare Buffer.equals(...) against an expected Buffer that contains the exact bytes for the modified file (i.e., original CRLF bytes with "Line two MODIFIED" and the trailing '\r\n' preserved). Update the assertions to use Buffer.equals and a constructed expectedBuffer (or Buffer.from(<expected string>, 'utf8') if building from literals) so the test fails if any CRLF is dropped (referencing CRLF_FILE and rawAfter/expectedBuffer).src/utils/files/text.ts (1)
149-159: Optional: fold detection into the existing read pass to avoid double I/O.Every call to
read()now performs an additionalfs.open+ 8KB read on top of the subsequent readline pass, doubling syscalls/I/O for small files on a hot path (the read-file tool is frequently invoked). SincereadFileWithSmartPositioningalready reads from the start in the common case, the line ending could be sniffed from the first chunk it reads (or by reading the full small-file contents once and splitting in-process), eliminating the extra open.Not a blocker — current implementation is correct — but worth considering if read latency matters.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/files/text.ts` around lines 149 - 159, The current detectFileLineEnding opens and reads 8KB separately causing double I/O; instead fold line-ending sniffing into the existing read pass in readFileWithSmartPositioning by reusing the initial buffer/string it already reads (or by making detectFileLineEnding accept a sample string/Buffer). Modify readFileWithSmartPositioning to call detectLineEnding(sample) using the first chunk it obtains (or pass the full small-file contents when read once) and remove the extra fs.open/fs.read in detectFileLineEnding (or deprecate it), ensuring we still return a LineEndingStyle from detectLineEnding and keep fallback behavior unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/utils/files/text.ts`:
- Around line 146-159: detectFileLineEnding currently reads a fixed 8KB buffer
and calls detectLineEnding which can misdetect files whose first line exceeds
that sample; change detectFileLineEnding to read in a loop (using the existing
fd and fd.read) appending chunks until you find a '\r' or '\n' or hit a larger
sensible cap (e.g., 64KB) or EOF, then call detectLineEnding on the accumulated
sample, ensuring fd.close() in finally; alternatively remove this helper and
derive line endings during the existing readline pass to avoid the extra
open/read entirely — update references to detectFileLineEnding and keep
detectLineEnding unchanged.
- Around line 59-72: readFileWithSmartPositioning +
detectFileLineEnding/normalizeLineEndings currently preserve inter-line endings
but lose a trailing terminator because readline strips the final line ending and
result is built with result.join('\n'); fix by detecting whether the original
file ended with a newline (e.g., read the last byte of the file or use fs.read
to inspect the final byte in the same code path that calls detectFileLineEnding)
and, after you normalize inter-line endings (normalizeLineEndings) and before
returning from the public read method, if the file originally ended with a
newline append the detected lineEnding to result.content so trailing CRLF/CR is
restored on read→modify→write roundtrips.
In `@test/test-crlf-preservation.js`:
- Around line 41-45: Teardown's call to configManager.updateConfig performs a
shallow merge so it won't remove keys like allowedDirectories; modify teardown
to explicitly restore the prior allowedDirectories value by passing an update
that sets allowedDirectories to originalConfig.allowedDirectories (or
null/undefined if it was absent) when calling configManager.updateConfig, or
replace the entire config via a method that overwrites instead of merges
(reference teardown, configManager.updateConfig, allowedDirectories, and
TEST_DIR).
---
Nitpick comments:
In `@src/utils/files/text.ts`:
- Around line 149-159: The current detectFileLineEnding opens and reads 8KB
separately causing double I/O; instead fold line-ending sniffing into the
existing read pass in readFileWithSmartPositioning by reusing the initial
buffer/string it already reads (or by making detectFileLineEnding accept a
sample string/Buffer). Modify readFileWithSmartPositioning to call
detectLineEnding(sample) using the first chunk it obtains (or pass the full
small-file contents when read once) and remove the extra fs.open/fs.read in
detectFileLineEnding (or deprecate it), ensuring we still return a
LineEndingStyle from detectLineEnding and keep fallback behavior unchanged.
In `@test/test-crlf-preservation.js`:
- Around line 1-9: The header comment in test/test-crlf-preservation.js
misleadingly says "on Windows" even though the test is cross-platform; update
the top comment to remove "on Windows" (e.g., change "Test: read_file and
edit_block preserve CRLF line endings on Windows" to "Test: read_file and
edit_block preserve CRLF line endings" or otherwise indicate cross-platform
behavior) so readers won't assume the test is Windows-only; keep the rest of the
explanatory text about the bug and fixtures intact.
- Around line 181-200: The test must assert byte-level equality instead of
string inclusion: stop reading the file with 'utf8' (rawAfter) and instead read
raw bytes (Buffer) via fs.readFile(CRLF_FILE) and compare Buffer.equals(...)
against an expected Buffer that contains the exact bytes for the modified file
(i.e., original CRLF bytes with "Line two MODIFIED" and the trailing '\r\n'
preserved). Update the assertions to use Buffer.equals and a constructed
expectedBuffer (or Buffer.from(<expected string>, 'utf8') if building from
literals) so the test fails if any CRLF is dropped (referencing CRLF_FILE and
rawAfter/expectedBuffer).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9f9dbec0-d93d-41e6-9a9f-311fb6833260
📒 Files selected for processing (2)
src/utils/files/text.tstest/test-crlf-preservation.js
9975cbf to
0486a76
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/utils/files/text.ts`:
- Around line 67-75: The patch must ensure we append the original lineEnding
when the read slice did not reach EOF, not just when the whole file lacks a
trailing newline: modify readFileWithSmartPositioning to return (or compute) a
reachedEOF / isLastSlice boolean (or expose totalLines so callers can compute
readLines === totalLines - offset), then in the code that currently checks
endsWithNewline use that new flag — if the slice did NOT reach EOF, always
append lineEnding to result.content (fall back to the existing endsWithNewline
check when reachedEOF is true); update references to normalizeLineEndings,
result.content and the trailing-newline logic accordingly so partial reads
preserve the terminator from the source.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fb26329f-1ea6-43b6-b30c-e473bf6c34e6
📒 Files selected for processing (2)
src/utils/files/text.tstest/test-crlf-preservation.js
🚧 Files skipped from review as they are similar to previous changes (1)
- test/test-crlf-preservation.js
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/utils/files/text.ts (1)
73-75:⚠️ Potential issue | 🟡 MinorPartial reads still drop the trailing terminator when the source file has no final newline.
The gate here is "did the file end with a newline?", but for a slice that stops short of EOF every returned line had a terminator in the source, so the last line of the slice still needs
lineEndingappended. e.g. CRLF source"line1\r\nline2\r\nline3"(no trailing newline) read withlength=2returns"line1\r\nline2"and silently fusesline2withline3on a read→modify→write roundtrip. The newtest-crlf-preservation.jscases only cover full-file reads, so this isn't caught.A correct gate is "did the read reach EOF?" — e.g., thread a
reachedEOFflag out ofreadFileWithSmartPositioning(or compareresult.lengthagainsttotalLines - offset) and appendlineEndingwhenever!reachedEOF, falling back to the currentendsWithNewlinecheck only whenreachedEOFis true.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/files/text.ts` around lines 73 - 75, The current check in src/utils/files/text.ts that appends lineEnding only when endsWithNewline drops terminators for partial reads; change the gate so you append lineEnding whenever the read did NOT reach EOF (instead of relying on the original file's endsWithNewline). Thread a reachedEOF flag out of readFileWithSmartPositioning (or compute reachedEOF by comparing result.length to totalLines - offset) and update the condition around result.content/lineEnding to append when !reachedEOF, falling back to the existing endsWithNewline check only when reachedEOF is true; reference symbols: readFileWithSmartPositioning, reachedEOF/result.length, totalLines, offset, endsWithNewline, result.content, lineEnding.
🧹 Nitpick comments (1)
src/utils/files/text.ts (1)
59-65: Optional: collapse the three separate I/O passes into one.Each
read()now doesdetectFileLineEnding(open + chunked reads + close) +fileEndsWithNewline(stat + open + 1-byte read + close) on top ofreadFileWithSmartPositioning, which itself stats and (for files under 10MB) callsgetFileLineCountthat reads the entire file viafs.readFile. For the common <10MB path you're already loading the whole file into memory, so detecting line ending and trailing-newline from that buffer would eliminate two extra opens and the redundant final-byte read. For larger files, detecting the line ending during the existingreadlinepass (and surfacing areachedEOFflag, which would also resolve the partial-read concern above) avoids the standalone scan entirely.Not a correctness issue and fine to defer, but worth considering before this becomes a hot path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/files/text.ts` around lines 59 - 65, The current read flow makes three separate I/O passes: detectFileLineEnding + fileEndsWithNewline + readFileWithSmartPositioning; instead, modify readFileWithSmartPositioning to return the file buffer/metadata needed for line-ending and trailing-newline detection (for the <10MB fast-path reuse the loaded buffer to compute lineEnding and endsWithNewline and avoid calling detectFileLineEnding/fileEndsWithNewline), and for large files detect the line ending during the existing readline pass and expose a reachedEOF flag from readFileWithSmartPositioning so callers don’t perform a separate final-byte read; update callers to use the returned lineEnding/endsWithNewline/reachedEOF values and remove redundant opens/stats (functions referenced: read(), detectFileLineEnding, fileEndsWithNewline, readFileWithSmartPositioning, getFileLineCount).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/utils/files/text.ts`:
- Around line 164-190: The scan loop can drop the prevEndedWithCR signal when a
'\r' falls exactly at MAX_SCAN boundary; to fix, preserve that state after the
loop by checking prevEndedWithCR before returning the default LF and return '\r'
(or '\r\n' after optionally reading one more byte) instead of falling through to
'\n'. Update the logic around the while (position < MAX_SCAN) loop (variables:
prevEndedWithCR, position, MAX_SCAN, fd.read, CHUNK_SIZE) so that if
prevEndedWithCR is true when the loop exits you return the correct CR/CRLF
result rather than defaulting to '\n' (or allow one additional read iteration
when prevEndedWithCR is true).
---
Duplicate comments:
In `@src/utils/files/text.ts`:
- Around line 73-75: The current check in src/utils/files/text.ts that appends
lineEnding only when endsWithNewline drops terminators for partial reads; change
the gate so you append lineEnding whenever the read did NOT reach EOF (instead
of relying on the original file's endsWithNewline). Thread a reachedEOF flag out
of readFileWithSmartPositioning (or compute reachedEOF by comparing
result.length to totalLines - offset) and update the condition around
result.content/lineEnding to append when !reachedEOF, falling back to the
existing endsWithNewline check only when reachedEOF is true; reference symbols:
readFileWithSmartPositioning, reachedEOF/result.length, totalLines, offset,
endsWithNewline, result.content, lineEnding.
---
Nitpick comments:
In `@src/utils/files/text.ts`:
- Around line 59-65: The current read flow makes three separate I/O passes:
detectFileLineEnding + fileEndsWithNewline + readFileWithSmartPositioning;
instead, modify readFileWithSmartPositioning to return the file buffer/metadata
needed for line-ending and trailing-newline detection (for the <10MB fast-path
reuse the loaded buffer to compute lineEnding and endsWithNewline and avoid
calling detectFileLineEnding/fileEndsWithNewline), and for large files detect
the line ending during the existing readline pass and expose a reachedEOF flag
from readFileWithSmartPositioning so callers don’t perform a separate final-byte
read; update callers to use the returned lineEnding/endsWithNewline/reachedEOF
values and remove redundant opens/stats (functions referenced: read(),
detectFileLineEnding, fileEndsWithNewline, readFileWithSmartPositioning,
getFileLineCount).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5380edfd-a70f-4a44-8807-d04eed2afd0a
📒 Files selected for processing (2)
src/utils/files/text.tstest/test-crlf-preservation.js
🚧 Files skipped from review as they are similar to previous changes (1)
- test/test-crlf-preservation.js
Add reachedEOF flag to readFileWithSmartPositioning return metadata. When a partial read does not reach EOF, the last line read definitely had a newline after it (more content follows), so always restore it. Only use the endsWithNewline heuristic when the read actually reached the end of the file. Fixes edge case identified in CodeRabbit review on PR wonderwhy-er#438.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
test/test-crlf-preservation.js (1)
35-46: Setup can leak the temp directory on partial failure.If
fs.mkdirsucceeds but a later step insetup()(e.g., awriteFile, orconfigManager.getConfig()) throws,originalConfigis never returned, thefinallyblock inrunTestsskipsteardown(), andtest_crlf_preservation/is left on disk for the next run. Minor, but easy to harden by reordering: captureoriginalConfigfirst, then create the directory inside a try/catch that cleans up on failure — or always invoke teardown'sfs.rmregardless oforiginalConfig.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/test-crlf-preservation.js` around lines 35 - 46, The setup() helper can leak TEST_DIR if fs.mkdir succeeds but later operations (fs.writeFile or configManager.getConfig()) throw, because originalConfig isn't captured and teardown() may be skipped by runTests; fix by capturing originalConfig before creating files (call await configManager.getConfig() first), or ensure cleanup always runs on partial failure by wrapping the file-creation/config calls in a try/catch that on any exception removes TEST_DIR (fs.rm) and rethrows, and/or always call teardown() regardless of originalConfig; update setup(), and adjust runTests/teardown() usage accordingly to guarantee TEST_DIR is removed on failure.src/utils/files/text.ts (1)
59-83: Trailing-newline + EOF gating logic looks correct.The combination of
reachedEOF(default-true via!== false) withendsWithNewlinecorrectly handles all four cases: (a) full read of file with trailing newline → restore, (b) full read of file without trailing newline → don't add, (c) partial read mid-file → always restore, (d) empty file → no-op. Nice resolution of the partial-read edge case raised earlier.One optional micro-refactor:
detectFileLineEndingandfileEndsWithNewlineare independent and could run in parallel viaPromise.allto shave one extrafs.openround-trip from the hot path.♻️ Optional parallelization
- const lineEnding = await this.detectFileLineEnding(filePath); - const endsWithNewline = await this.fileEndsWithNewline(filePath); + const [lineEnding, endsWithNewline] = await Promise.all([ + this.detectFileLineEnding(filePath), + this.fileEndsWithNewline(filePath), + ]);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/files/text.ts` around lines 59 - 83, Run detectFileLineEnding(filePath) and fileEndsWithNewline(filePath) in parallel to avoid an extra fs open: replace the sequential awaits that assign lineEnding and endsWithNewline with a Promise.all call that destructures the two results, keeping the existing logic that uses lineEnding and endsWithNewline downstream (the surrounding function that calls detectFileLineEnding and fileEndsWithNewline should only change how those two promises are awaited).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/utils/files/text.ts`:
- Around line 59-83: Run detectFileLineEnding(filePath) and
fileEndsWithNewline(filePath) in parallel to avoid an extra fs open: replace the
sequential awaits that assign lineEnding and endsWithNewline with a Promise.all
call that destructures the two results, keeping the existing logic that uses
lineEnding and endsWithNewline downstream (the surrounding function that calls
detectFileLineEnding and fileEndsWithNewline should only change how those two
promises are awaited).
In `@test/test-crlf-preservation.js`:
- Around line 35-46: The setup() helper can leak TEST_DIR if fs.mkdir succeeds
but later operations (fs.writeFile or configManager.getConfig()) throw, because
originalConfig isn't captured and teardown() may be skipped by runTests; fix by
capturing originalConfig before creating files (call await
configManager.getConfig() first), or ensure cleanup always runs on partial
failure by wrapping the file-creation/config calls in a try/catch that on any
exception removes TEST_DIR (fs.rm) and rethrows, and/or always call teardown()
regardless of originalConfig; update setup(), and adjust runTests/teardown()
usage accordingly to guarantee TEST_DIR is removed on failure.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 87bbd607-6640-4e9d-b463-4c7592dd31ce
📒 Files selected for processing (3)
src/utils/files/base.tssrc/utils/files/text.tstest/test-crlf-preservation.js
✅ Files skipped from review due to trivial changes (1)
- src/utils/files/base.ts
TextFileHandler.read() uses readline which strips line endings, then joins with \n, silently converting CRLF to LF. When an AI client writes content back via write_file, CRLF files lose their original line endings. - Add detectFileLineEnding() to sample first 8KB and detect line style - Restore original line endings after readline processing - Leverage existing lineEndingHandler utilities Includes 5 new tests for CRLF preservation. Relates to wonderwhy-er#97
- detectFileLineEnding: loop 8KB chunks up to 64KB instead of fixed single-buffer read, handling \r at chunk boundaries correctly - read(): restore trailing newline stripped by readline by checking the file's final byte - test teardown: use setValue() to explicitly restore allowedDirectories instead of shallow-merging the full config
Add reachedEOF flag to readFileWithSmartPositioning return metadata. When a partial read does not reach EOF, the last line read definitely had a newline after it (more content follows), so always restore it. Only use the endsWithNewline heuristic when the read actually reached the end of the file. Fixes edge case identified in CodeRabbit review on PR wonderwhy-er#438.
dfd8ba6 to
ead484f
Compare
|
CodeAnt AI is running Incremental review Thanks for using CodeAnt! 🎉We're free for open-source projects. if you're enjoying it, help us grow by sharing. Share on X · |
|
CodeAnt AI Incremental review completed. |
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/utils/files/text.ts (1)
158-200:⚠️ Potential issue | 🟡 Minor
detectFileLineEndingstill dropsprevEndedWithCRat the MAX_SCAN boundary.When
\rlands as the very last byte of the 64KB cap (file offset 65535), the inner loop setsprevEndedWithCR = trueand breaks;position += bytesReadthen makesposition === MAX_SCAN, thewhilecondition fails, and execution falls through toreturn '\n'at Line 196 — even though a\rwas definitively observed. Still a narrow edge case, but trivially fixable by preserving the signal in the fall-through return.🛡️ Proposed fix
} // No newline found within cap — default to LF - return '\n'; + // If we saw a CR right at the boundary, treat as CR rather than LF. + return prevEndedWithCR ? '\r' : '\n'; } finally { await fd.close(); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/files/text.ts` around lines 158 - 200, The detectFileLineEnding function currently loses the prevEndedWithCR signal when the scan hits the MAX_SCAN cap and falls through to return '\n'; modify the fall-through logic so that if prevEndedWithCR is true at loop exit it returns '\r' (preserving the observed CR at the final byte) instead of defaulting to LF; update the end-of-function return near the MAX_SCAN fallback to check prevEndedWithCR before returning '\n' (references: detectFileLineEnding, prevEndedWithCR, MAX_SCAN, CHUNK_SIZE).
🧹 Nitpick comments (2)
test/test-crlf-preservation.js (1)
28-33: Optional: add CR-only and mixed-ending fixtures.
detectFileLineEndingdistinguishes\r,\n, and\r\n, and the PR objectives mention "mixed" file cases, but the suite only exercises CRLF and LF. A small fixture for CR-only (classic Mac) endings and one for mixed CRLF/LF would catch regressions indetectFileLineEnding's chunk-boundary logic and document the file's normalization-to-first-detected-ending behavior. Not blocking.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/test-crlf-preservation.js` around lines 28 - 33, Add two new test fixtures to exercise CR-only and mixed line endings so detectFileLineEnding's handling of '\r' and mixed chunk boundaries is covered: define a CR_ONLY_CONTENT (e.g., five lines each ending with '\r') and a MIXED_CONTENT (e.g., some lines with '\r\n' and some with '\n' interleaved, and versions with and without trailing newline similar to CRLF_CONTENT/CRLF_NO_TRAILING_CONTENT). Update test/test-crlf-preservation.js to include these constants and add corresponding assertions that the code path using detectFileLineEnding and any normalization-to-first-detected-ending behavior yields the expected detected ending and normalized output.src/utils/files/text.ts (1)
202-217: Consider foldingfileEndsWithNewlineanddetectFileLineEndinginto a single pass.
read()currently opens the file three times (detect line ending, detect trailing newline, thenreadFileWithSmartPositioning).fileEndsWithNewlinealready does its ownfs.stat+fs.openjust to read one byte. You could either:
- Combine the two detection helpers into one function that opens the fd once, reads the head chunk(s) for line-ending detection and a single byte at
size - 1for trailing-newline detection, then closes the fd; or- Run them in parallel via
Promise.all([detectFileLineEnding(...), fileEndsWithNewline(...)])for a quick win.Not a correctness issue — just unnecessary I/O on every read.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/files/text.ts` around lines 202 - 217, The current implementation performs redundant I/O by opening the file multiple times to detect line endings and trailing newline; combine detectFileLineEnding and fileEndsWithNewline into a single helper (called from read/readFileWithSmartPositioning) that opens the file once, uses fs.stat once, reads the head chunk(s) to determine the line ending and then reads a single byte at stats.size - 1 to determine trailing newline, returns both results (e.g., { lineEnding, endsWithNewline }), and ensure callers (read/readFileWithSmartPositioning) consume the combined result; alternatively, if you prefer minimal change, call Promise.all([detectFileLineEnding(path), fileEndsWithNewline(path)]) to run them in parallel so the file stats/reads don't block sequentially.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/utils/files/text.ts`:
- Around line 158-200: The detectFileLineEnding function currently loses the
prevEndedWithCR signal when the scan hits the MAX_SCAN cap and falls through to
return '\n'; modify the fall-through logic so that if prevEndedWithCR is true at
loop exit it returns '\r' (preserving the observed CR at the final byte) instead
of defaulting to LF; update the end-of-function return near the MAX_SCAN
fallback to check prevEndedWithCR before returning '\n' (references:
detectFileLineEnding, prevEndedWithCR, MAX_SCAN, CHUNK_SIZE).
---
Nitpick comments:
In `@src/utils/files/text.ts`:
- Around line 202-217: The current implementation performs redundant I/O by
opening the file multiple times to detect line endings and trailing newline;
combine detectFileLineEnding and fileEndsWithNewline into a single helper
(called from read/readFileWithSmartPositioning) that opens the file once, uses
fs.stat once, reads the head chunk(s) to determine the line ending and then
reads a single byte at stats.size - 1 to determine trailing newline, returns
both results (e.g., { lineEnding, endsWithNewline }), and ensure callers
(read/readFileWithSmartPositioning) consume the combined result; alternatively,
if you prefer minimal change, call Promise.all([detectFileLineEnding(path),
fileEndsWithNewline(path)]) to run them in parallel so the file stats/reads
don't block sequentially.
In `@test/test-crlf-preservation.js`:
- Around line 28-33: Add two new test fixtures to exercise CR-only and mixed
line endings so detectFileLineEnding's handling of '\r' and mixed chunk
boundaries is covered: define a CR_ONLY_CONTENT (e.g., five lines each ending
with '\r') and a MIXED_CONTENT (e.g., some lines with '\r\n' and some with '\n'
interleaved, and versions with and without trailing newline similar to
CRLF_CONTENT/CRLF_NO_TRAILING_CONTENT). Update test/test-crlf-preservation.js to
include these constants and add corresponding assertions that the code path
using detectFileLineEnding and any normalization-to-first-detected-ending
behavior yields the expected detected ending and normalized output.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1a62e2a0-c00a-44c6-a149-306806a877ce
📒 Files selected for processing (3)
src/utils/files/base.tssrc/utils/files/text.tstest/test-crlf-preservation.js
✅ Files skipped from review due to trivial changes (1)
- src/utils/files/base.ts
|
CodeAnt AI is running the review. Thanks for using CodeAnt! 🎉We're free for open-source projects. if you're enjoying it, help us grow by sharing. Share on X · |
Sequence DiagramThis PR updates the text file read path so that read_file returns content with the original line ending style and trailing newline behavior preserved, even for partial reads, enabling byte-stable read-edit-write cycles. sequenceDiagram
participant Client
participant FilesystemAPI
participant TextFileHandler
participant FileSystem
Client->>FilesystemAPI: read file request
FilesystemAPI->>TextFileHandler: read text with options
TextFileHandler->>FileSystem: Inspect file for line ending and trailing newline
TextFileHandler->>FileSystem: Read requested lines and track end of file
TextFileHandler->>TextFileHandler: Restore original line endings and trailing newline
TextFileHandler-->>FilesystemAPI: Return normalized text content
FilesystemAPI-->>Client: Return file content with preserved line endings
Generated by CodeAnt AI |
|
CodeAnt AI is running the review. Thanks for using CodeAnt! 🎉We're free for open-source projects. if you're enjoying it, help us grow by sharing. Share on X · |
Sequence DiagramThis PR updates the text file read path so that read_file returns content with the file's original line ending style and trailing newline behavior, using a new reachedEOF flag to distinguish full vs partial reads. sequenceDiagram
participant Client
participant FilesystemAPI
participant TextFileHandler
participant Disk
Client->>FilesystemAPI: read_file(path, offset, length)
FilesystemAPI->>TextFileHandler: read(path, options)
TextFileHandler->>Disk: Inspect file to detect line endings and trailing newline
TextFileHandler->>Disk: Stream lines and determine reachedEOF
TextFileHandler->>TextFileHandler: Normalize text to original line ending style
TextFileHandler->>TextFileHandler: Decide trailing newline based on reachedEOF and original file
TextFileHandler-->>FilesystemAPI: Return text and reachedEOF metadata
FilesystemAPI-->>Client: Deliver content with original line endings preserved
Generated by CodeAnt AI |
| const MAX_SCAN = 65536; // 64KB cap | ||
| const fd = await fs.open(filePath, 'r'); | ||
| try { | ||
| const buffer = Buffer.alloc(CHUNK_SIZE); | ||
| let position = 0; | ||
| let prevEndedWithCR = false; | ||
| while (position < MAX_SCAN) { | ||
| const { bytesRead } = await fd.read(buffer, 0, CHUNK_SIZE, position); | ||
| if (bytesRead === 0) { | ||
| return prevEndedWithCR ? '\r' : '\n'; | ||
| } | ||
| const chunk = buffer.toString('utf-8', 0, bytesRead); | ||
| // Handle \r at previous chunk boundary followed by \n here | ||
| if (prevEndedWithCR) { | ||
| return chunk[0] === '\n' ? '\r\n' : '\r'; | ||
| } | ||
| for (let i = 0; i < chunk.length; i++) { | ||
| if (chunk[i] === '\r') { | ||
| if (i + 1 < chunk.length) { | ||
| return chunk[i + 1] === '\n' ? '\r\n' : '\r'; | ||
| } | ||
| // \r at end of chunk — check next chunk for \n | ||
| prevEndedWithCR = true; | ||
| break; | ||
| } | ||
| if (chunk[i] === '\n') { | ||
| return '\n'; | ||
| } | ||
| } | ||
| position += bytesRead; | ||
| } | ||
| // No newline found within cap — default to LF | ||
| return '\n'; |
There was a problem hiding this comment.
Suggestion: The line-ending detector stops scanning after 64KB and defaults to LF if no newline is found in that window. For files whose first newline appears after that limit (for example, very long first lines), CRLF/CR files will be misdetected as LF, so returned content will still be normalized to LF and the original endings will be lost. Continue scanning until the first newline or EOF (or make the cap large/adaptive and fall back to a safer strategy) instead of forcing LF at 64KB. [logic error]
Severity Level: Major ⚠️
- ❌ readFile misidentifies CRLF style for long first lines.
- ⚠️ writeFile roundtrips collapse CRLF to LF for files.
- ⚠️ Git diffs show spurious line-ending changes after edits.Steps of Reproduction ✅
1. Create a CRLF text file whose first newline appears after 64KB (e.g., a single >64KB
line followed by `\r\n` and more lines) inside an allowed directory (see
allowedDirectories setup pattern in `test/test-crlf-preservation.js:35-44`).
2. Call the public `readFile()` API in `src/tools/filesystem.ts:85-93` with that file path
(this routes to `readFileFromDisk()` which calls `getFileHandler()` and then
`handler.read()` at `filesystem.ts:467-477`).
3. In `src/utils/files/factory.ts:76-103`, `getFileHandler()` returns `TextFileHandler`
for this text file, and `TextFileHandler.read()` at `src/utils/files/text.ts:54-83`
invokes `detectFileLineEnding(filePath)` at `text.ts:162-196` before reading.
4. Inside `detectFileLineEnding()`, the loop scans in 8KB chunks while `position <
MAX_SCAN` (`text.ts:170-194`); because the first `\r`/`\n` occurs after 64KB, the loop
exits without finding any newline and returns `'\n'` at `text.ts:195-196`, causing
`lineEnding` to be misdetected as LF so the normalization branch at `text.ts:67-70` is
skipped and the CRLF file is read and later rewritten (via `writeFile()` at
`src/tools/filesystem.ts:143-165`) with LF-only line endings.Fix in Cursor | Fix in VSCode Claude
(Use Cmd/Ctrl + Click for best experience)
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** src/utils/files/text.ts
**Line:** 164:196
**Comment:**
*Logic Error: The line-ending detector stops scanning after 64KB and defaults to LF if no newline is found in that window. For files whose first newline appears after that limit (for example, very long first lines), CRLF/CR files will be misdetected as LF, so returned content will still be normalized to LF and the original endings will be lost. Continue scanning until the first newline or EOF (or make the cap large/adaptive and fall back to a safer strategy) instead of forcing LF at 64KB.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix|
CodeAnt AI finished running the review. Thanks for using CodeAnt! 🎉We're free for open-source projects. if you're enjoying it, help us grow by sharing. Share on X · |
| if (lineEnding !== '\n' && typeof result.content === 'string') { | ||
| result.content = normalizeLineEndings(result.content, lineEnding); |
There was a problem hiding this comment.
Suggestion: This normalizes the entire returned payload to a single line-ending style, which rewrites files that contain mixed line endings. In a read→write roundtrip, mixed-ending files will be silently homogenized instead of preserved, causing unintended file diffs and content drift. Preserve per-line endings (or skip normalization when mixed endings are detected) instead of forcing one style. [logic error]
Severity Level: Major ⚠️
- ⚠️ read_file+write_file roundtrips rewrite mixed-ending text files.
- ⚠️ Version-controlled files gain noisy diffs from newline homogenization.
- ⚠️ Byte-stable guarantee in description fails for mixed endings.Steps of Reproduction ✅
1. Create a text file on disk with mixed line endings, e.g. content `"Line one\r\nLine
two\nLine three\r\n"` saved as `mixed.txt`.
2. Call the MCP `read_file` tool via the filesystem handler `handleReadFile` in
`src/handlers/filesystem-handlers.ts:108-88`, which delegates to `readFile` in
`src/tools/filesystem.ts:95-103`, then to `readFileFromDisk` in
`src/tools/filesystem.ts:392-404`.
3. `readFileFromDisk` obtains a `TextFileHandler` from the factory
(`src/utils/files/factory.ts:38-40`) and calls `TextFileHandler.read()`
(`src/utils/files/text.ts:54-83`). Inside `read()`, `detectFileLineEnding()`
(`src/utils/files/text.ts:162-197`) returns the style of the *first* newline encountered
(e.g. `'\r\n'` if the first line uses CRLF). Then the existing code at
`src/utils/files/text.ts:68-69` runs `normalizeLineEndings(result.content, lineEnding)`,
which, per `normalizeLineEndings` in `src/utils/lineEndingHandler.ts:30-41`, rewrites
*all* line endings in `result.content` to that one style (e.g. all `\n` become `\r\n`).
4. If an AI client then performs a read→modify→write roundtrip like test 5 in
`test/test-crlf-preservation.js:103-146` but using `mixed.txt` (read via `handleReadFile`,
strip the status header by `indexOf('Line one')`, modify content, then write back via
`writeFile` in `src/tools/filesystem.ts:153-160`), the rewritten content written to disk
will now have uniform line endings (e.g. all CRLF) instead of the original mixed CRLF/LF
combination, causing silent normalization and file diffs.Fix in Cursor | Fix in VSCode Claude
(Use Cmd/Ctrl + Click for best experience)
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** src/utils/files/text.ts
**Line:** 68:69
**Comment:**
*Logic Error: This normalizes the entire returned payload to a single line-ending style, which rewrites files that contain mixed line endings. In a read→write roundtrip, mixed-ending files will be silently homogenized instead of preserved, causing unintended file diffs and content drift. Preserve per-line endings (or skip normalization when mixed endings are detected) instead of forcing one style.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix|
CodeAnt AI finished running the review. Thanks for using CodeAnt! 🎉We're free for open-source projects. if you're enjoying it, help us grow by sharing. Share on X · |
User description
Problem
TextFileHandler.read()uses Node'sreadlineinterface, which strips line endings, then joins with\n. This silently converts CRLF content to LF in the returned string. When an AI client writes this content back viawrite_file, the original CRLF endings are lost.Note:
readFileInternal()(used byedit_block) reads viafs.readFiledirectly and is not affected.Fix
detectFileLineEnding()— samples the first 8KB to detect the file's line ending stylereadlineprocessing, restore original line endings using the existingnormalizeLineEndings()utilityTests
test/test-crlf-preservation.jscovering CRLF, LF, mixed, and empty file casestest-edit-block-line-endings.jstests pass (no regression)Relates to #97
Summary by CodeRabbit
Bug Fixes
Tests
Chores
CodeAnt-AI Description
Preserve original line endings and trailing newlines when reading text files
What Changed
Impact
✅ Fewer phantom diffs in Windows text files✅ Safer read-edit-write round trips✅ Correct line endings in partial file previews🔄 Retrigger CodeAnt AI Review
Details
💡 Usage Guide
Checking Your Pull Request
Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.
Talking to CodeAnt AI
Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:
This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.
Example
Preserve Org Learnings with CodeAnt
You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:
This helps CodeAnt AI learn and adapt to your team's coding style and standards.
Example
Retrigger review
Ask CodeAnt AI to review the PR again, by typing:
Check Your Repository Health
To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.