Make 'output may be large' note size-conditional + extend to rar2john / dmg2john / keepass2john (#4051) #5982

Open
ChrisJr404 wants to merge 1 commit into openwall:bleeding-jumbo from ChrisJr404:feat-2john-large-output-warnings

Closes (or substantially closes) #4051.

The `*2john` family of converters embeds the encrypted blob from the input archive directly into the JtR hash line written to stdout. For large inputs (encrypted disk images, password-protected RARs, KeePass DBs with big key files) the resulting hash line can run into hundreds of megabytes, which routinely leads new users to think the tool has malfunctioned.

@gartikis added an unconditional one-liner to zip2john in #5837 to address that. It helped, but:

  1. It fires after each output, so by the time it appears the user has already watched MBs scroll by (and sometimes Ctrl+C'd).
  2. It fires for every archive, including tiny ones — `zip2john tiny.zip` would always print "It is normal for some outputs to be very large" even when the output was 200 bytes.
  3. The other converters most likely to surprise users (rar2john, dmg2john, keepass2john) didn't get the same hint.

This PR replaces the unconditional zip2john message with a size-conditional version and extends it to the other three converters, sharing one tiny static-inline helper.

What changes

`src/2john_common.h` (new)

Single header, no link-time dep:

```c
LARGE_OUTPUT_THRESHOLD_BYTES        1 MiB, matching the suggestion in #4051
large_output_note(progname)         one-shot stderr note ('output may be very
                                    large; redirect into a file with...')
large_output_note_if_input_large(progname, path, threshold)
                                    stat(path); fire note iff size >= threshold;
                                    stat() failures silently ignored; the note
                                    is UX, not correctness
```

`large_output_note` uses a static `announced` flag so feeding many archives in one invocation only prints it once.
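
For readers who want the shape of the helper without opening the diff, here is a minimal sketch of what such a header could look like. It is not the committed code: the parameter types (`const char *`, `long long`) and the abbreviated message wording are assumptions; only the three names and the described behaviour come from this PR.

```c
#include <stdio.h>
#include <sys/stat.h>

/* 1 MiB, the threshold suggested in #4051. */
#define LARGE_OUTPUT_THRESHOLD_BYTES (1024LL * 1024LL)

/* Print the note to stderr at most once per process. */
static inline void large_output_note(const char *progname)
{
    static int announced; /* zero-initialised; set after the first note */

    if (announced)
        return;
    announced = 1;
    /* Wording abbreviated here; the real note is longer. */
    fprintf(stderr,
            "Note: %s output can be very large for large inputs. "
            "This is normal; redirect the output into a file with "
            "'%s > hashes.txt'.\n", progname, progname);
}

/* Print the note only when the input file is at least threshold bytes. */
static inline void large_output_note_if_input_large(const char *progname,
                                                    const char *path,
                                                    long long threshold)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return; /* best-effort UX hint: ignore stat() failures */
    if ((long long)st.st_size >= threshold)
        large_output_note(progname);
}
```

Because both functions are `static inline` with internal linkage, the `announced` flag is private to each converter's translation unit, which matches the one-note-per-invocation behaviour described above.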

`src/zip2john.c`

  • Includes `2john_common.h`.
  • Calls the helper at the top of each per-archive iteration in `zip2john()`.
  • Removes the previous `fprintf(stderr, "Note: It is normal for some outputs to be very large\n");` from `process_one()` (replaced by an inline comment pointing at the new helper) so users scanning many small archives no longer see the note 100x.

`src/rar2john.c` / `src/dmg2john.c` / `src/keepass2john.c`

Same pattern: include the header, call the helper before the per-file processing function.
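
The call-site pattern can be sketched roughly as follows. Everything here is illustrative rather than taken from the diff: `convert_all`, the `notes_printed` counter, and the bodies of the stand-in helper and `process_one` are hypothetical, and the `long long` threshold type is an assumption. The point is only the ordering (note first, hash line second) and the one-shot behaviour across several inputs.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical counter so the one-shot behaviour is observable. */
static int notes_printed;

/* Stand-in for the shared helper; the real body lives in 2john_common.h. */
static void large_output_note_if_input_large(const char *progname,
                                             const char *path,
                                             long long threshold)
{
    static int announced;
    struct stat st;

    if (announced || stat(path, &st) != 0 ||
        (long long)st.st_size < threshold)
        return;
    announced = 1;
    notes_printed++;
    fprintf(stderr, "Note: %s output can be very large for large inputs.\n",
            progname);
}

/* Stand-in for a converter's per-file routine; prints a placeholder line. */
static void process_one(const char *path)
{
    printf("%s:$example$...\n", path);
}

/* Hypothetical driver: the note is emitted before any hash line is written. */
static void convert_all(const char *progname, int count, char **paths,
                        long long threshold)
{
    int i;

    for (i = 0; i < count; i++) {
        large_output_note_if_input_large(progname, paths[i], threshold);
        process_one(paths[i]);
    }
}
```

Calling the helper before `process_one()` is what moves the hint ahead of the output, which addresses point 1 in the list of problems with the previous note.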

Live verification

Built with `./configure && make` from `src/`. Only warning is a pre-existing `-Wcpp` one about libbz2 unrelated to this change.

```
$ ./run/zip2john tiny.zip # 207 B input
ver 1.0 efh 5455 efh 7875 tiny.zip/small.txt PKZIP Encr: 2b chk, ...
tiny.zip/small.txt:$pkzip$122017b467677b60430178ab05da8b9...
# NO note — was previously printed unconditionally

$ ./run/zip2john big.zip 2>&1 1>/dev/null # ~2 MiB input
Note: zip2john output can be very large for large inputs (often 2x the
input size or more, since the encrypted blob is hex-encoded into the
hash line). This is normal — redirect the output into a file with
'zip2john > hashes.txt'.

$ ./run/keepass2john big.kdbx 2>&1 | head -5 # ~2 MiB input
Note: keepass2john output can be very large for large inputs (often 2x the
input size or more, since the encrypted blob is hex-encoded into the
hash line). This is normal — redirect the output into a file with
'keepass2john > hashes.txt'.
! big.kdbx : Unknown format: File signature invalid
```
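
The "2x the input size" figure in the note follows directly from hex encoding: each input byte becomes two ASCII characters. A small illustration, where `hex_encode` is a hypothetical name and not a function from the JtR tree:

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hex-encode len bytes into out; out must hold 2 * len + 1 characters. */
static void hex_encode(const unsigned char *in, size_t len, char *out)
{
    static const char digits[] = "0123456789abcdef";
    size_t i;

    for (i = 0; i < len; i++) {
        out[2 * i]     = digits[in[i] >> 4];   /* high nibble */
        out[2 * i + 1] = digits[in[i] & 0x0f]; /* low nibble */
    }
    out[2 * len] = '\0';
}
```

Four blob bytes therefore occupy eight characters in the hash line, so a 2 MiB encrypted blob contributes about 4 MiB of text before any per-format framing is added.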

Notes for review

  • I picked the four converters that are most likely to embed multi-megabyte encrypted blobs (zip2john, rar2john, dmg2john, keepass2john). If you'd like I can extend the same pattern to others (gpg2john, hccap2john, putty2john, wpapcap2john, etc.) in a follow-up; happy to do whichever subset feels right.
  • Threshold defaults to 1 MiB but is a parameter on `large_output_note_if_input_large(...)`, so per-tool tuning is possible without touching the helper.
  • No DCO sign-off on the commit yet — happy to add it if needed.

…ters (openwall#4051)

The *2john family of converters embed the encrypted blob from the input
archive directly into the JtR hash line written to stdout. For large
archives (encrypted disk images, password-protected RARs, KeePass DBs
with sizeable key files) the resulting hash line can run into hundreds
of megabytes or more, which routinely surprises new users into thinking
the tool has malfunctioned. Issue openwall#4051 asks for a stderr explanation
when this happens.

zip2john already had an unconditional one-line note printed *after*
each output; @gartikis added it in PR openwall#5837. That helped, but it was
late in the timeline (the user has already watched MBs scroll past) and
fired even for tiny archives. This patch makes the same UX hint
size-conditional and extends it to the other three converters most
likely to surprise users.

src/2john_common.h (new)
  Shared static-inline helpers:

    LARGE_OUTPUT_THRESHOLD_BYTES        1 MiB, per the suggestion in openwall#4051.
    large_output_note(progname)         one-shot stderr note explaining
                                        large outputs are normal and
                                        suggesting redirection to a file.
    large_output_note_if_input_large(progname, path, threshold)
                                        stat()s path and only fires the
                                        note when the input is at least
                                        threshold bytes; stat() failures
                                        are silently ignored — the note
                                        is best-effort UX, not a
                                        correctness check.

  static inline so each *2john pulls in just the bytes it uses; no new
  link-time dependency.

src/zip2john.c
  Calls the helper at the top of each per-archive iteration in
  zip2john(). The previous unconditional 'It is normal for some outputs
  to be very large' line at the bottom of process_one() is removed
  (replaced by an inline comment pointing at the new helper) so users
  scanning many small archives no longer see the note 100x.

src/rar2john.c
src/dmg2john.c
src/keepass2john.c
  Same treatment: include 2john_common.h, call the helper before the
  per-file processing function in main()/rar2john().

Test
  Built with the standard 'configure && make' flow (clean build, only a
  pre-existing -Wcpp warning about libbz2 unrelated to this change).
  Live-verified end-to-end:

    tiny.zip     (207 B)   no warning   <- previous behaviour was a
                                           spurious warning per archive
    big.zip      (~2 MiB)  warning      <- as expected
    fake.kdbx    (~2 MiB)  warning via keepass2john

  rar2john / dmg2john compile and link clean; behaviour mirrors the same
  threshold via the shared helper.