stream-json is a micro-library of Node.js stream components for creating custom JSON processing pipelines with a minimal memory footprint. It can parse JSON files far exceeding available memory. It has one runtime dependency — stream-chain for pipeline composition.
package.json # Package config; "tape6" section configures test discovery
src/ # Source code
├── index.js # Main entry point: creates Parser + emit()
├── index.d.ts # TypeScript declarations for the main module
├── parser.js # Streaming SAX-like JSON parser (token stream)
├── parser.d.ts # TypeScript declarations for parser
├── assembler.js # Token stream → JavaScript objects (EventEmitter)
├── assembler.d.ts # TypeScript declarations for assembler
├── disassembler.js # JavaScript objects → token stream (generator)
├── disassembler.d.ts # TypeScript declarations for disassembler
├── stringer.js # Token stream → JSON text (flushable function + asStream)
├── stringer.d.ts # TypeScript declarations for stringer
├── emitter.js # Token stream → events (factory → Writable)
├── emitter.d.ts # TypeScript declarations for emitter
├── filters/ # Token stream editors
│ ├── filter-base.js # Base for all filters (filterBase + makeStackDiffer)
│ ├── filter-base.d.ts # TypeScript declarations for filter-base
│ ├── pick.js # Pick subobjects by path (default filterBase)
│ ├── pick.d.ts # TypeScript declarations for pick
│ ├── replace.js # Replace subobjects with a value
│ ├── replace.d.ts # TypeScript declarations for replace
│ ├── ignore.js # Remove subobjects (Replace variant, replacement=none)
│ ├── ignore.d.ts # TypeScript declarations for ignore
│ ├── filter.js # Filter tokens preserving surrounding structure
│ └── filter.d.ts # TypeScript declarations for filter
├── streamers/ # Token stream → object stream
│ ├── stream-base.js # Base for all streamers (uses Assembler internally)
│ ├── stream-base.d.ts # TypeScript declarations for stream-base
│ ├── stream-values.js # Stream successive JSON values (level 0)
│ ├── stream-values.d.ts
│ ├── stream-array.js # Stream array elements (level 1)
│ ├── stream-array.d.ts
│ ├── stream-object.js # Stream object properties (level 1)
│ └── stream-object.d.ts
├── utils/ # Utilities
│ ├── emit.js # Attach token events to a stream
│ ├── emit.d.ts # TypeScript declarations for emit
│ ├── with-parser.js # Create parser + component pipelines via gen()
│ ├── with-parser.d.ts # TypeScript declarations for with-parser
│ ├── batch.js # Batch items into arrays (wraps stream-chain batch)
│ ├── batch.d.ts # TypeScript declarations for batch
│ ├── verifier.js # Validate JSON text (gen pipeline + asStream)
│ ├── verifier.d.ts # TypeScript declarations for verifier
│ ├── utf8-stream.js # Fix multi-byte UTF-8 splits (deprecated, use fixUtf8Stream)
│ ├── utf8-stream.d.ts # TypeScript declarations for utf8-stream
│ ├── flex-assembler.js # Assembler with custom containers (Map, Set, etc.)
│ └── flex-assembler.d.ts # TypeScript declarations for flex-assembler
├── jsonl/ # JSONL (line-separated JSON) support
│ ├── parser.js # JSONL parser → {key, value} objects
│ ├── parser.d.ts # TypeScript declarations for jsonl parser
│ ├── stringer.js # Objects → JSONL text
│ └── stringer.d.ts # TypeScript declarations for jsonl stringer
└── jsonc/ # JSONC (JSON with Comments) support
  ├── parser.js # JSONC parser → token stream (fork of parser.js)
  ├── parser.d.ts # TypeScript declarations for jsonc parser
  ├── stringer.js # JSONC token stream → text (fork of stringer.js)
  ├── stringer.d.ts # TypeScript declarations for jsonc stringer
  ├── verifier.js # JSONC validator with error locations (fork of verifier.js)
  └── verifier.d.ts # TypeScript declarations for jsonc verifier
tests/ # Test files (test-*.mjs, using tape-six)
bench/ # Micro-benchmarks (nano-benchmark)
wiki/ # GitHub wiki documentation (git submodule)
.github/ # CI workflows, Dependabot config
The parser produces a stream of {name, value} tokens — a SAX-inspired protocol:
| Token name | Value | Meaning |
|---|---|---|
| startObject | - | { encountered |
| endObject | - | } encountered |
| startArray | - | [ encountered |
| endArray | - | ] encountered |
| startKey | - | Start of object key string |
| endKey | - | End of object key string |
| keyValue | string | Packed key value |
| startString | - | Start of string value |
| endString | - | End of string value |
| stringChunk | string | Piece of a string |
| stringValue | string | Packed string value |
| startNumber | - | Start of number |
| endNumber | - | End of number |
| numberChunk | string | Piece of a number |
| numberValue | string | Packed number (as string) |
| nullValue | null | null literal |
| trueValue | true | true literal |
| falseValue | false | false literal |
All downstream components (filters, streamers, stringer, emitter) consume and/or produce tokens in this format. This is the universal interchange protocol of the library.
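As an illustration of the protocol, the sketch below lists hand-written tokens for a small document and walks them to track nesting depth. It assumes only packed tokens are present (as if the streamed `start*`/`*Chunk`/`end*` variants were switched off), and `maxDepth` is a hypothetical helper, not part of the library.

```javascript
// Hand-written packed tokens for the JSON text {"a":[1,"x"],"b":null}.
// Assumption: streamed key/string/number tokens are omitted for brevity.
const tokens = [
  {name: 'startObject'},
  {name: 'keyValue', value: 'a'},
  {name: 'startArray'},
  {name: 'numberValue', value: '1'}, // numbers travel as strings
  {name: 'stringValue', value: 'x'},
  {name: 'endArray'},
  {name: 'keyValue', value: 'b'},
  {name: 'nullValue', value: null},
  {name: 'endObject'}
];

// Hypothetical helper: derive the maximum nesting depth from structural tokens.
function maxDepth(tokens) {
  let depth = 0;
  let max = 0;
  for (const {name} of tokens) {
    if (name === 'startObject' || name === 'startArray') {
      max = Math.max(max, ++depth);
    } else if (name === 'endObject' || name === 'endArray') {
      --depth;
    }
  }
  if (depth !== 0) throw new Error('unbalanced token stream');
  return max;
}

console.log(maxDepth(tokens)); // 2
```

Because every component speaks this format, a consumer like this can sit anywhere downstream of a parser, filter, or disassembler.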
- `parser(options)` returns a `gen(fixUtf8Stream(), jsonParser(options))` pipeline, a function for use in `chain()`.
- `parser.asStream(options)` wraps that pipeline as a Duplex stream via `asStream()`.
- The inner `jsonParser` is a `flushable()` function that maintains a state machine. It buffers incoming text and produces `{name, value}` tokens as a `many()` array.
- Parser options control packing and streaming of keys, strings, and numbers:
  - `packKeys`/`packStrings`/`packNumbers` (default: true): emit `keyValue`/`stringValue`/`numberValue` tokens with the complete value.
  - `streamKeys`/`streamStrings`/`streamNumbers` (default: true): emit `start*`/`*Chunk`/`end*` tokens for incremental processing.
  - `packValues`/`streamValues`: shortcut to set all three at once.
  - `jsonStreaming`: support multiple top-level values (JSON Streaming protocol).
Assembler is an EventEmitter (not a stream) that interprets the token stream and reconstructs JavaScript objects:
- `Assembler.connectTo(stream)`: listens on `'data'` events, emits `'done'` when a top-level value is assembled.
- `asm.tapChain`: a function for use in `chain()` that returns assembled values or `none`.
- Tracks `depth`, `path`, `current`, `key`, `stack`.
- Supports `reviver` option (like `JSON.parse` reviver) and `numberAsString`.
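The idea of rebuilding values from tokens with a stack can be sketched in a few lines. This is a simplified stand-in, not the library's Assembler: it handles only packed tokens, has no `reviver` or `numberAsString` support, and the name `assemble` is hypothetical.

```javascript
// Packed tokens for {"a":[1,"x"],"b":null} (hand-written for this sketch).
const tokens = [
  {name: 'startObject'},
  {name: 'keyValue', value: 'a'},
  {name: 'startArray'},
  {name: 'numberValue', value: '1'},
  {name: 'stringValue', value: 'x'},
  {name: 'endArray'},
  {name: 'keyValue', value: 'b'},
  {name: 'nullValue', value: null},
  {name: 'endObject'}
];

// Hypothetical helper: fold a complete token sequence into one JavaScript value.
function assemble(tokens) {
  const stack = []; // saved [container, key] pairs of the enclosing levels
  let current = null; // container being filled, or null at the top level
  let key = null; // pending object key
  let result;
  const save = value => {
    if (current === null) result = value; // a finished top-level value
    else if (Array.isArray(current)) current.push(value);
    else {
      current[key] = value;
      key = null;
    }
  };
  for (const {name, value} of tokens) {
    switch (name) {
      case 'startObject':
      case 'startArray':
        stack.push([current, key]);
        current = name === 'startObject' ? {} : [];
        key = null;
        break;
      case 'keyValue':
        key = value;
        break;
      case 'numberValue':
        save(parseFloat(value)); // packed numbers arrive as strings
        break;
      case 'stringValue':
      case 'nullValue':
      case 'trueValue':
      case 'falseValue':
        save(value);
        break;
      case 'endObject':
      case 'endArray': {
        const done = current;
        [current, key] = stack.pop();
        save(done);
        break;
      }
      // streamed tokens (start*/chunk/end* for keys, strings, numbers) are ignored
    }
  }
  return result;
}

console.log(JSON.stringify(assemble(tokens)));
```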
Disassembler is the inverse of Assembler: it takes JavaScript objects and produces a token stream via a generator function. It supports replacer, packKeys/packStrings/packNumbers, and streamKeys/streamStrings/streamNumbers.
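A generator is a natural fit for this direction, as the following sketch suggests. It is an illustration only: it emits packed tokens exclusively, ignores `replacer`, assumes finite numbers and JSON-compatible input, and `disassemble` is a hypothetical name rather than the library's export.

```javascript
// Hypothetical sketch: walk a JavaScript value and yield packed tokens only.
function* disassemble(value) {
  if (value === null) {
    yield {name: 'nullValue', value: null};
  } else if (typeof value === 'boolean') {
    yield {name: value ? 'trueValue' : 'falseValue', value};
  } else if (typeof value === 'number') {
    yield {name: 'numberValue', value: String(value)}; // numbers travel as strings
  } else if (typeof value === 'string') {
    yield {name: 'stringValue', value};
  } else if (Array.isArray(value)) {
    yield {name: 'startArray'};
    for (const item of value) yield* disassemble(item);
    yield {name: 'endArray'};
  } else if (typeof value === 'object') {
    yield {name: 'startObject'};
    for (const [key, item] of Object.entries(value)) {
      yield {name: 'keyValue', value: key};
      yield* disassemble(item);
    }
    yield {name: 'endObject'};
  }
  // Other inputs (undefined, functions, symbols) yield nothing in this sketch.
}

const names = [...disassemble({a: [1, 'x'], b: null})].map(token => token.name);
console.log(names.join(' '));
```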
Stringer is a flushable function that converts a token stream back into JSON text. It handles comma insertion, depth tracking, and string escaping. It supports useValues/useKeyValues/useStringValues/useNumberValues to choose between packed and streamed tokens. The makeArray option wraps the output in []. Use stringer() in chain() or stringer.asStream() for .pipe().
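The comma-insertion logic is the subtle part of turning tokens back into text. A minimal sketch, assuming only packed tokens and a complete, well-formed sequence; `toJsonText` is a hypothetical helper, and string escaping is delegated to `JSON.stringify`:

```javascript
// Packed tokens for {"a":[1,"x"],"b":null} (hand-written for this sketch).
const tokens = [
  {name: 'startObject'},
  {name: 'keyValue', value: 'a'},
  {name: 'startArray'},
  {name: 'numberValue', value: '1'},
  {name: 'stringValue', value: 'x'},
  {name: 'endArray'},
  {name: 'keyValue', value: 'b'},
  {name: 'nullValue', value: null},
  {name: 'endObject'}
];

const literals = {nullValue: 'null', trueValue: 'true', falseValue: 'false'};

// Hypothetical helper: serialize packed tokens to JSON text.
function toJsonText(tokens) {
  let out = '';
  let needComma = false; // true once a sibling has been written at this level
  for (const {name, value} of tokens) {
    if (name === 'endObject' || name === 'endArray') {
      out += name === 'endObject' ? '}' : ']';
      needComma = true; // the closed container counts as a written sibling
      continue;
    }
    if (needComma) out += ',';
    needComma = true;
    switch (name) {
      case 'startObject':
        out += '{';
        needComma = false; // first member needs no comma
        break;
      case 'startArray':
        out += '[';
        needComma = false;
        break;
      case 'keyValue':
        out += JSON.stringify(value) + ':';
        needComma = false; // the value follows its key directly
        break;
      case 'stringValue':
        out += JSON.stringify(value);
        break;
      case 'numberValue':
        out += value; // already a string of digits
        break;
      default:
        out += literals[name]; // nullValue, trueValue, falseValue
    }
  }
  return out;
}

console.log(toJsonText(tokens)); // {"a":[1,"x"],"b":null}
```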
Emitter is a factory function returning a Writable stream that re-emits each token as a named event: e.on('startObject', ...), etc. It is an exception to the usual pattern: since it is a stream endpoint that emits events, it returns a Writable directly rather than a plain function.
All filters are built on filterBase (src/filters/filter-base.js):
- `filterBase({specialAction, defaultAction, nonCheckableAction, transition})` returns a factory that accepts `options` and returns a `flushable()` function.
- It maintains a path stack tracking the current JSON position.
- `filter` option: a string, RegExp, or function `(stack, chunk) → boolean` that determines whether to accept or reject each subobject.
- `makeStackDiffer` generates structural tokens (start/end object/array, key tokens) to reconstruct the surrounding JSON envelope when filtering.
| Filter | specialAction | defaultAction | Effect |
|---|---|---|---|
| pick | accept | ignore | Passes only matching subobjects |
| replace | reject | accept-token | Replaces matching subobjects |
| ignore | reject | accept-token | Removes matching subobjects |
| filter | accept/accept-token | ignore | Keeps matching, preserves structure |
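To make the path stack concrete, here is a rough sketch of pick-like behavior over packed tokens. It is a simplification under stated assumptions: it matches only an exact dot-separated path, while the library's `filter` option also accepts RegExp and function forms, and `pickSketch` is a hypothetical name that does not reproduce `filterBase`.

```javascript
// Packed tokens for {"a":[1,"x"],"b":null} (hand-written for this sketch).
const doc = [
  {name: 'startObject'},
  {name: 'keyValue', value: 'a'},
  {name: 'startArray'},
  {name: 'numberValue', value: '1'},
  {name: 'stringValue', value: 'x'},
  {name: 'endArray'},
  {name: 'keyValue', value: 'b'},
  {name: 'nullValue', value: null},
  {name: 'endObject'}
];

const isStart = name => name === 'startObject' || name === 'startArray';
const isEnd = name => name === 'endObject' || name === 'endArray';

// Hypothetical helper: keep only the tokens of the subobject at `path`.
function pickSketch(tokens, path) {
  const stack = []; // current position: string keys and numeric indices
  const out = [];
  let depth = 0; // > 0 while inside a picked container
  for (const token of tokens) {
    const {name, value} = token;
    if (depth > 0) {
      out.push(token); // pass the picked subobject through untouched
      if (isStart(name)) ++depth;
      else if (isEnd(name)) --depth;
      continue;
    }
    if (name === 'keyValue') {
      stack[stack.length - 1] = value; // remember the current object key
      continue;
    }
    if (isEnd(name)) {
      stack.pop();
      continue;
    }
    // Any other token starts a value; inside an array, advance the index.
    if (typeof stack[stack.length - 1] === 'number') ++stack[stack.length - 1];
    if (stack.join('.') === path) {
      out.push(token);
      if (isStart(name)) depth = 1;
    } else if (isStart(name)) {
      stack.push(name === 'startObject' ? null : -1); // placeholder key or index
    }
  }
  return out;
}

console.log(pickSketch(doc, 'a').map(token => token.name).join(' '));
```

Calling `pickSketch(doc, 'a.1')` selects the single `stringValue` token for `"x"`, since array indices take part in the path just like keys.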
All streamers are built on streamBase (src/streamers/stream-base.js):
- `streamBase({push, first, level})` returns a factory that accepts `options` and returns a function for use in `chain()`.
- Uses `Assembler` internally to reconstruct objects.
- `level` controls when to emit: level 0 for `streamValues`, level 1 for `streamArray`/`streamObject`.
- `objectFilter` option enables early rejection: if `objectFilter(asm)` returns `false`, the object is abandoned without completing assembly.
- `first` callback validates the opening token (e.g., `streamArray` requires `startArray`).
| Streamer | Level | Output | Expects |
|---|---|---|---|
| streamValues | 0 | {key: index, value: ...} | Any JSON values in sequence |
| streamArray | 1 | {key: index, value: ...} | Single top-level array |
| streamObject | 1 | {key: string, value: ...} | Single top-level object |
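The level-1 behavior of an array streamer can be approximated as follows. This sketch deliberately swaps in a pull-based recursive reader instead of the push-based Assembler the library uses, handles only packed tokens from a well-formed stream, and omits `objectFilter`; `streamArraySketch` and `readValue` are hypothetical names.

```javascript
// Hypothetical helper: build one value, pulling further tokens as needed.
function readValue(it, first) {
  switch (first.name) {
    case 'startArray': {
      const array = [];
      for (let t = it.next().value; t.name !== 'endArray'; t = it.next().value) {
        array.push(readValue(it, t));
      }
      return array;
    }
    case 'startObject': {
      const object = {};
      for (let t = it.next().value; t.name !== 'endObject'; t = it.next().value) {
        object[t.value] = readValue(it, it.next().value); // t is a keyValue token
      }
      return object;
    }
    case 'numberValue':
      return parseFloat(first.value);
    default:
      return first.value; // stringValue, nullValue, trueValue, falseValue
  }
}

// Hypothetical helper: emit {key, value} for each element of a top-level array.
function* streamArraySketch(tokens) {
  const it = tokens[Symbol.iterator]();
  const first = it.next().value;
  // Plays the role of the `first` callback: validate the opening token.
  if (!first || first.name !== 'startArray') {
    throw new Error('Top-level value should be an array.');
  }
  let index = 0;
  for (let t = it.next().value; t && t.name !== 'endArray'; t = it.next().value) {
    yield {key: index++, value: readValue(it, t)};
  }
}

// Packed tokens for [1,{"a":true}] (hand-written for this sketch).
const items = [...streamArraySketch([
  {name: 'startArray'},
  {name: 'numberValue', value: '1'},
  {name: 'startObject'},
  {name: 'keyValue', value: 'a'},
  {name: 'trueValue', value: true},
  {name: 'endObject'},
  {name: 'endArray'}
])];
console.log(JSON.stringify(items));
```

Only one element is held in memory at a time, which is the property that lets this style of streamer cope with arrays larger than available memory.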
- `emit(stream)`: attaches a `'data'` listener that re-emits each token as a named event on the stream.
- `withParser(fn, options)`: creates `gen(parser(options), fn(options))`. Most components export `.withParser()` and `.withParserAsStream()` static methods.
- `batch`: groups items into fixed-size arrays (default 1000). Wraps `stream-chain/utils/batch`. Use `batch()` in `chain()` or `batch.asStream()` for `.pipe()`.
- `verifier`: validates JSON text and reports the exact error position (offset, line, pos). Composed as `gen(fixUtf8Stream(), validate)`. Use `verifier()` in `chain()` or `verifier.asStream()` for `.pipe()`.
- `Utf8Stream`: deprecated. Use `fixUtf8Stream` from `stream-chain` instead. Kept for backward compatibility.
- `jsonl/parser.js`: parses JSONL (one JSON value per line) producing `{key, value}` objects. Composed as `gen(fixUtf8Stream(), lines(), parseLine)`. Supports `reviver` and `errorIndicator` for error handling.
- `jsonl/stringer.js`: serializes objects to JSONL format. Delegates to `stream-chain/jsonl/stringerStream`. Configurable `separator`, `replacer`, `space`.
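The per-line parsing step can be pictured with a small, non-streaming sketch. It assumes the whole text is already in memory, skips blank lines (an assumption, not confirmed library behavior), and lets `JSON.parse` errors propagate rather than modeling `errorIndicator`; `parseJsonlSketch` is a hypothetical name.

```javascript
// Hypothetical helper: turn JSONL text into {key, value} items, one per line.
function* parseJsonlSketch(text, reviver) {
  let key = 0;
  for (const line of text.split('\n')) {
    if (!line.trim()) continue; // assumption: blank lines are skipped
    yield {key: key++, value: JSON.parse(line, reviver)};
  }
}

const records = [...parseJsonlSketch('{"a":1}\n[2,3]\n"text"\n')];
console.log(JSON.stringify(records));
```

Because each line is a complete document, native `JSON.parse` can do the work, which is why a dedicated JSONL path avoids the token-level machinery.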
- `jsonc/parser.js`: fork of `parser.js` with support for `//` and `/* */` comments, trailing commas, and optional `whitespace`/`comment` tokens. Options: `streamWhitespace` (default: true), `streamComments` (default: true). All standard parser options are supported.
- `jsonc/stringer.js`: fork of `stringer.js` that passes `whitespace` and `comment` tokens through verbatim. All standard stringer options are supported.
- `jsonc/verifier.js`: fork of `utils/verifier.js` that accepts comments and trailing commas. Reports error offset, line, and position for invalid JSONC.
- Downstream compatibility: all existing filters, streamers, and utilities ignore unknown token types, so they work with JSONC parser output unmodified.
src/index.js ── src/parser.js, src/utils/emit.js
│
src/parser.js ── stream-chain (gen, flushable, many, none, asStream, fixUtf8Stream)
src/assembler.js ── stream-chain (none)
src/disassembler.js ── stream-chain (asStream)
src/stringer.js ── stream-chain (flushable, none, asStream)
src/emitter.js ── node:stream (Writable)
src/filters/filter-base.js ── stream-chain (many, isMany, getManyValues, combineManyMut, none, flushable)
src/filters/pick.js ── filter-base.js, with-parser.js
src/filters/replace.js ── stream-chain (none, isMany, getManyValues, combineManyMut, many), filter-base.js, with-parser.js
src/filters/ignore.js ── stream-chain (none), filter-base.js, with-parser.js
src/filters/filter.js ── filter-base.js, with-parser.js
src/streamers/stream-base.js ── stream-chain (none), assembler.js
src/streamers/stream-values.js ── stream-chain (none), stream-base.js, with-parser.js
src/streamers/stream-array.js ── stream-chain (none), stream-base.js, with-parser.js
src/streamers/stream-object.js ── stream-chain (none), stream-base.js, with-parser.js
src/utils/emit.js ── (standalone, no imports)
src/utils/with-parser.js ── stream-chain (asStream, gen), parser.js
src/utils/batch.js ── stream-chain (asStream), stream-chain/utils/batch
src/utils/verifier.js ── stream-chain (gen, flushable, none, asStream, fixUtf8Stream)
src/utils/utf8-stream.js ── node:process, node:stream (Transform), node:string_decoder (deprecated)
src/utils/flex-assembler.js ── stream-chain (none)
src/jsonl/parser.js ── stream-chain (gen, none, asStream, fixUtf8Stream, lines)
src/jsonl/stringer.js ── stream-chain/jsonl/stringerStream
src/jsonc/parser.js ── stream-chain (gen, flushable, many, none, asStream, fixUtf8Stream)
src/jsonc/stringer.js ── stream-chain (flushable, none, asStream)
src/jsonc/verifier.js ── stream-chain (gen, flushable, none, asStream, fixUtf8Stream)
// Main API
const make = require('stream-json'); // parser + emit
const {parser} = require('stream-json'); // parser factory
// Core components
const Assembler = require('stream-json/assembler.js');
const {disassembler} = require('stream-json/disassembler.js');
const stringer = require('stream-json/stringer.js');
const emitter = require('stream-json/emitter.js');
// Filters
const {pick} = require('stream-json/filters/pick.js');
const {replace} = require('stream-json/filters/replace.js');
const {ignore} = require('stream-json/filters/ignore.js');
const {filter} = require('stream-json/filters/filter.js');
// Streamers
const {streamValues} = require('stream-json/streamers/stream-values.js');
const {streamArray} = require('stream-json/streamers/stream-array.js');
const {streamObject} = require('stream-json/streamers/stream-object.js');
// Utilities
const emit = require('stream-json/utils/emit.js');
const withParser = require('stream-json/utils/with-parser.js');
const batch = require('stream-json/utils/batch.js');
const verifier = require('stream-json/utils/verifier.js');
const Utf8Stream = require('stream-json/utils/utf8-stream.js'); // deprecated
const FlexAssembler = require('stream-json/utils/flex-assembler.js');
// JSONL
const jsonlParser = require('stream-json/jsonl/parser.js');
const jsonlStringer = require('stream-json/jsonl/stringer.js');
// JSONC
const jsoncParser = require('stream-json/jsonc/parser.js');
const jsoncStringer = require('stream-json/jsonc/stringer.js');
const jsoncVerifier = require('stream-json/jsonc/verifier.js');

- Framework: tape-six (`tape6`)
- Run all: `npm test` (parallel workers via `tape6 --flags FO`)
- Run single file: `node tests/test-<name>.mjs`
- Run with Bun: `npm run test:bun`
- Run sequential: `npm run test:proc`
- TypeScript check: `npm run ts-check`
- Lint: `npm run lint` (Prettier check)
- Lint fix: `npm run lint:fix` (Prettier write)
Benchmarks use nano-benchmark. Run a benchmark by specifying its file:
npm run bench -- bench/<name>.mjs

| File | What it measures |
|---|---|
| bench/parser-jsonc.mjs | Parser vs JSONC Parser on the same ~100 KB JSON array. Measures overhead of comment/trailing-comma support on plain JSON. |
| bench/parser-jsonl.mjs | parser({jsonStreaming: true}) + streamValues() vs jsonl/Parser. Shows native JSON.parse advantage for strict JSONL. |
| bench/assembler-flex.mjs | Assembler vs FlexAssembler (no rules) vs FlexAssembler (Map rules). Feeds pre-generated tokens via consume(), with no stream overhead. |
All benchmarks generate synthetic data on the fly (~50–100 KB of mixed-type objects) to isolate component performance from I/O.