A Python library to create a flat JSON index of files and directories.
A simple example with the following directory structure:
├── dir_1
│   └── file_1.txt
└── file_2.txt
It generates the following flat array of dicts as a JSON file:
[
  {
    "name": "dir_1",
    "type": "directory",
    "path": "/"
  },
  {
    "name": "file_1.txt",
    "type": "file",
    "path": "/dir_1/"
  },
  {
    "name": "file_2.txt",
    "type": "file",
    "path": "/"
  }
]
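To make the flat layout concrete, here is a minimal sketch that produces the same three fields for a directory tree. It is not flatdir's implementation: the function name `flat_index` and the use of `Path.rglob` are assumptions made for illustration only.

```python
from pathlib import Path


def flat_index(root: Path) -> list[dict]:
    """Return one dict per file or directory found under root (hypothetical helper)."""
    entries = []
    # rglob("*") walks the tree recursively; sorting keeps the output stable.
    for entry in sorted(root.rglob("*")):
        parent = entry.parent.relative_to(root).as_posix()
        entries.append({
            "name": entry.name,
            "type": "directory" if entry.is_dir() else "file",
            # Root-level entries get "/", nested ones get e.g. "/dir_1/".
            "path": "/" if parent == "." else f"/{parent}/",
        })
    return entries
```

Passing the result to `json.dumps(..., indent=2)` would give an array shaped like the one above.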
pip install -e .
Or from PyPI:
pip install flatdir
More examples are available at the bottom of this README.md file.
python -m flatdir .
Returns a JSON file, by default with some metadata for each entry in the current directory and its subdirectories.
[
  {
    "name": ".DS_Store",
    "path": ".",
    "type": "file",
    "mtime": "Mon, 23 Feb 2026 13:12:54 GMT",
    "size": 6148
  },
  ...
]
Options are provided to customize the output, for instance to remove the default fields, display only files, discard system files, etc.
Use --help to display all available options.
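The default output is plain JSON, so it can be consumed with the standard library alone. The sketch below parses an entry copied from the sample above and converts its `mtime` string, which uses the HTTP date format, into a `datetime`. The helper name `load_entries` is hypothetical and not part of flatdir.

```python
import json
from email.utils import parsedate_to_datetime

# One entry copied from the sample output above.
SAMPLE = """
[
  {
    "name": ".DS_Store",
    "path": ".",
    "type": "file",
    "mtime": "Mon, 23 Feb 2026 13:12:54 GMT",
    "size": 6148
  }
]
"""


def load_entries(text: str) -> list[dict]:
    """Parse flatdir-style JSON and convert each mtime to a datetime."""
    entries = json.loads(text)
    for entry in entries:
        if "mtime" in entry:
            # HTTP-date strings such as "Mon, 23 Feb 2026 13:12:54 GMT".
            entry["mtime"] = parsedate_to_datetime(entry["mtime"])
    return entries


entries = load_entries(SAMPLE)
print(entries[0]["name"], entries[0]["mtime"].isoformat())
```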
--limit N to limit the number of entries processed:
python -m flatdir . --limit 10
--depth N to limit the depth of the directory tree:
python -m flatdir . --depth 2
--output FILE to write the result to a file:
python -m flatdir . --output flat.json
--fields FILE to add custom fields via a plugin file:
python -m flatdir . --fields my_fields.py
The plugin file is a Python file where each public function becomes a JSON field.
Each function receives the entry Path and the root directory path.
Return None to omit the field from the output:
# my_fields.py
from pathlib import Path


def ext(path: Path, root: Path) -> str:
    # File extension including the leading dot, e.g. ".md".
    return path.suffix


def line_count(path: Path, root: Path) -> int | None:
    # Directories have no lines; returning None omits the field.
    if path.is_dir():
        return None
    return len(path.read_text().splitlines())

Output (both files and directories are listed):
[
  {
    "name": "docs",
    "path": ".",
    "type": "directory",
    "mtime": "Mon, 23 Feb 2026 13:12:54 GMT"
  },
  {
    "name": "README.md",
    "path": ".",
    "type": "file",
    "mtime": "Mon, 23 Feb 2026 13:12:54 GMT",
    "size": 835,
    "ext": ".md",
    "line_count": 54
  }
]
The default fields (name, path, type, mtime, size) are themselves plugins defined in src/flatdir/plugins/defaults.py. Additional examples are in src/flatdir/plugins/.
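As an illustration of the plugin idea (each public function becomes a field, and None omits it), here is a minimal sketch of how such a file could be loaded and applied. It is an assumption about one possible mechanism, not flatdir's actual loader; `load_field_functions` and `apply_fields` are hypothetical names.

```python
import importlib.util
import inspect
from pathlib import Path


def load_field_functions(plugin_file: Path) -> dict:
    """Import plugin_file and return its public functions keyed by name."""
    spec = importlib.util.spec_from_file_location(plugin_file.stem, plugin_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return {
        name: fn
        for name, fn in inspect.getmembers(module, inspect.isfunction)
        # Skip private helpers and functions imported from other modules.
        if not name.startswith("_") and fn.__module__ == module.__name__
    }


def apply_fields(entry: dict, path: Path, root: Path, fields: dict) -> dict:
    """Add each field to entry, omitting fields whose function returns None."""
    for name, fn in fields.items():
        value = fn(path, root)
        if value is not None:
            entry[name] = value
    return entry
```

In this sketch the JSON key is simply the function name, which is why a function called `line_count` would yield a `line_count` field.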
All options can be combined:
python -m flatdir . --depth 0 --limit 10 --fields my_fields.py --output result.json
--exclude to exclude entries based on a field value:
python -m flatdir . --exclude type=directory
--only to include ONLY entries matching a field value (opposite of exclude):
python -m flatdir . --only type=file --only ext=.py
You can also pass arrays inline formatted as [value1,value2] or strictly valid JSON ["value1", "value2"]. This behaves identically to passing multiple arguments for the same field (OR logic):
python -m flatdir . --only 'name=["folder_A", "folder_B", "folder_C"]'
--match PATTERN to include ONLY entries whose name matches a regular expression:
python -m flatdir . --match "^ABC-\d{2}-\d{2}"
--sort FIELD to sort entries by a specific field:
python -m flatdir . --sort size
--desc to reverse the sort order (descending):
python -m flatdir . --sort size --desc
--full_path to include the absolute path of the entry:
python -m flatdir . --fields full_path.py
Use the pattern_PRE_YR1_YR2_LOW_UP.py plugin to parse names matching a specific pattern (e.g., ABC-19-20-aa-BB), splitting the name into pattern_prefix, pattern_year1, pattern_year2, pattern_lower, and pattern_upper fields:
python -m flatdir . --fields src/flatdir/plugins/pattern_PRE_YR1_YR2_LOW_UP.py
You can then filter on these extracted attributes using --only. For example, to include only directories that match the prefix PCP:
python -m flatdir . --fields src/flatdir/plugins/pattern_PRE_YR1_YR2_LOW_UP.py --only type=directory --only pattern_prefix=PCP
Similarly, use pattern_PRE_YR1_YR2.py to parse just the prefix and years (e.g., ABC-19-20) into pattern_prefix, pattern_year1, and pattern_year2:
python -m flatdir . --fields src/flatdir/plugins/pattern_PRE_YR1_YR2.py
To filter entries based on the pattern of their parent directory, use pattern_parent_PRE_YR1_YR2.py. This extracts parent_pattern_prefix, parent_pattern_year1, and parent_pattern_year2. You can combine it with the file pattern plugin to filter on both the parent's attributes and the file's own attributes at the same time:
python -m flatdir . \
--fields src/flatdir/plugins/pattern_parent_PRE_YR1_YR2.py \
--fields src/flatdir/plugins/pattern_PRE_YR1_YR2_LOW_UP.py \
--only parent_pattern_prefix=ABC \
--only pattern_lower=aa
To extract an integer sequence ID and the remainder of the name from items prefixed with numbers (like 00_intro, 01_setup), use pattern_sequence_id.py. This strips up to 10 leading zeros and returns a sortable sequence_id integer and a sequence_name string:
python -m flatdir . --fields src/flatdir/plugins/pattern_sequence_id.py --sort sequence_id
--parent to include the relative path to the entry's parent directory:
python -m flatdir . --fields parent.py
To get the MIME type of a file from its extension (e.g. image/png, application/json), use the mime.py plugin. It is fast because it only inspects the file name and never reads the file contents. Unknown extensions default to application/octet-stream, and the field is omitted for directories.
python -m flatdir . --fields src/flatdir/plugins/mime.py
To extract plain-text properties (text_lines, text_words, text_characters, and the text_is_blank boolean) from text-based file formats (.txt, .md, .json, .csv, etc.), use text.py. It uses an in-memory cache so that each file is read only once:
python -m flatdir . --fields src/flatdir/plugins/text.py
To extract extended file system properties such as UUIDs, ISO 8601 timestamps, UNIX permissions (chmod mode), file ownership, and size-limited SHA-256 hashes, use the extended.py plugin:
python -m flatdir . --fields src/flatdir/plugins/extended.py
--nested to format the output as a nested object keyed by directory name, mirroring the directory hierarchy:
python -m flatdir . --nested
--tree to convert the output into a tree array compatible with D3.js (following the flare-2.json hierarchy format), in which each parent has a children list of nested objects:
python -m flatdir . --tree
--add to inject static fields and values into every entry in the output:
python -m flatdir . --add is_checked=true --add custom_field=NA
--ignore-typical to skip common developer and system files and directories, so that large hidden trees are not traversed (which speeds up indexing considerably).
It skips: .git, node_modules, __pycache__, .venv, venv, .idea, .vscode, *.egg-info, .DS_Store, Thumbs.db, *.pyc.
python -m flatdir . --ignore-typical
--add-depth to restrict the --add fields to entries located at a given directory depth:
python -m flatdir . --add is_checked=true --add-depth 1
--id to include a unique identifier for each entry:
python -m flatdir . --id
--min-depth to exclude entries shallower than the given depth:
python -m flatdir . --min-depth 2
--dict-field KEY[=FILE] to extract the value of KEY from a JSON FILE located in each traversed directory and merge it into that directory's JSON object. If FILE is omitted, it defaults to a file named after the directory (<dirname>.json). Reads are cached, so fetching multiple keys from the same file is efficient.
python -m flatdir . --dict-field author=meta.json --dict-field version
--no-defaults to omit the default generated fields (name, path, type, size, mtime):
python -m flatdir . --no-defaults --add custom_field=NA
--id generates an auto-incrementing integer identifier that corresponds to each entry's position in the sorted output and is assigned after all entries have been evaluated.
python -m flatdir . --id
--with-headers to wrap the output in an envelope containing headers (execution stats) and entries (the actual payload):
python -m flatdir . --with-headers
- Generate a D3.js-compatible hierarchical JSON tree from the current directory and save it to a file, for example to build a treemap chart:
python -m flatdir . --tree > flatdir.json
- Watch a directory continuously with Meta's watchman tool: first run watchman watch, then use watchman-make to trigger the rebuild script whenever a file matching the pattern changes:
watchman-make -p '**/*' --run 'python -m flatdir . --output index.json'
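To show how a flat index relates to a D3-style hierarchy, the sketch below nests the flat entries from the first example into `name` / `children` nodes. This README does not show the exact output of --tree, so the node shape here is an assumption based on the general flare-style layout, and `build_tree` is a hypothetical helper, not flatdir code. It also assumes the "/"-style paths from the first example and that every parent directory appears in the input.

```python
def build_tree(entries: list[dict], root_name: str = "root") -> dict:
    """Nest flat entries into {"name": ..., "children": [...]} nodes."""
    root = {"name": root_name, "children": []}
    # Map each directory path (e.g. "/dir_1/") to its node.
    nodes = {"/": root}
    # Shallow paths first, so parents are registered before their children.
    for entry in sorted(entries, key=lambda e: e["path"].count("/")):
        node = {"name": entry["name"]}
        if entry["type"] == "directory":
            node["children"] = []
            nodes[f'{entry["path"]}{entry["name"]}/'] = node
        # Raises KeyError if the parent directory is missing from entries.
        nodes[entry["path"]]["children"].append(node)
    return root


flat = [
    {"name": "dir_1", "type": "directory", "path": "/"},
    {"name": "file_1.txt", "type": "file", "path": "/dir_1/"},
    {"name": "file_2.txt", "type": "file", "path": "/"},
]
tree = build_tree(flat)
```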
- jq - A command-line JSON processor that can be used to manipulate and query JSON data, including file metadata.
- Nginx Autoindex or Apache mod_autoindex - For serving static files with directory listing capabilities as JSON.
- gron - A command-line tool that transforms JSON into a flat, line-oriented format, making it easier to grep and manipulate with other command-line tools.
- dasel - A command-line tool for querying and manipulating data structures like JSON, YAML, and XML, to extract file metadata.
- jo - A command-line tool for creating JSON objects, to generate JSON metadata for files and directories.