refactor: restructure CLAUDE.md for effective context usage

KRRT7 · KRRT7 · commit 0650973d8c6b · 2026-02-14T17:37:51.000-05:00
- Remove commands block from CLAUDE.md (standard tool usage that Claude already knows)
- Remove dead @AGENTS.md reference
- Add optimization pipeline overview with module pointers
- Add domain glossary (optimization candidate, addressable time, candidate
  forest, replay test, tracer, worktree mode)
- Extract mypy workflow to .claude/skills/fix-mypy.md (on-demand)
- Create .claude/skills/fix-prek.md for prek workflow (on-demand)
- Add key entry points table to architecture.md
- Create path-scoped rules: optimization-patterns.md, language-patterns.md
- Remove redundancy from source-code.md and across rules files
- Move "never use pip" convention to code-style.md
diff --git a/.claude/rules/architecture.md b/.claude/rules/architecture.md
@@ -26,3 +26,17 @@ codeflash/
 ├── result/                 # Result types and handling
 └── version.py              # Version information
 ```
+
+## Key Entry Points
+
+| Task | Start here |
+|------|------------|
+| CLI arguments & commands | `cli_cmds/cli.py` |
+| Optimization orchestration | `optimization/optimizer.py` → `run()` |
+| Per-function optimization | `optimization/function_optimizer.py` |
+| Function discovery | `discovery/functions_to_optimize.py` |
+| Context extraction | `context/code_context_extractor.py` |
+| Test execution | `verification/test_runner.py`, `verification/pytest_plugin.py` |
+| Performance ranking | `benchmarking/function_ranker.py` |
+| Domain types | `models/models.py`, `models/function_types.py` |
+| Result handling | `either.py` (`Result`, `Success`, `Failure`, `is_successful`) |
diff --git a/.claude/rules/code-style.md b/.claude/rules/code-style.md
@@ -2,6 +2,7 @@
 
 - **Line length**: 120 characters
 - **Python**: 3.9+ syntax
+- **Package management**: Always use `uv`, never `pip`
 - **Tooling**: Ruff for linting/formatting, mypy strict mode, prek for pre-commit checks
 - **Comments**: Minimal - only explain "why", not "what"
 - **Docstrings**: Do not add unless explicitly requested
diff --git a/.claude/rules/language-patterns.md b/.claude/rules/language-patterns.md
@@ -0,0 +1,12 @@
+---
+paths:
+  - "codeflash/languages/**/*.py"
+---
+
+# Language Support Patterns
+
+- Current language is a module-level singleton in `languages/current.py` — use `set_current_language()` / `current_language()`, never pass language as a parameter through call chains
+- Use `get_language_support(identifier)` from `languages/registry.py` to get a `LanguageSupport` instance — never import language classes directly
+- New language support classes must use the `@register_language` decorator to register with the extension and language registries
+- `languages/__init__.py` uses `__getattr__` for lazy imports to avoid circular dependencies — follow this pattern when adding new exports
+- `is_javascript()` returns `True` for both JavaScript and TypeScript
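Note on `language-patterns.md`: the rules above describe a module-level "current language" plus a decorator-driven registry. A minimal sketch of that shape follows, assuming a string identifier per language. The function names come from the rules; their signatures, the `LanguageSupport` base class, the two support classes, and the `"python"` default are assumptions for illustration, not the contents of `languages/current.py` or `languages/registry.py`.

```python
from typing import Dict, Optional, Type

DEFAULT_LANGUAGE = "python"  # assumption: the default when nothing has been set


class LanguageSupport:
    # Hypothetical base class; the real interface is not shown in this diff.
    identifier: str = ""


_registry: Dict[str, Type[LanguageSupport]] = {}
_current: Optional[str] = None  # module-level singleton state


def register_language(cls: Type[LanguageSupport]) -> Type[LanguageSupport]:
    # Class decorator: record the class under its identifier and return it unchanged.
    _registry[cls.identifier] = cls
    return cls


def get_language_support(identifier: str) -> LanguageSupport:
    # Callers look up by identifier instead of importing language classes directly.
    return _registry[identifier]()


def set_current_language(identifier: str) -> None:
    global _current
    if identifier not in _registry:
        raise KeyError(f"unregistered language: {identifier}")
    _current = identifier


def current_language() -> str:
    return _current if _current is not None else DEFAULT_LANGUAGE


def reset_current_language() -> None:
    global _current
    _current = None


@register_language
class PythonSupport(LanguageSupport):
    identifier = "python"


@register_language
class JavaScriptSupport(LanguageSupport):
    identifier = "javascript"
```

Because the current language lives in module state, callers deep in a call chain read it with `current_language()` rather than receiving it as a parameter, which is the trade-off the rule asks contributors to keep.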
diff --git a/.claude/rules/optimization-patterns.md b/.claude/rules/optimization-patterns.md
@@ -0,0 +1,17 @@
+---
+paths:
+  - "codeflash/optimization/**/*.py"
+  - "codeflash/verification/**/*.py"
+  - "codeflash/benchmarking/**/*.py"
+  - "codeflash/context/**/*.py"
+---
+
+# Optimization Pipeline Patterns
+
+- All major operations return `Result[SuccessType, ErrorType]` — construct with `Success(value)` / `Failure(error)`, check with `is_successful()` before calling `unwrap()`
+- Code context has token limits (`OPTIMIZATION_CONTEXT_TOKEN_LIMIT`, `TESTGEN_CONTEXT_TOKEN_LIMIT` in `config_consts.py`) — exceeding them rejects the function
+- `read_writable_code` can span multiple files; `read_only_context_code` is reference-only
+- Code is serialized as markdown code blocks: ` ```language:filepath\ncode\n``` ` (see `CodeStringsMarkdown`)
+- Candidates form a forest (DAG): refinements/repairs reference `parent_id` on previous candidates
+- Test generation and optimization run concurrently — coordinate through `CandidateEvaluationContext`
+- Generated tests are instrumented with `codeflash_capture.py` to record return values and traces
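Note on `optimization-patterns.md`: the first rule asks callers to check `is_successful()` before calling `unwrap()`. The sketch below illustrates that calling convention with stand-in types. It is not the code in `either.py`, whose real signatures may differ (for example, `is_successful` may be a method rather than a function), and `parse_positive` is a hypothetical operation.

```python
from dataclasses import dataclass
from typing import Generic, NoReturn, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Success(Generic[T]):
    value: T

    def unwrap(self) -> T:
        return self.value


@dataclass(frozen=True)
class Failure(Generic[E]):
    error: E

    def unwrap(self) -> NoReturn:
        # Unwrapping a failure is a programming error, so fail loudly.
        raise ValueError(f"unwrap() called on Failure: {self.error!r}")


# Stand-in alias: a result is either a Success or a Failure.
Result = Union[Success[T], Failure[E]]


def is_successful(result: Union[Success[T], Failure[E]]) -> bool:
    return isinstance(result, Success)


def parse_positive(text: str) -> Result[int, str]:
    # Hypothetical operation that reports errors as values instead of raising.
    try:
        number = int(text)
    except ValueError:
        return Failure(f"not an integer: {text!r}")
    if number <= 0:
        return Failure(f"not positive: {number}")
    return Success(number)


def double_or_zero(text: str) -> int:
    result = parse_positive(text)
    if not is_successful(result):
        return 0  # handle the failure branch before touching the value
    return result.unwrap() * 2
```

Checking first keeps the failure path explicit at each call site, which is the point of returning errors as values rather than raising.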
diff --git a/.claude/rules/source-code.md b/.claude/rules/source-code.md
@@ -6,6 +6,3 @@ paths:
 # Source Code Rules
 
 - Use `libcst` for code modification/transformation to preserve formatting. `ast` is acceptable for read-only analysis and parsing.
-- NEVER use leading underscores for function names (e.g., `_helper`). Python has no true private functions. Always use public names.
-- Any new feature or bug fix that can be tested automatically must have test cases.
-- If changes affect existing test expectations, update the tests accordingly. Tests must always pass after changes.
diff --git a/.claude/rules/testing.md b/.claude/rules/testing.md
@@ -13,3 +13,5 @@ paths:
 - Use `.as_posix()` when converting resolved paths to strings (normalizes to forward slashes).
 - Any new feature or bug fix that can be tested automatically must have test cases.
 - If changes affect existing test expectations, update the tests accordingly. Tests must always pass after changes.
+- The pytest plugin patches `time`, `random`, `uuid`, and `datetime` for deterministic test execution — never assume real randomness or real time in verification tests.
+- `conftest.py` uses an autouse fixture that calls `reset_current_language()` — tests always start with Python as the default language.
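Note on `testing.md`: the first added rule says verification tests run with `time`, `random`, `uuid`, and `datetime` patched. The sketch below shows the general effect using the standard library's `unittest.mock.patch` on two of those modules. It is an illustration only: the pytest plugin's actual patching mechanism is not shown in this diff, and the fixed values and `sample_workload` are hypothetical.

```python
import random
import time
from typing import Callable, Tuple, TypeVar
from unittest import mock

R = TypeVar("R")

FIXED_TIME = 1_700_000_000.0  # hypothetical frozen timestamp
FIXED_RANDOM = 0.5  # hypothetical frozen random value


def sample_workload() -> Tuple[float, float]:
    # Code under test that would normally differ on every run.
    return (time.time(), random.random())


def run_deterministically(func: Callable[[], R]) -> R:
    # Replace the module attributes for the duration of the call only.
    with mock.patch("time.time", return_value=FIXED_TIME), mock.patch(
        "random.random", return_value=FIXED_RANDOM
    ):
        return func()
```

Under patching like this, two runs of the same function return identical values, so a test asserting that timestamps advance or that random draws differ would fail.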
diff --git a/.claude/skills/fix-mypy.md b/.claude/skills/fix-mypy.md
@@ -0,0 +1,12 @@
+# Fix mypy errors
+
+When modifying code, fix any mypy type errors in the files you changed:
+
+```bash
+uv run mypy --non-interactive --config-file pyproject.toml <changed_files>
+```
+
+- Fix type annotation issues: missing return types, incorrect types, Optional/None unions, import errors for type hints
+- Do NOT add `# type: ignore` comments — always fix the root cause
+- Do NOT fix type errors that require logic changes, complex generic type rework, or anything that could change runtime behavior
+- Files in `mypy_allowlist.txt` are checked in CI — ensure they remain error-free
diff --git a/.claude/skills/fix-prek.md b/.claude/skills/fix-prek.md
@@ -0,0 +1,9 @@
+# Fix prek failures
+
+When prek (pre-commit) checks fail:
+
+1. Run `uv run prek run` to see failures (local, checks staged files)
+2. In CI, the equivalent is `uv run prek run --from-ref origin/main`
+3. prek runs ruff format, ruff check, and mypy on changed files
+4. Fix issues in order: formatting → lint → type errors
+5. Re-run `uv run prek run` to verify all checks pass
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -1,62 +1,35 @@
 # CLAUDE.md
 
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
-
 ## Project Overview
 
 CodeFlash is an AI-powered Python code optimizer that automatically improves code performance while maintaining correctness. It uses LLMs to generate optimization candidates, verifies correctness through test execution, and benchmarks performance improvements.
 
-## Common Commands
-
-```bash
-# Package management (NEVER use pip)
-uv sync                          # Install dependencies
-uv sync --group dev              # Install dev dependencies
-uv add <package>                 # Add a package
-
-# Running tests
-uv run pytest tests/             # Run all tests
-uv run pytest tests/test_foo.py  # Run specific test file
-uv run pytest tests/test_foo.py::test_bar -v  # Run single test
-
-# Type checking and linting
-uv run mypy codeflash/           # Type check
-uv run ruff check codeflash/     # Lint
-uv run ruff format codeflash/    # Format
-
-# Linting (run before committing, checks staged files)
-uv run prek run
-
-# Linting in CI (checks all files changed since main)
-uv run prek run --from-ref origin/main
+## Optimization Pipeline
 
-# Mypy type checking (run on changed files before committing)
-uv run mypy --non-interactive --config-file pyproject.toml <changed_files>
-
-# Running the CLI
-uv run codeflash --help
-uv run codeflash init            # Initialize in a project
-uv run codeflash --all           # Optimize entire codebase
+```
+Discovery → Ranking → Context Extraction → Test Gen + Optimization → Baseline → Candidate Evaluation → PR
 ```
 
-## Mypy Type Checking
+1. **Discovery** (`discovery/`): Find optimizable functions across the codebase
+2. **Ranking** (`benchmarking/function_ranker.py`): Rank functions by addressable time using trace data
+3. **Context** (`context/`): Extract code dependencies (read-writable code + read-only imports)
+4. **Optimization** (`optimization/`, `api/`): Generate candidates via AI service, run in parallel with test generation
+5. **Verification** (`verification/`): Run candidates against tests, compare outputs via custom pytest plugin
+6. **Benchmarking** (`benchmarking/`): Measure performance, select best candidate by speedup
+7. **Result** (`result/`, `github/`): Create PR with winning optimization
 
-When modifying code, fix any mypy type errors in the files you changed. Run mypy on changed files:
+## Domain Glossary
 
-```bash
-uv run mypy --non-interactive --config-file pyproject.toml <changed_files>
-```
-
-Rules:
-- Fix type annotation issues: missing return types, incorrect types, Optional/None unions, import errors for type hints
-- Do NOT add `# type: ignore` comments — always fix the root cause
-- Do NOT fix type errors that require logic changes, complex generic type rework, or anything that could change runtime behavior
-- Files in `mypy_allowlist.txt` are checked in CI — ensure they remain error-free
+- **Optimization candidate**: A generated code variant that might be faster (`OptimizedCandidate`)
+- **Function context**: All code needed for optimization — split into read-writable (modifiable) and read-only (reference)
+- **Addressable time**: Time a function spends that could be optimized (own time + callee time / call count)
+- **Candidate forest**: DAG of candidates where refinements/repairs build on previous candidates
+- **Replay test**: Test generated from recorded benchmark data to reproduce real workloads
+- **Tracer**: Profiling system that records function call trees and timings (`tracing/`, `tracer.py`)
+- **Worktree mode**: Git worktree-based parallel optimization (`--worktree` flag)
 
 <!-- Section below is auto-generated by `tessl install` - do not edit manually -->
 
 # Agent Rules <!-- tessl-managed -->
 
 @.tessl/RULES.md follow the [instructions](.tessl/RULES.md)
-
-@AGENTS.md
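Note on the glossary's "candidate forest": each refinement or repair points at the candidate it was derived from through `parent_id`, so walking those links recovers how a candidate was produced. A minimal sketch, assuming string ids and at most one parent per candidate. The `Candidate` class and `lineage` helper are hypothetical and are not the `OptimizedCandidate` model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Candidate:
    candidate_id: str
    parent_id: Optional[str] = None  # None marks a root (an initial candidate)


def lineage(candidates: Dict[str, Candidate], candidate_id: str) -> List[str]:
    # Follow parent links up to the root, then return the chain root-first.
    chain: List[str] = []
    seen = set()
    current: Optional[str] = candidate_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at {current!r}")
        seen.add(current)
        chain.append(current)
        current = candidates[current].parent_id
    return chain[::-1]


forest = {
    "a": Candidate("a"),
    "a-refined": Candidate("a-refined", parent_id="a"),
    "a-repaired": Candidate("a-repaired", parent_id="a-refined"),
    "b": Candidate("b"),
}
```

Here `lineage(forest, "a-repaired")` walks from the repair back to the initial candidate `"a"`, while `"b"` forms its own single-node tree.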