docs(plans): execution plan for init/increment perf (PR-P1..PR-P3) #339
Merged
Adds plans/active/PLAN-INIT-INCREMENT-PERF.md and the companion plans/AGENT-PROMPTS-INIT-INCREMENT-PERF.md implementing the approved proposal propose/active/INIT-INCREMENT-PERF-PROPOSE.md.

Three PRs:
- PR-P1: bulk in-memory-pyarrow COPY FROM for the full rebuild path; equivalence harness is the merge gate.
- PR-P2: same primitive for the incremental path (Route-MERGE dedup retained).
- PR-P3: lifespan-cached LayeredIgnore (ContextKey) + is_ignored _mega memo.

No production code. Stacks behind proposal PR #338.

Co-Authored-By: Claude <noreply@anthropic.com>
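To make the PR-P3 idea concrete, here is a minimal sketch of memoizing an ignore decision by directory rather than by file. It is not the project's `LayeredIgnore`: the class `DirMemoIgnore`, its `patterns_by_dir` input, and the `fnmatch`-based matching are all hypothetical stand-ins chosen so the example runs on the standard library alone. The only point it illustrates is that the merged pattern set depends on the file's directory, so it can be built once per directory and reused for every file in it.

```python
import fnmatch
import posixpath


class DirMemoIgnore:
    """Hypothetical stand-in for an ignore checker with a per-directory memo."""

    def __init__(self, patterns_by_dir):
        # patterns_by_dir maps a directory ("" is the root) to its own patterns.
        self._patterns_by_dir = patterns_by_dir
        self._mega_cache = {}
        self.builds = 0  # counts how often the merged set is actually built

    def _mega(self, directory):
        """Merge patterns from `directory` and all its ancestors, once per directory."""
        cached = self._mega_cache.get(directory)
        if cached is None:
            self.builds += 1
            patterns = []
            current = directory
            while True:
                patterns.extend(self._patterns_by_dir.get(current, ()))
                if current in ("", "/"):
                    break
                current = posixpath.dirname(current)
            cached = tuple(patterns)
            self._mega_cache[directory] = cached
        return cached

    def is_ignored(self, path):
        # Only the directory part selects the merged set; the filename is matched against it.
        directory, name = posixpath.split(path)
        return any(fnmatch.fnmatch(name, pattern) for pattern in self._mega(directory))


ignore = DirMemoIgnore({"": ["*.pyc"], "src/gen": ["*_pb2.py"]})
results = [
    ignore.is_ignored("src/gen/a_pb2.py"),
    ignore.is_ignored("src/gen/b_pb2.py"),
    ignore.is_ignored("src/gen/main.py"),
    ignore.is_ignored("src/x.pyc"),
]
```

Four lookups touch only two directories, so the merged set is built twice, not four times. A per-file memo would not help here, because each file path is typically seen once per run.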
5-lens subagent review of the plan found the PR-P1/P2 boundary was architecturally wrong: the graph write helpers are SHARED between the full and incremental paths, so a "full-path-only" split is impossible.

- Verified call graph: _write_edges/_write_routes_and_exposes/_write_nodes_impl/_write_meta are each called by BOTH paths; _write_clients_producers_and_calls is incremental-only (global pass5/6).
- Re-split by write-FUNCTION: PR-P1 = _bulk_copy + _write_edges (the ~250s prize, accelerates both paths); PR-P2 = _write_nodes_impl + _write_routes_and_exposes + _write_clients_producers_and_calls; PR-P3 = ignore cache (independent).
- GraphMeta (_write_meta) left on MERGE (shared, one row); this reverses Open Q1.
- Fixed all binding sentinel greps: PR-P1 zeros the edge _CREATE_* only; PR-P2 zeros node/route/client constants + _MERGE_SYMBOL only after both routes functions convert; PR-P3 sentinel narrowed to LayeredIgnore(project_root).is_ignored (the bare-constructor grep wrongly matched once-per-run sites :177/:569, which are correctly left alone).
- Load-order §1f corrected (UnresolvedCallSite before UNRESOLVED_AT; Route/Client/Producer before their edges).
- Test files qualified (test_brownfield_routes / test_mcp_v2_compose / test_vectors_progress / test_path_filtering).
- PR-P2 tests placed in TestIncrementalOrchestrator.
- Baseline flagged as equivalence anchor, not production invariant.
- PR-P1 DoD lists the four test names.

Co-Authored-By: Claude <noreply@anthropic.com>
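The PR-P3 sentinel fix above can be illustrated with a small sketch of why the narrowed pattern matters. The two source snippets are invented for illustration (they are not the project's code), and `count_matches` is a hypothetical helper that mimics a fixed-string grep count. The broad pattern cannot reach zero while a legitimate once-per-run constructor call remains, whereas the narrow pattern counts only the per-call sites the PR is meant to remove.

```python
def count_matches(pattern: str, source: str) -> int:
    """Count literal, non-overlapping occurrences, like a fixed-string grep count."""
    return source.count(pattern)


# Invented example sources; the real call sites live in the project's code.
SAMPLE_BEFORE = """
ignore = LayeredIgnore(project_root)  # once-per-run site, left alone
def keep(path):
    return not LayeredIgnore(project_root).is_ignored(path)  # per-call site
"""

SAMPLE_AFTER = """
ignore = LayeredIgnore(project_root)  # once-per-run site, left alone
def keep(path, cached_ignore):
    return not cached_ignore.is_ignored(path)
"""

BROAD = "LayeredIgnore("
NARROW = "LayeredIgnore(project_root).is_ignored"

broad_counts = (count_matches(BROAD, SAMPLE_BEFORE), count_matches(BROAD, SAMPLE_AFTER))
narrow_counts = (count_matches(NARROW, SAMPLE_BEFORE), count_matches(NARROW, SAMPLE_AFTER))
```

Here the broad count goes from 2 to 1 and never reaches zero, so a "must be zero" gate on it would fail a correct change. The narrow count goes from 1 to 0, which is the property a binding sentinel needs.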
What
Adds the execution plan for the approved proposal `propose/active/INIT-INCREMENT-PERF-PROPOSE.md` (PR #338):

- `plans/active/PLAN-INIT-INCREMENT-PERF.md`: 3-PR delivery split.
- `plans/AGENT-PROMPTS-INIT-INCREMENT-PERF.md`: self-contained per-PR agent prompts.

No production code changed.
Why now
The proposal was reviewed (5-lens subagent review on #338) and is stable. This splits it into execution-ready PRs with binding contracts before any implementation starts.
Highlights
Three PRs, grounded in verified code:

- PR-P1: bulk `COPY FROM` for the full rebuild path (`build_ast_graph.py`, `write_ladybug`). Adds a `_bulk_copy(conn, table, columns, rows)` helper using the verified ladybug syntax `conn.execute("COPY <table> FROM $rows", {"rows": pa_table})`. Staging invariants are spelled out: REL tables stage FROM/TO node ids first; CALLS dedup (`seen_calls`) and `callee_declaring_role` are materialized at staging; node tables are loaded before REL tables. Merge gate: a mandatory equivalence harness (`test_bulk_write_graph_matches_per_row_baseline` + determinism + sampled-edge rows). Open Q1 resolved (GraphMeta folded in). No ontology bump; re-index-free.
- PR-P2: the same primitive for the incremental path; `MERGE (r:Route)` dedup retained by name. Depends on PR-P1.
- PR-P3: lifespan-cached `LayeredIgnore` in a cocoindex `ContextKey` (built once per flow run), plus a memo of `is_ignored`'s `_mega` by directory (mega depends on the file's directory, not the filename; verified against `path_filtering.py`). Independent of PR-P1/P2.

The `AGENT-PROMPTS` file gives each PR a self-contained block (branch/base, `@`-files, scope, out-of-scope guardrails, pytest commands, and binding sentinel greps, e.g. `_CREATE_SYMBOL` must be gone after PR-P1, `MERGE (r:Route)` must remain, and `_MERGE_SYMBOL` must be gone after PR-P2), manual-evidence commands, and exact PR titles.

Landing order: PR-P1 → PR-P2; PR-P3 independent.
Tests
Docs-only; baseline unchanged.
Out of scope
- […] `AGENT-PROMPTS` file).
- `watch` mode (feat: `watch` mode — keep the index live as files change #336).

References

- `propose/active/INIT-INCREMENT-PERF-PROPOSE.md` (PR #338, docs(propose): init/increment perf — bulk graph writes + cached ignore): this PR stacks behind #338; merge #338 first.
- #336 (`watch` mode — keep the index live as files change), #337 (perf: add ANN vector index if/when flat-scan query latency becomes a bottleneck; parked).