add CLI + decision engine for incremental Kuzu rebuild (PR-T4) #273
Closed
HumanBean17 wants to merge 11 commits into
…R-T4)
- Add refresh_decision.py: decision engine that chooses incremental vs full rebuild for Lance and Kuzu based on change detection (git diff or explicit changed_paths). Full rebuild triggered on deletes, renames, config changes, pipeline changes, meta-annotation changes, missing/stale .deps.json, or >50% dirty set.
- Update _cmd_increment in CLI: remove stale Kuzu warning, dispatch to incremental or full Kuzu rebuild based on decision engine output. Respect lance_mode for Lance full reprocess.
- Add run_build_ast_graph_incremental to pipeline.py: writes changed paths to temp file, calls build_ast_graph.py --changed-paths.
- Create refresh_code_index MCP tool in server.py: executes Lance + Kuzu rebuild based on decision engine, returns decision transparency fields.
- Add 15 tests: 12 decision engine tests + 3 CLI integration tests.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
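For readers skimming the commit list, the full-rebuild triggers named in this commit can be sketched roughly as follows. This is an illustrative sketch only, not the code in refresh_decision.py: the names ChangeSet, RefreshDecision and choose_refresh_mode appear in this PR, but every field, the single combined mode (the PR decides Lance and Kuzu separately) and the reason strings are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ChangeSet:
    """Hypothetical summary of what changed since the last index build."""
    modified: Set[str] = field(default_factory=set)
    deleted: Set[str] = field(default_factory=set)
    renamed: Set[str] = field(default_factory=set)
    config_changed: bool = False
    pipeline_changed: bool = False
    meta_annotations_changed: bool = False
    deps_missing_or_stale: bool = False
    total_files: int = 0


@dataclass
class RefreshDecision:
    mode: str            # "incremental" or "full"
    reasons: List[str]   # why a full rebuild was chosen (empty if incremental)


DIRTY_RATIO_LIMIT = 0.5  # the ">50% dirty set" threshold from the commit message


def choose_refresh_mode(changes: ChangeSet) -> RefreshDecision:
    """Pick "full" if any trigger fires, otherwise "incremental"."""
    triggers = [
        (bool(changes.deleted), "deleted files"),
        (bool(changes.renamed), "renamed files"),
        (changes.config_changed, "config changed"),
        (changes.pipeline_changed, "pipeline changed"),
        (changes.meta_annotations_changed, "meta-annotations changed"),
        (changes.deps_missing_or_stale, ".deps.json missing or stale"),
    ]
    reasons = [reason for fired, reason in triggers if fired]
    # Dirty ratio is strictly greater than 50%, so exactly half stays incremental.
    if changes.total_files and len(changes.modified) / changes.total_files > DIRTY_RATIO_LIMIT:
        reasons.append("more than 50% of files are dirty")
    return RefreshDecision("full" if reasons else "incremental", reasons)
```

Collecting every fired trigger, rather than returning on the first one, is one way to get the "decision transparency" the commit mentions: the caller can report all reasons at once.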
- Make refresh_decision import lazy inside _cmd_increment to avoid ModuleNotFoundError when running as installed entrypoint
- Update test_registered_tool_surface_is_v2_navigation_only to include refresh_code_index in expected tool set
- Fix CLI test mock paths to patch refresh_decision.choose_refresh_mode instead of java_codebase_rag.cli.choose_refresh_mode
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ky tests Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The MCP server is a read-only query interface. Building/refreshing the index belongs in the CLI (java-codebase-rag increment/reprocess), not as an MCP tool. Remove the tool from server.py, fix test expectations, and update proposals and plan to remove all MCP integration sections. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Force-pushed from e99248a to bc358b1
… kuzu_graph for meta test Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Commit f512e87 changed CI from pinned requirements.txt to pyproject.toml ranges, pulling a newer cocoindex that eagerly downloads SentenceTransformer models during init. CI runners can't reach HuggingFace, causing these tests to fail with 401/OSError. Add _heavy_ok() guard that checks both cocoindex binary availability and RUN_HEAVY env var. CI sets RUN_HEAVY=0, so these integration tests are now properly skipped. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Previous fix only covered 2 of 11 search tests that call search_v2 without mocking the SentenceTransformer model loader. Adds the stub to all remaining tests in test_mcp_v2.py and test_mcp_v2_compose.py. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…e_counts The test calls _graph_meta_output() which uses the global KuzuGraph singleton. Other test files reset _instance to None without restoring it, so by the time this test runs the singleton may be gone, causing _graph_meta_output() to create a new (empty) instance. Fix: explicitly re-bind the singleton from the session kuzu_graph fixture before calling _graph_meta_output(). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…_type_counts _graph_meta_output() uses KuzuGraph.get() which resolves the DB path from env vars and may mismatch with the session fixture's tmp_path, causing it to open an empty DB. Call kuzu_graph.meta() directly instead. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
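The singleton problem behind the two fixes above can be reproduced in a few lines. GraphHandle here is a hypothetical stand-in, not KuzuGraph itself; it only mimics a class-level _instance with a lazy get(), which is what the commit messages describe.

```python
from typing import Optional


class GraphHandle:
    """Hypothetical stand-in for a process-wide graph singleton."""

    _instance: Optional["GraphHandle"] = None

    def __init__(self, node_count: int = 0) -> None:
        self.node_count = node_count

    @classmethod
    def get(cls) -> "GraphHandle":
        # Lazily create a new, empty instance when none is bound.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def meta(self) -> dict:
        return {"node_count": self.node_count}


def meta_via_singleton() -> dict:
    """Mimics a helper that always goes through the global singleton."""
    return GraphHandle.get().meta()


# A session fixture builds a populated graph and binds it as the singleton.
fixture_graph = GraphHandle(node_count=42)
GraphHandle._instance = fixture_graph

# Another test resets the singleton and never restores it.
GraphHandle._instance = None

stale = meta_via_singleton()    # silently creates a fresh, empty instance
direct = fixture_graph.meta()   # reads the fixture's graph directly
```

Calling meta() on the fixture object sidesteps the global state entirely, which is why the later commit no longer needs to re-bind the singleton.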
- Replace git diff change detection with hash-based diff against .deps.json (walk source tree, compare SHA-256 hashes)
- Remove RUN_HEAVY=0 from CI workflow so cocoindex integration tests run now that HF_TOKEN is configured
- Update plans/proposals to reflect the new detection strategy
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
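Hash-based change detection of the sort this commit describes can be sketched as below. The helper names hash_tree and diff_hashes are hypothetical, and the sketch assumes the recorded state is a flat mapping of relative path to SHA-256 hex digest; the actual layout of .deps.json is not shown in this PR.

```python
import hashlib
from pathlib import Path
from typing import Dict, Set, Tuple


def hash_tree(root: Path, suffix: str = ".java") -> Dict[str, str]:
    """Walk the source tree and map each relative path to its SHA-256 digest."""
    hashes: Dict[str, str] = {}
    for path in sorted(root.rglob(f"*{suffix}")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            hashes[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return hashes


def diff_hashes(
    old: Dict[str, str], new: Dict[str, str]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (added, modified, deleted) paths between two hash snapshots."""
    added = set(new) - set(old)
    deleted = set(old) - set(new)
    modified = {p for p in set(old) & set(new) if old[p] != new[p]}
    return added, modified, deleted
```

Compared with git diff, this approach works without a git checkout and also sees uncommitted edits, at the cost of reading every source file on each run.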
Summary
- refresh_decision.py: isolates incremental vs full rebuild logic for both Lance and Kuzu. Detects changes via git diff or explicit changed_paths. Full rebuild triggered on: deletes, renames, config changes, pipeline changes, meta-annotation changes, missing/stale .deps.json, >50% dirty set.
- java_codebase_rag/cli.py: removes stale Kuzu incremental warning from _cmd_increment. Dispatches to incremental or full Kuzu rebuild based on decision engine. Respects lance_mode for Lance full reprocess when config/pipeline changes.
- java_codebase_rag/pipeline.py: adds run_build_ast_graph_incremental, which writes changed paths to a temp file and calls build_ast_graph.py --changed-paths.
- server.py: creates refresh_code_index tool that executes Lance + Kuzu rebuild based on decision engine. Backward compatible (confirm=true only still works). Returns decision transparency fields.

Scope
Implements PR-T4 from plans/active/PLAN-TIER2-INCREMENTAL-REBUILD.md.
Companion proposal: propose/active/INDEX-AUTO-MODE-PROPOSE.md.
Depends on PR-T3 (incremental orchestrator, already merged).
Manual verification
.venv/bin/ruff check .
.venv/bin/python -m pytest tests/test_refresh_decision.py tests/test_cli_increment.py -v

Files changed
- refresh_decision.py: ChangeSet, RefreshDecision, _choose_refresh_mode
- java_codebase_rag/cli.py: remove _INCREMENT_WARNING_LINES + _emit_increment_kuzu_warning; update _cmd_increment to dispatch via decision engine
- java_codebase_rag/pipeline.py: run_build_ast_graph_incremental wrapper
- server.py: refresh_code_index MCP tool
- tests/test_refresh_decision.py
- tests/test_cli_increment.py

🤖 Generated with Claude Code