perf: fix N+1 queries, double embedding, topic health #38

Merged: pszymkowiak merged 1 commit into main from perf/audit-fixes on Mar 17, 2026
@pszymkowiak
Contributor

Summary

  • N+1 vector search: batch fetch memories with single IN (...) query instead of one get() per KNN result
  • Double embedding: compute embedding once in tool_store(), reuse for both store and dedup check (~200ms saved per store)
  • Topic health: consolidate 6 separate SQL queries into 1 aggregated query (6x fewer DB round-trips)
  • README: add LongMemEval (ICLR 2025) benchmark results — 100% retrieval on 500 questions
  • Benchmarks: add bench-longmemeval.py (LLM judge: claude/ollama) and bench-crossllm.sh (cross-LLM memory sharing)
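
A minimal sketch of the batch-fetch idea behind the N+1 fix, using only the Rust standard library. It is not the code from this PR: the `Memory` struct, the `memories` table and its `id`/`content` columns, and both helper functions are hypothetical names chosen for illustration. The point it shows is that one `IN (...)` statement replaces one lookup per KNN hit, and that the rows then need to be put back into KNN rank order because an `IN` query does not guarantee result order.

```rust
use std::collections::HashMap;

/// Hypothetical row type; the real struct is not shown in this PR.
#[derive(Debug, Clone, PartialEq)]
struct Memory {
    id: i64,
    content: String,
}

/// Build `SELECT ... WHERE id IN (?, ?, ...)` with one placeholder per id.
/// Returns `None` for an empty id list, because `IN ()` is not valid SQL.
/// Table and column names are assumptions made for this sketch.
fn batch_query(table: &str, n_ids: usize) -> Option<String> {
    if n_ids == 0 {
        return None;
    }
    let placeholders = vec!["?"; n_ids].join(", ");
    Some(format!(
        "SELECT id, content FROM {table} WHERE id IN ({placeholders})"
    ))
}

/// Re-sort fetched rows to match the ranking produced by the KNN search.
/// Ids that have no matching row are skipped.
fn restore_knn_order(knn_ids: &[i64], rows: Vec<Memory>) -> Vec<Memory> {
    let mut by_id: HashMap<i64, Memory> =
        rows.into_iter().map(|m| (m.id, m)).collect();
    knn_ids.iter().filter_map(|id| by_id.remove(id)).collect()
}
```

With KNN ids `[7, 3, 9]`, `batch_query("memories", 3)` yields a single statement with three placeholders, and `restore_knn_order` returns the rows as 7, 3, 9 regardless of the order the database returned them in.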

Test plan

  • 152 tests pass
  • clippy clean
  • cargo fmt clean

- search_by_embedding: batch fetch memories in single IN query instead of N+1 gets
- tool_store: compute embedding once and reuse for both store and dedup check (-200ms/store)
- topic_health: consolidate 6 separate queries into single aggregated SQL query
- README: add LongMemEval benchmark results, remove competitor comparison table
- Add LongMemEval benchmark script with LLM judge support (claude/ollama)
- Add cross-LLM benchmark script (Claude stores, Gemini recalls)
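
The compute-once change to `tool_store` can be illustrated with a small std-only sketch. Everything here is an assumption for illustration rather than the repository's API: `CountingEmbedder`, `Store`, `store_once`, the toy two-dimensional embedding, and the 0.999 similarity threshold are all hypothetical. What it demonstrates is the shape of the fix: the embedding is computed a single time and the same vector is handed to both the dedup check and the insert.

```rust
use std::cell::Cell;

/// Stand-in for an embedding model that counts how often it is invoked.
/// Hypothetical: the real embedder is not shown in this PR.
struct CountingEmbedder {
    calls: Cell<usize>,
}

impl CountingEmbedder {
    fn new() -> Self {
        Self { calls: Cell::new(0) }
    }

    /// Toy embedding (byte length, vowel count); only the call count matters.
    fn embed(&self, text: &str) -> Vec<f32> {
        self.calls.set(self.calls.get() + 1);
        let vowels = text.chars().filter(|c| "aeiouAEIOU".contains(*c)).count();
        vec![text.len() as f32, vowels as f32]
    }
}

/// Cosine similarity; returns 0.0 if either vector has zero norm.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Hypothetical store whose dedup check and insert both accept a
/// precomputed vector instead of embedding the text themselves.
struct Store {
    vectors: Vec<Vec<f32>>,
}

impl Store {
    fn is_duplicate(&self, embedding: &[f32], threshold: f32) -> bool {
        self.vectors.iter().any(|v| cosine(v, embedding) >= threshold)
    }

    fn insert(&mut self, embedding: Vec<f32>) {
        self.vectors.push(embedding);
    }
}

/// Embed once, then reuse the vector for dedup and storage.
/// Returns true if the text was stored, false if it was a duplicate.
fn store_once(embedder: &CountingEmbedder, store: &mut Store, text: &str) -> bool {
    let embedding = embedder.embed(text); // the only embed call per store
    if store.is_duplicate(&embedding, 0.999) {
        return false;
    }
    store.insert(embedding);
    true
}
```

In this sketch each call to `store_once` increments the embedder's counter exactly once, whether or not the text turns out to be a duplicate.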
@pszymkowiak merged commit 8acc3d9 into main on Mar 17, 2026
1 check passed
@pszymkowiak deleted the perf/audit-fixes branch on March 17, 2026, 14:36