Scaling-improvements by PythonFZ · Pull Request #60 · zincware/laufband

PythonFZ · 2025-10-14T19:35:32Z

This commit implements fingerprint-based cache invalidation, allowing tasks
to be automatically re-executed when their fingerprints change. This provides
a mechanism for upstream change detection without requiring manual intervention.

Key Features

Opt-in fingerprint tracking: Controlled by enable_fingerprints flag
or LAUFBAND_ENABLE_FINGERPRINTS environment variable (default: disabled)
INVALIDATED status: New task status indicating a completed task needs
re-execution due to fingerprint changes
Complete audit trail: All fingerprint changes are recorded in the
TaskStatusEntry for full history tracking

Schema Changes

Task.fingerprint: Optional str field for content-based hash
TaskEntry.last_fingerprint: Tracks the last completed fingerprint
TaskStatusEntry.fingerprint: Audit trail of all fingerprints
TaskStatusEnum.INVALIDATED: New status for invalidated tasks

Implementation Details

Fingerprint Invalidation Logic (graphband.py:574-611)

When a completed task is encountered:

Check if fingerprints are enabled and task has a fingerprint
Compare current fingerprint with last_fingerprint
If different, mark task as INVALIDATED and allow re-execution
If same, skip task as before
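
The four steps above can be sketched as a small pure function. This is an illustration of the described decision, not the code in graphband.py: `TaskStatus` and `should_rerun` are hypothetical names, and treating a missing `last_fingerprint` as a change is an assumption.

```python
from enum import Enum
from typing import Optional


class TaskStatus(Enum):
    # Hypothetical stand-in for TaskStatusEnum; only the members needed here.
    RUNNING = "running"
    COMPLETED = "completed"
    INVALIDATED = "invalidated"


def should_rerun(
    status: TaskStatus,
    current_fingerprint: Optional[str],
    last_fingerprint: Optional[str],
    enable_fingerprints: bool,
) -> bool:
    """Return True if a completed task should be invalidated and re-executed."""
    if status is not TaskStatus.COMPLETED:
        return False  # only completed tasks are candidates for invalidation
    if not enable_fingerprints or current_fingerprint is None:
        return False  # feature off, or no fingerprint on the task: skip as before
    # Assumption: a completed task with no stored fingerprint counts as changed.
    return current_fingerprint != last_fingerprint


unchanged = should_rerun(TaskStatus.COMPLETED, "abc", "abc", True)
changed = should_rerun(TaskStatus.COMPLETED, "def", "abc", True)
print(unchanged, changed)
```

Keeping the decision free of database access makes each branch (disabled, no fingerprint, unchanged, changed) easy to test on its own.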

Property Updates (db.py)

TaskEntry.completed: Returns False for INVALIDATED status
TaskEntry.worker_availability: Returns True for INVALIDATED status
to allow worker assignment for re-execution

Environment Variable Evaluation

Fixed evaluation of enable_fingerprints parameter to check environment
variable at runtime instead of module load time, allowing pytest monkeypatch
to work correctly.
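
The difference between import-time and call-time evaluation can be shown with a minimal sketch. The helper names and the set of accepted truthy strings are assumptions for illustration; only the variable name LAUFBAND_ENABLE_FINGERPRINTS comes from this PR.

```python
import os
from typing import Optional

ENV_VAR = "LAUFBAND_ENABLE_FINGERPRINTS"


def _env_flag(name: str) -> bool:
    # Assumption: these strings count as "enabled"; the real parsing may differ.
    return os.environ.get(name, "").strip().lower() in {"1", "true", "yes", "on"}


# Evaluated once at import time: later changes to os.environ
# (for example via pytest's monkeypatch.setenv) are never seen.
_IMPORT_TIME_DEFAULT = _env_flag(ENV_VAR)


def resolve_enable_fingerprints(enable_fingerprints: Optional[bool] = None) -> bool:
    """An explicit argument wins; otherwise read the environment at call time."""
    if enable_fingerprints is not None:
        return enable_fingerprints
    return _env_flag(ENV_VAR)
```

With the call-time variant, a test can set the variable through `monkeypatch.setenv` after the module has been imported and still observe the new value.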

Test Coverage

Unit Tests (12 tests, tests/unit/test_fingerprint_invalidation.py):

Schema validation for new fingerprint fields
INVALIDATED status behavior
completed and worker_availability property logic
Audit trail functionality
Backward compatibility (tasks without fingerprints)

Integration Tests (8 tests, tests/integration/test_fingerprint_integration.py):

Default disabled behavior
Fingerprint invalidation end-to-end
Tasks without fingerprints
Mixed mode (some tasks with/without fingerprints)
Multiple fingerprint changes
Complete audit trail verification
Environment variable configuration
Dependency handling with fingerprints

Test Results

All existing tests pass (96 total tests)
New tests: 20 (12 unit + 8 integration)
Overall test coverage: 95%
No regressions detected

Backward Compatibility

Fingerprints are opt-in (disabled by default)
Existing workflows without fingerprints work unchanged
Schema changes are nullable for backward compatibility

🤖 Generated with Claude Code

Co-Authored-By: Claude noreply@anthropic.com

Summary by CodeRabbit

New Features
- Optional fingerprint-based change detection to auto-invalidate and re-run tasks; configurable via a flag or environment variable.
- Tasks can include an optional fingerprint, enabling skip of unchanged work and re-execution on changes.
Refactor
- Performance improvements: cached “has more jobs” checks, batched dependency lookups, database indexing, and status history pruning.
Documentation
- Added a comprehensive scaling analysis blueprint and updated guidance.
Tests
- New integration and unit tests for fingerprint workflows, caching behavior, and database-related functionality.

This commit implements all Phase 1 recommendations from SCALING_ANALYSIS.md:

✅ 1.1 Database Indexes
- Add idx_workflow_status on workers (workflow_id, status)
- Add idx_task_status on task_statuses (task_id, status)
- Add idx_worker_status on task_statuses (worker_id, status)
- Add idx_timestamp on task_statuses (timestamp)
- Add idx_requirements on tasks (requirements)
- Expected impact: 5-100x faster queries on critical paths

✅ 1.2 Batch Dependency Queries
- Replace O(D) sequential queries with single batched query
- Use TaskEntry.id.in_(dependencies) for efficient lookup
- Create dep_map dictionary for O(1) dependency checks
- Expected impact: 10x faster for tasks with many dependencies

✅ 1.3 Prune Status History (method added)
- Add prune_status_history() method to TaskEntry
- Keeps N most recent status entries (default 10)
- Note: Integration into workflow lifecycle deferred to later phase
- Expected impact: 50-90% database size reduction (when integrated)

✅ 1.4 Cache has_more_jobs
- Implement 5-second TTL cache for has_more_jobs property
- Cache invalidated on new iteration
- Expected impact: 90% reduction in has_more_jobs queries

Test Coverage:
- Unit tests for all database indexes
- Unit tests for batched dependency queries (all edge cases)
- Unit tests for status history pruning
- Unit tests for has_more_jobs caching

Documentation:
- Complete SCALING_ANALYSIS.md with detailed performance analysis
- Documents Phases 1-5 with implementation plans
- SLURM-specific recommendations for NFS environments

Performance Expectations:
- Small graphs (100 tasks): 1s → 0.3s per task
- Medium graphs (500 tasks): 5s → 1s per task
- Large graphs (1000 tasks): 20s → 3s per task

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
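
Item 1.2 of this commit message replaces one query per dependency with a single batched lookup. A minimal sketch of that pattern follows, using the standard-library sqlite3 module instead of the SQLAlchemy `TaskEntry.id.in_(...)` call the commit names. The table layout, the status strings, and the `dependency_status` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified table; the real schema lives in laufband/db.py.
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?)",
    [("a", "completed"), ("b", "running"), ("c", "completed")],
)


def dependency_status(conn, dependencies):
    """Fetch the status of all dependencies in one query; return {id: status}."""
    if not dependencies:
        return {}  # avoid emitting an invalid "IN ()" clause
    placeholders = ", ".join("?" for _ in dependencies)
    rows = conn.execute(
        f"SELECT id, status FROM tasks WHERE id IN ({placeholders})",
        list(dependencies),
    )
    return dict(rows)  # each row is an (id, status) pair


def dependencies_satisfied(conn, dependencies):
    """True only if every dependency exists and is completed."""
    dep_map = dependency_status(conn, dependencies)
    return all(dep_map.get(dep) == "completed" for dep in dependencies)


print(dependencies_satisfied(conn, ["a", "c"]))
print(dependencies_satisfied(conn, ["a", "b", "missing"]))
```

Dependencies that are absent from the table simply do not appear in the returned dictionary, so the `dep_map.get(...)` check treats them as unsatisfied.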



for more information, see https://pre-commit.ci

coderabbitai · 2025-10-14T19:35:57Z

Walkthrough

Adds fingerprint-based invalidation across Graphband and DB models, including a new INVALIDATED status, fingerprint fields, caching for has_more_jobs, and dependency batching. Introduces documentation updates and a new scaling analysis. Adds integration and unit tests for fingerprints, indexes, pruning, and performance caching.

Changes

Cohort / File(s)	Summary
Documentation `AGENTS.md`, `SCALING_ANALYSIS.md`	Adds a guidance line in AGENTS.md; introduces a comprehensive scaling analysis and phased improvement plan.
Database models & enums `laufband/db.py`	Adds TaskStatusEnum.INVALIDATED; new indexes on workers, task_statuses, tasks; adds TaskStatusEntry.fingerprint and TaskEntry.last_fingerprint; implements TaskEntry.prune_status_history(...); updates completed and worker_availability to handle INVALIDATED.
Graph engine (fingerprints, caching, batching) `laufband/graphband.py`	Adds enable_fingerprints parameter and env toggle; implements fingerprint-aware invalidation and audit writes; 5s TTL cache for has_more_jobs with reset on iteration; batches dependency lookups; expanded logging and error paths.
Task API `laufband/task.py`	Adds optional Task.fingerprint field with docstring.
Integration tests (fingerprints) `tests/integration/test_fingerprint_integration.py`	E2E tests for fingerprint enable/disable, mixed presence, dependency invalidation, env toggling, and audit trail (RUNNING/COMPLETED/INVALIDATED).
Unit tests: DB indexes & pruning `tests/unit/test_db_models.py`	Verifies DB indexes exist; tests prune_status_history behavior and preservation of current status.
Unit tests: fingerprint invalidation `tests/unit/test_fingerprint_invalidation.py`	Validates INVALIDATED status, Task/DB fingerprint fields, completion semantics, multi-worker interactions, audit trail across statuses, and backward-compatible null fingerprints.
Unit tests: performance & batching `tests/unit/test_graphband_performance.py`	Tests has_more_jobs cache TTL behavior and batched dependency query handling for various dependency states.
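
The pruning behavior summarized in the table (keep the N most recent status entries, default 10, so the current status is preserved) can be sketched on plain Python objects. `StatusEntry` and this standalone `prune_status_history` function are hypothetical stand-ins for the SQLAlchemy model and method, which are not shown in this thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StatusEntry:
    # Hypothetical, reduced version of a TaskStatusEntry row.
    timestamp: float
    status: str


def prune_status_history(history: List[StatusEntry], keep: int = 10) -> List[StatusEntry]:
    """Return the `keep` most recent entries, oldest first."""
    if keep < 1:
        raise ValueError("keep must be at least 1 so the current status survives")
    ordered = sorted(history, key=lambda entry: entry.timestamp)
    return ordered[-keep:]


# 14 older entries plus the current one, mirroring the shape of the unit test.
history = [StatusEntry(float(i), "running") for i in range(14)]
history.append(StatusEntry(14.0, "completed"))
pruned = prune_status_history(history)
print(len(pruned), pruned[-1].status)
```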

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as User/Worker
  participant G as Graphband
  participant DB as Database
  participant T as Task

  Note over G: Iteration start → invalidate has_more_jobs cache

  U->>G: request next job
  G->>DB: fetch candidate tasks + deps (batched)
  DB-->>G: tasks, deps status
  alt task is COMPLETED
    opt fingerprints enabled
      G->>DB: read TaskEntry.last_fingerprint
      G->>T: compute/read Task.fingerprint
      alt fingerprint changed
        G->>DB: append status INVALIDATED (with fingerprint)
        DB-->>G: status updated
        G-->>U: schedule task (re-execution)
      else fingerprint unchanged
        G-->>U: skip task
      end
    end
  else task not completed and deps satisfied
    G->>DB: mark RUNNING
    U->>G: complete task (optional fingerprint)
    G->>DB: append COMPLETED (store fingerprint), update last_fingerprint
    G-->>U: ack
  end
  Note over G: Update has_more_jobs cache (TTL 5s)

sequenceDiagram
  autonumber
  participant G as Graphband
  participant C as Cache (TTL 5s)
  participant DB as Database

  G->>C: has_more_jobs?
  alt Cache valid
    C-->>G: cached result
  else Cache expired/miss
    G->>DB: query remaining schedulable tasks
    DB-->>G: result
    G->>C: store result with timestamp
    C-->>G: ack
  end
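
The cache flow in the second diagram can be sketched as a small class. This is not the Graphband implementation: `TTLCachedFlag`, its injectable clock, and the `invalidate` method are hypothetical names chosen for the sketch. Passing the TTL and the clock as parameters also avoids the hard-coded 5 in tests.

```python
import time
from typing import Callable, Optional


class TTLCachedFlag:
    """Cache a boolean result for `ttl` seconds."""

    def __init__(
        self,
        compute: Callable[[], bool],
        ttl: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._compute = compute
        self._ttl = ttl
        self._clock = clock
        self._value: Optional[bool] = None
        self._timestamp: Optional[float] = None

    def invalidate(self) -> None:
        """Drop the cached value, e.g. when a new iteration starts."""
        self._timestamp = None

    def get(self) -> bool:
        now = self._clock()
        if self._timestamp is not None and (now - self._timestamp) < self._ttl:
            return self._value  # cache still valid
        self._value = self._compute()  # expired or never filled: query again
        self._timestamp = now
        return self._value


# Usage with a fake clock so no real waiting is needed.
calls = []
fake_now = [0.0]


def query_database() -> bool:
    calls.append(1)  # count how often the expensive query runs
    return True


flag = TTLCachedFlag(query_database, ttl=5.0, clock=lambda: fake_now[0])
flag.get()
flag.get()          # within the TTL: served from cache
fake_now[0] = 6.0
flag.get()          # TTL elapsed: queries again
print(len(calls))
```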

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

seperate db und user lock #53 — Touches the same core modules (graphband initializer, db enums/models), indicating overlapping edits.
add laufband.Graphband #27 — Earlier changes to Graphband/DB that this PR extends with fingerprint invalidation and related fields.

Poem

A whisker twitches, I hop through the DAG,
Sniffing fingerprints where old results lag.
INVALIDATED! I stamp with a thump—
Cache wakes, deps batch, we skip or we jump.
Five-second naps for the queue I patrol—
Carrot logs written, onward I roll! 🥕🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 60.66% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.
Title Check	❓ Inconclusive	The title “Scaling-improvements” is overly generic and does not clearly summarize the primary change of opt-in fingerprint-based cache invalidation and the introduction of an INVALIDATED status, making it difficult for a teammate to grasp the most important update at a glance.	Consider renaming the pull request to explicitly reference the key feature (for example, “Add opt-in fingerprint-based invalidation and INVALIDATED status for task re-execution”) so that the main change is immediately clear to reviewers.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.


codecov-commenter · 2025-10-14T19:37:20Z

Codecov Report

❌ Patch coverage is 94.78992% with 31 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.59%. Comparing base (f76706d) to head (f680184).

Files with missing lines	Patch %	Lines
tests/unit/test_graphband_performance.py	88.20%	21 Missing ⚠️
laufband/graphband.py	85.18%	8 Missing ⚠️
tests/integration/test_fingerprint_integration.py	98.91%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #60      +/-   ##
==========================================
+ Coverage   92.98%   93.59%   +0.61%     
==========================================
  Files          17       20       +3     
  Lines        1626     2203     +577     
==========================================
+ Hits         1512     2062     +550     
- Misses        114      141      +27


coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

laufband/graphband.py (1)

526-538: “stop” failure_policy now conflicts with INVALIDATED.

The check treats any status not in {RUNNING, COMPLETED} as failure. With the new INVALIDATED state, this will wrongly raise for healthy workflows using fingerprints.

Include INVALIDATED in the allowed set:

-                            if task_entry.current_status.status not in [
-                                TaskStatusEnum.RUNNING,
-                                TaskStatusEnum.COMPLETED,
-                            ]:
+                            if task_entry.current_status.status not in [
+                                TaskStatusEnum.RUNNING,
+                                TaskStatusEnum.COMPLETED,
+                                TaskStatusEnum.INVALIDATED,  # re-executable state, not a failure
+                            ]:

Optionally, allow KILLED if within retry budget, depending on intended semantics.

🧹 Nitpick comments (12)

AGENTS.md (1)

13-13: Rephrase to reduce absolute prohibition and avoid confusion with current schema changes.

Consider softening “You MUST NOT use a Migration Strategy!” to guidance that fits this repo’s dev/test focus without blocking future needs.

Suggested tweak:

“Prioritize greenfield changes in this repo; migrations/back-compat are out-of-scope unless explicitly required.”

tests/unit/test_graphband_performance.py (1)

47-51: Avoid hard-coding TTL (test resilience).

Expose/capture TTL via a constant or parameter so tests won’t drift if the TTL changes in Graphband.

Example:

-            if self._cache_timestamp is not None and (now - self._cache_timestamp) < 5:
+            TTL = 5
+            if self._cache_timestamp is not None and (now - self._cache_timestamp) < TTL:
                 return self._has_more_jobs_cache

tests/unit/test_db_models.py (1)

306-316: Rename unused loop variable to underscore (lint).

Minor cleanup to satisfy linters without behavior change.

-    for i in range(14):
+    for _ in range(14):
         mock_time.advance(1)
         db_session.add(

tests/integration/test_fingerprint_integration.py (1)

187-191: Rename unused loop variable in no-op loops.

Use “_” to avoid B007 warnings.

-    for task in worker:
+    for _ in worker:
         pass

Also applies to: 209-213

tests/unit/test_fingerprint_invalidation.py (1)

231-240: Drop unused fixture parameter.

db_session isn’t used directly; task_factory already depends on it.

-@pytest.mark.unit
-def test_fingerprint_field_nullable(db_session, task_factory):
+@pytest.mark.unit
+def test_fingerprint_field_nullable(task_factory):

SCALING_ANALYSIS.md (2)

29-35: Add languages to fenced code blocks (markdownlint MD040).

Several code blocks miss a language spec. Add identifiers like python, bash, text for proper rendering and linting.

Example fix:

-```
+```text
 workflo...

936-940: SQLAlchemy 2.x PRAGMA example likely incorrect API.

Prefer conn.exec_driver_sql("PRAGMA journal_mode=WAL") (or text(...)) over conn.execute("PRAGMA ...") to match SQLAlchemy 2.x semantics.

Suggested snippet:

```python
with engine.connect() as conn:
    conn.exec_driver_sql("PRAGMA journal_mode=WAL")
```

laufband/db.py (2)

169-170: Index on JSON ‘requirements’.

Be aware SQLite’s JSON is TEXT under the hood; index exists but may have limited selectivity. Consider composite index with workflow_id if queries filter both. Optional.

1-25: Migration note (columns/enums/indexes).

Base.metadata.create_all() won’t update existing schemas. Ensure migrations add last_fingerprint, fingerprint, indexes, and enum value INVALIDATED for existing deployments.

Would you like an Alembic migration stub for these changes?
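
For illustration only, the gap described in this note (create_all() leaves existing tables untouched) can be closed by hand with an idempotent column check. The sketch below uses the standard-library sqlite3 module rather than Alembic. Placing `last_fingerprint` on a `tasks` table, the index name, and the `add_column_if_missing` helper are all assumptions.

```python
import sqlite3


def add_column_if_missing(conn, table, column, ddl_type):
    """Add `column` to `table` unless it already exists. Returns True if added.

    Identifiers are interpolated directly, so pass trusted names only.
    """
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl_type}")
    return True


conn = sqlite3.connect(":memory:")
# Hypothetical pre-upgrade schema without the fingerprint column.
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY)")

added = add_column_if_missing(conn, "tasks", "last_fingerprint", "TEXT")
added_again = add_column_if_missing(conn, "tasks", "last_fingerprint", "TEXT")
# New indexes can be created idempotently as well (index name is hypothetical).
conn.execute("CREATE INDEX IF NOT EXISTS idx_example ON tasks (last_fingerprint)")
print(added, added_again)
```

The new column is nullable, so rows written before the upgrade remain valid, in line with the backward-compatibility notes in the PR description.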

laufband/graphband.py (3)

386-390: 5s TTL cache check.

Looks correct; cache invalidated on new iteration start. Consider documenting that has_more_jobs may be stale for up to 5s.

291-310: Thread shutdown handled; prefer explicit close over __del__.

You already provide close() and a context manager; __del__ is best-effort and not guaranteed to run. No change required; just a note.
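
A minimal sketch of the explicit-close pattern this note refers to, assuming a background thread that must be stopped deterministically. `BackgroundWorker` and its polling loop are hypothetical and unrelated to the actual Graphband thread.

```python
import threading


class BackgroundWorker:
    """Owns a background thread and stops it in close(), not in __del__."""

    def __init__(self) -> None:
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        # Wake up every 10 ms until close() sets the stop event.
        while not self._stop.wait(0.01):
            pass  # periodic work would go here

    @property
    def is_running(self) -> bool:
        return self._thread.is_alive()

    def close(self) -> None:
        """Stop the thread and wait for it; safe to call more than once."""
        self._stop.set()
        self._thread.join()

    def __enter__(self) -> "BackgroundWorker":
        return self

    def __exit__(self, *exc_info) -> None:
        self.close()


with BackgroundWorker() as worker:
    running_inside = worker.is_running
running_after = worker.is_running
print(running_inside, running_after)
```

The `with` block guarantees that close() runs when the block exits, including on exceptions, whereas the interpreter gives no guarantee about when, or whether, `__del__` is called.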

411-468: has_more_jobs: full-table scan remains (cached).

Caching helps, but this is still O(M) over workflow.tasks. Consider adding indexed queries in DB to avoid scanning all tasks on hot paths, as outlined in the scaling doc (Approach 1.4).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f76706d and f680184.

📒 Files selected for processing (9)

AGENTS.md (1 hunks)
SCALING_ANALYSIS.md (1 hunks)
laufband/db.py (8 hunks)
laufband/graphband.py (11 hunks)
laufband/task.py (2 hunks)
tests/integration/test_fingerprint_integration.py (1 hunks)
tests/unit/test_db_models.py (2 hunks)
tests/unit/test_fingerprint_invalidation.py (1 hunks)
tests/unit/test_graphband_performance.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (7)

{laufband/**/*.py,tests/**/*.py}