fix: treat daily-memory no-op as success instead of hard failure by chubes4 · Pull Request #2784 · Extra-Chill/data-machine

chubes4 · 2026-06-24T04:59:00Z

Problem

DailyMemoryTask logged a legitimate "MEMORY.md unchanged" outcome as an ERROR-level hard job failure, generating recurring false-positive noise that polluted error-rate metrics and the wake briefing.

Real log evidence from extrachill.com (and wire.extrachill.com), recurring roughly 3×/day, every day, across multiple agents:

Task failed (job #5151): Daily memory completion policy was not satisfied. MEMORY.md unchanged.

{"job_id":5151,"task_type":"daily_memory_generation","error":"Daily memory completion policy was not satisfied. MEMORY.md unchanged."}

Root cause (investigated, not assumed)

Inspecting live failing jobs confirmed the mechanism:

The job that "failed" (e.g. job #5151, agent_id 17) runs for ~5s with empty token_usage and the error is the exact fallback string — meaning $response['error'] was empty.
That fallback only fires when the conversation loop returns completed=false with no genuine error — i.e. the completion policy never returned complete() within the turn budget.
The affected agent's MEMORY.md is 144 bytes (MAX is 8192), so the file is small and already healthy. The task didn't skip at the size-threshold guard only because there was some activity context that day. The model reviewed the day, found nothing memory-worthy, and never emitted an acceptable ===PERSISTENT=== / ===ARCHIVED=== partition. Nothing is written to disk on this path, so MEMORY.md is genuinely unchanged — a successful no-op, not a fault.

The conversation loop (datamachine_run_conversation) sets completed=false for both genuine faults (provider error, runtime exception, malformed result, budget_exceeded/interrupted/failed status) and this benign "ran out of turns without an acceptable changed split" case. The old code blindly failJob'd both.
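The mechanism described above can be sketched roughly as follows. This is an illustration, not the actual pre-fix code: the function name `legacy_incomplete_outcome` and the returned array shape are assumptions made for the example. Only the `completed` and `error` keys and the fallback string come from the logs and description above.

```php
<?php
// Hypothetical sketch of the pre-fix behavior: every completed=false
// response became a failure, and an empty error fell back to a generic
// message. Function name and return shape are illustrative only.

const FALLBACK_ERROR = 'Daily memory completion policy was not satisfied. MEMORY.md unchanged.';

function legacy_incomplete_outcome( array $response ): ?array {
	if ( ! empty( $response['completed'] ) ) {
		return null; // Completed runs continue to the later output checks.
	}

	$error = (string) ( $response['error'] ?? '' );

	return array(
		'action' => 'failJob',
		'level'  => 'error',
		// With no real error, the generic fallback string is reported.
		'error'  => '' !== $error ? $error : FALLBACK_ERROR,
	);
}

// A benign run that only ran out of turns carries no error of its own,
// so it surfaces with the fallback string seen in the logs.
$benign = legacy_incomplete_outcome( array( 'completed' => false ) );
echo $benign['error'], PHP_EOL;
```

Under these assumptions, a benign no-op and a provider error are indistinguishable by log level: both end in `failJob` at error, and only the message text differs.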

Fix

At the completed=false branch in DailyMemoryTask::executeTask(), distinguish the two by explicit error signal:

$genuine_failure = '' !== $response_error
    || ! empty( $response['error_code'] )
    || in_array( (string) ( $response['status'] ?? '' ), array( 'error', 'failed', 'interrupted' ), true );

Genuine failure (any of the above) → unchanged behavior: log at error and failJob.
No-op (no error signal; the model simply produced no acceptable changed split) → completeJob with skipped/no_change markers and log at info. Safe because replace_all() happens later in the method, so MEMORY.md is untouched at this point.
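As a minimal sketch, the two branches above could be exercised in isolation like this. The helper name `classify_incomplete_response` is hypothetical, the returned arrays merely stand in for the real `failJob` / `completeJob` calls and log level, and deriving `$response_error` by casting `$response['error']` is an assumption; the boolean shape of the `skipped` / `no_change` markers is likewise assumed.

```php
<?php
// Sketch only: classify_incomplete_response() is a hypothetical helper, and
// the returned arrays stand in for the real failJob()/completeJob() calls.

function classify_incomplete_response( array $response ): array {
	// Assumption: $response_error is the loop's error string, empty when absent.
	$response_error = (string) ( $response['error'] ?? '' );

	// Same predicate as in the PR: any explicit error signal is a genuine failure.
	$genuine_failure = '' !== $response_error
		|| ! empty( $response['error_code'] )
		|| in_array( (string) ( $response['status'] ?? '' ), array( 'error', 'failed', 'interrupted' ), true );

	if ( $genuine_failure ) {
		return array(
			'action' => 'failJob',
			'level'  => 'error',
		);
	}

	// No error signal: the file is untouched, so treat the run as a no-op.
	return array(
		'action'    => 'completeJob',
		'level'     => 'info',
		'skipped'   => true, // Marker names from the PR; boolean values assumed.
		'no_change' => true,
	);
}

$noop = classify_incomplete_response( array( 'completed' => false ) );
$fail = classify_incomplete_response( array( 'completed' => false, 'status' => 'interrupted' ) );
echo $noop['action'], ' / ', $fail['action'], PHP_EOL;
```

Keeping the predicate in a pure function like this would make the distinction easy to cover with unit tests, though the PR itself applies it inline in `DailyMemoryTask::executeTask()`.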

How legitimate no-op is distinguished from genuine failure

| Outcome | Signal | Behavior |
| --- | --- | --- |
| Provider/runtime error | non-empty `error` / `error_code` | `failJob` (error) |
| Hard loop failure / interruption | `status` ∈ {error, failed, interrupted} | `failJob` (error) |
| Empty model output | later `empty($ai_output)` guard | `failJob` (error), unchanged |
| Lossy/duplicative split | `planMemoryCompaction` conservation checks | `failJob` (error), unchanged |
| Policy unsatisfied, file untouched, no error | none of the above | `completeJob` no-op (info) ✅ |

All genuine-failure paths downstream (empty response, parse failure, conservation/expansion failures in planMemoryCompaction) are reached only when completed=true and remain loud — they represent a model that produced bad output, which the issue explicitly says must still fail.

Fork decisions / out of scope

Chose to gate on the loop's existing error signals rather than add a new "no-op" state to the Agents API completion-decision substrate. The substrate decision is intentionally binary (complete()/incomplete()); the no-op semantics are a Data-Machine-task concern (the file being untouched at this call site), so the distinction belongs in the task, not the generic loop. This keeps layer purity intact.
Did not change the prompt/completion-policy contract to let the model emit an explicit "no change" sentinel. That would be a larger behavioral change; the minimal, evidence-backed fix is to stop treating the existing untouched-file outcome as an error. A future enhancement could add a first-class "decline" signal, but it's not needed to resolve the noise.

Verification

php -l clean.
phpcs --standard=WordPress clean (exit 0, no warnings — covers the array-arrow/assignment-alignment gate).
phpcbf made no changes.

DailyMemoryTask logged a legitimate 'MEMORY.md unchanged' outcome as an ERROR-level job failure, flooding error-rate metrics and the wake briefing. The conversation loop sets completed=false both for genuine faults (provider error, runtime exception, malformed result, interruption) and for the common case where a small, already-healthy MEMORY.md produced no acceptable PERSISTENT/ARCHIVED split because there was nothing memory-worthy to fold in. The file is untouched at this point, so the latter is a successful no-op, not a failure. Distinguish the two by explicit error signal (non-empty error string, error_code, or error/failed/interrupted status). Genuine faults still failJob and log at error; a no-op completes the job and logs at info. Closes #2783

homeboy-ci · 2026-06-24T05:00:57Z

Homeboy Results — `data-machine`

Lint

❌ lint — failed

ℹ️ Auto-fix: homeboy lint data-machine --path /home/runner/work/data-machine/data-machine --changed-since 8a413e6 --fix (or homeboy refactor data-machine --path /home/runner/work/data-machine/data-machine --changed-since 8a413e6 --from lint --write)
ℹ️ Some issues may require manual fixes
ℹ️ Full options: homeboy docs commands/lint
Deep dive: homeboy lint data-machine --changed-since 8a413e6

Artifacts and drill-down

CI results artifact: homeboy-ci-results-data-machine-lint-quality-Linux-node24 contains immediate command JSON for this action invocation.
Observation artifact: homeboy-observations-data-machine-lint-quality-Linux-node24 contains exported Homeboy run history for deeper queries.
Drill-down: download the observation artifact, then run homeboy runs import <dir>, homeboy runs list, and homeboy runs findings <run-id>.
Artifacts are attached to the workflow run: https://github.com/Extra-Chill/data-machine/actions/runs/28076247259

Test

✅ test — passed

ℹ️ No impacted tests found for --changed-since 8a413e6
ℹ️ Run full suite if needed: homeboy test data-machine
Deep dive: homeboy test data-machine --changed-since 8a413e6

Artifacts and drill-down

CI results artifact: homeboy-ci-results-data-machine-test-quality-Linux-node24 contains immediate command JSON for this action invocation.
Observation artifact: homeboy-observations-data-machine-test-quality-Linux-node24 contains exported Homeboy run history for deeper queries.
Drill-down: download the observation artifact, then run homeboy runs import <dir>, homeboy runs list, and homeboy runs findings <run-id>.
Artifacts are attached to the workflow run: https://github.com/Extra-Chill/data-machine/actions/runs/28076247259

Audit

✅ audit — passed

audit — 28 finding(s)
Total: 28 finding(s)

Deep dive: homeboy audit data-machine --changed-since 8a413e6

Artifacts and drill-down

CI results artifact: homeboy-ci-results-data-machine-audit-quality-Linux-node24 contains immediate command JSON for this action invocation.
Observation artifact: homeboy-observations-data-machine-audit-quality-Linux-node24 contains exported Homeboy run history for deeper queries.
Drill-down: download the observation artifact, then run homeboy runs import <dir>, homeboy runs list, and homeboy runs findings <run-id>.
Artifacts are attached to the workflow run: https://github.com/Extra-Chill/data-machine/actions/runs/28076247259

Tooling versions

Homeboy CLI: homeboy 0.259.0+b3d82bf59679+451de638
Extension: wordpress from https://github.com/Extra-Chill/homeboy-extensions
Extension revision: 94ff2c48
Action: unknown@unknown

chubes4 wants to merge 1 commit into main from daily-memory-noop-2783
