feat(core): batch-safe hashing with output fingerprinting#34446
FrozenPandaz wants to merge 10 commits into master
When tasks with dependentTasksOutputFiles are co-batched with their dependencies, hashes are computed using outputs from a previous run that may be stale. This adds a validation step that checks whether dependency outputs on disk match the dependency's current hash, skips cache reads for untrustworthy hashes, and re-hashes after the batch completes with fresh outputs.
Add OutputFingerprints Rust struct backed by SQLite for persisting output file checksums. Expose hashTaskOutput via napi so TypeScript can fingerprint outputs without the daemon. Unify outputsHashesMatch to use daemon when available, falling back to DB fingerprints.
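The daemon-first, DB-fallback check described in this commit could look roughly like the sketch below. It is an illustration only: the `DaemonLike` and `FingerprintStore` interfaces, the in-memory store, and the injected `fingerprintOutputs` callback are hypothetical stand-ins (the real store is SQLite-backed and the real fingerprint comes from the Rust `hashTaskOutput`), and the real method may well be asynchronous.

```typescript
// Hypothetical shape of the daemon client; not the actual Nx API.
interface DaemonLike {
  outputsHashesMatch(outputs: string[], taskHash: string): boolean;
}

// Hypothetical view of the persisted table: task_hash -> fingerprint.
interface FingerprintStore {
  get(taskHash: string): string | undefined;
  record(taskHash: string, fingerprint: string): void;
}

// In-memory stand-in for the SQLite-backed store, for illustration only.
class InMemoryFingerprintStore implements FingerprintStore {
  private readonly rows = new Map<string, string>();

  get(taskHash: string): string | undefined {
    return this.rows.get(taskHash);
  }

  record(taskHash: string, fingerprint: string): void {
    this.rows.set(taskHash, fingerprint);
  }
}

// Prefer the daemon when it is available; otherwise compare the current
// on-disk fingerprint with the one stored for this task hash.
function outputsHashesMatch(
  outputs: string[],
  taskHash: string,
  daemon: DaemonLike | null,
  store: FingerprintStore,
  fingerprintOutputs: (outputs: string[]) => string
): boolean {
  if (daemon) {
    return daemon.outputsHashesMatch(outputs, taskHash);
  }
  const stored = store.get(taskHash);
  if (stored === undefined) {
    // Nothing recorded for this hash, so the outputs cannot be trusted.
    return false;
  }
  return stored === fingerprintOutputs(outputs);
}
```

In this sketch a missing fingerprint is treated as a mismatch, which errs on the side of re-copying outputs; whether the PR makes the same choice is an assumption.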
With output fingerprinting, outputs that already match the cache are left in place instead of being re-copied. Update e2e assertions to expect "existing outputs match the cache, left as is" instead of "local cache" when daemon is disabled or outputs are untouched.
Add null checks before calling hashTaskOutput to prevent "Failed to convert JavaScript value Undefined into rust type String" when tasks have no outputs defined (e.g. maven tasks). Also update symlink cache test assertion.
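A guard of the kind this commit describes might look like the following sketch. The `TaskLike` shape, the `fingerprintIfPossible` helper, and the single-argument `hashTaskOutput` callback are assumptions made for illustration; the actual napi signature is not shown in this PR text.

```typescript
// Hypothetical task shape; tasks such as inferred maven targets may have no outputs.
interface TaskLike {
  id: string;
  outputs?: string[];
}

// Returns a fingerprint, or null when there is nothing to fingerprint.
// The guard keeps undefined from ever reaching the native binding.
function fingerprintIfPossible(
  task: TaskLike,
  hashTaskOutput: (outputs: string[]) => string
): string | null {
  if (!task.outputs || task.outputs.length === 0) {
    return null;
  }
  return hashTaskOutput(task.outputs);
}
```

Callers can then treat `null` as "no fingerprint available" instead of letting the conversion error surface from the native side.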
With output fingerprinting enabled for all cache operations (not just batch mode), tasks whose outputs are already on disk now correctly report "existing outputs match the cache" instead of "local cache". Updated e2e assertions across cache, run, ng-add, and nx-init-angular tests to expect the new status when outputs haven't been deleted.
Wrap identifyTasksWithStaleDepsOutputs and getInputs in try-catch so that targets without proper input configuration (e.g. inferred maven targets) don't crash the entire batch execution.
hashTasks filters out tasks that already have a hash. The previous code cleared hashes on the result copies (created by runBatch via spread) but called hashTasks on batch.taskGraph which holds the originals — still with their hashes. This caused hashTasks to skip them entirely, leaving the copies with undefined hashes that crashed napi when passed to cache.put. Clear hashes on the originals so hashTasks picks them up, then sync the fresh hashes back to the result copies.
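The fix described in this commit can be illustrated with a small model. The `Task` shape, the simplified `hashTasks`, and the `rehashStaleTasks` helper below are hypothetical; they only mimic the behavior described above (tasks that already have a hash are skipped, and result copies are separate objects from the originals).

```typescript
interface Task {
  id: string;
  hash: string | undefined;
}

// Mimics the described behavior: tasks that already have a hash are skipped.
function hashTasks(tasks: Task[], computeHash: (task: Task) => string): void {
  for (const task of tasks) {
    if (task.hash !== undefined) continue;
    task.hash = computeHash(task);
  }
}

// Clear hashes on the ORIGINAL tasks so hashTasks picks them up,
// then sync the fresh hashes back to the result copies.
function rehashStaleTasks(
  originals: Map<string, Task>,
  resultCopies: Map<string, Task>,
  staleIds: string[],
  computeHash: (task: Task) => string
): void {
  const toHash: Task[] = [];
  for (const id of staleIds) {
    const original = originals.get(id);
    if (!original) continue;
    original.hash = undefined;
    toHash.push(original);
  }

  hashTasks(toHash, computeHash);

  for (const id of staleIds) {
    const original = originals.get(id);
    const copy = resultCopies.get(id);
    if (original && copy) {
      copy.hash = original.hash;
    }
  }
}
```

Clearing the hash only on a spread copy and then hashing the originals reproduces the bug in this model: the original keeps its old hash, so it is skipped, and the copy stays `undefined`.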
The testCompile target hash was missing src/test/java/**/*.java because CacheConfig used the wrong parameter name for the compiler plugin's test source roots. Also removes an unnecessary fallback in MavenExpressionResolver that masked the issue.
Important
At least one additional CI pipeline execution has run since the conclusion below was written and it may no longer be applicable.
Nx Cloud has identified a possible root cause for your failed CI:
The e2e-gradle test failure appears to be an environment_state issue rather than a code defect. The failing test is in a project not modified by our PR (e2e-gradle is not in touched_projects), the error occurs in an unchanged test file, and the similar-task-failure-detector confirms this error pattern doesn't exist on master. The process execution failure when running nx show projects has no logical connection to our batch hashing and output fingerprinting changes in task-orchestrator.ts.
No code changes were suggested for this issue.
Current Behavior
In batch mode, all task hashes are computed upfront before the batch executor runs any tasks. Tasks with dependentTasksOutputFiles get hashed using whatever dependency outputs happen to be on disk from a previous run. Those outputs may be stale, which makes the resulting hashes untrustworthy.
Non-batch mode doesn't have this problem because it uses lazy hashing: tasks with depsOutputs are only hashed after their dependencies complete and fresh outputs exist on disk.
Additionally, the shouldCopyOutputsFromCache and recordOutputsHash methods only worked when the daemon was running. Without the daemon, shouldCopyOutputsFromCache always returned true (always re-copy) and recordOutputsHash was a no-op.
Expected Behavior
Batch mode correctly validates whether dependency outputs on disk match the dependency's current hash before trusting any cache entries. When outputs are stale, cache reads are skipped and tasks are re-hashed after the batch completes with fresh outputs. This validation works with or without the daemon.
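The validation step above can be sketched as follows. This is a simplified model, not the PR's `identifyTasksWithStaleDepsOutputs`: the `BatchTask` shape, the `identifyStaleTasks` name, and the injected `outputsMatchHash` callback are assumptions, and only direct dependencies inside the same batch are checked.

```typescript
// Hypothetical, simplified view of a task inside a batch.
interface BatchTask {
  id: string;
  hash: string;
  usesDepsOutputs: boolean; // true when the task's inputs include dependency outputs
  dependencies: string[];
  outputs: string[];
}

// Returns the ids of tasks whose hash was computed from dependency outputs
// that were not produced by the dependency's current hash.
function identifyStaleTasks(
  batch: Map<string, BatchTask>,
  outputsMatchHash: (outputs: string[], hash: string) => boolean
): Set<string> {
  const stale = new Set<string>();
  batch.forEach((task) => {
    if (!task.usesDepsOutputs) return;
    for (const depId of task.dependencies) {
      const dep = batch.get(depId);
      // Dependencies outside the batch are not considered in this sketch.
      if (dep && !outputsMatchHash(dep.outputs, dep.hash)) {
        stale.add(task.id);
        break;
      }
    }
  });
  return stale;
}
```

Tasks in the returned set would skip cache reads and be re-hashed once the batch has produced fresh outputs.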
How it works
- For each task with depsOutputs whose dependency is also in the batch, check if the dependency's outputs on disk were produced by its current hash. If not, mark the task as stale.
- After the batch completes, stale tasks are re-hashed by calling hashTasks() again; outputs are now fresh.
- postRunSteps stores cache entries under the correct hash.
Output Fingerprinting (daemon-free fallback)
Previously, output freshness validation only worked with the daemon's in-memory tracking. This PR adds a persistent alternative:
- OutputFingerprints (Rust/napi): SQLite-backed table that maps task_hash → fingerprint, where fingerprint is a deterministic hash of all output files.
- hashTaskOutput (Rust/napi): Exposes the existing Rust output hashing function to TypeScript; uses rayon for parallel file hashing.
- outputsHashesMatch: Uses the daemon when available, and falls back to comparing the current on-disk fingerprint against the stored DB fingerprint.
recordOutputsHash now always persists fingerprints to the DB (in addition to notifying the daemon when available), so subsequent runs can validate outputs even without the daemon.
Run-by-run behavior
Files changed
- packages/nx/src/native/tasks/output_fingerprints.rs
- packages/nx/src/native/tasks/hashers/hash_task_output.rs
- packages/nx/src/native/tasks/task_hasher.rs
- packages/nx/src/native/tasks/mod.rs
- packages/nx/src/native/db/initialize.rs
- packages/nx/src/tasks-runner/task-orchestrator.ts
Related Issue(s)
Related to #30949