Skip to content

feat: add caching to loader#1259

Draft
jensneuse wants to merge 156 commits intomasterfrom
feat/add-caching-support
Draft

feat: add caching to loader#1259
jensneuse wants to merge 156 commits intomasterfrom
feat/add-caching-support

Conversation

@jensneuse
Copy link
Member

No description provided.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jul 31, 2025

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2dd01aac-7207-479b-b00a-833e24bd8fad

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/add-caching-support

Comment @coderabbitai help to get the list of available commands and usage tips.

@jensneuse jensneuse changed the title chore: add ProvidesData + tests feat: add caching to loader Aug 1, 2025
@jensneuse
Copy link
Member Author

jensneuse commented Aug 5, 2025

TODO:

  1. ability for Router to provide a key prefix, e.g. from authorization header
  2. for root fields, we should render all root field names into the key, otherwise root fields won't work
  3. skip mutations
  4. allow partial cache hits
  5. fix case when some keys render to empty string
  6. mix of keys and args

jensneuse and others added 30 commits February 16, 2026 13:31
Replace iteration-based cache log checks with exact assert.Equal on full
[]CacheLogEntry slices. Replace assert.Greater, assert.Contains, and other
vague assertions with exact expected values. Add line-by-line comments
explaining why each cache operation occurred.

Fix items[:0] backing array reuse in handleTriggerEntityCache. Add doc
comments for EntityMergePath store/load semantics, mutation L2 write-through
rationale, threading safety, and nil pointer semantics.

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
- THREADING comment: "immutable" → "not mutated" for pointer types
- Add comment explaining why items[:0] reuse is unsafe (backing array aliasing)
- Expand subscriptionEvent.id nil semantics with trigger-level rationale

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
Covers populate, invalidate, typename filtering, typename injection,
and missing cache name. Includes regression test for the items[:0]
backing array reuse bug fixed in cc9b20a.

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
Add CacheAnalyticsCollector and CacheAnalyticsSnapshot for detailed
cache observability including L1/L2 read/write events, fetch timings,
field hashes, entity type info, shadow comparisons, and mutation events.
Extract loader cache logic into loader_cache.go. Fix E2E test
assertions to account for L1 entity deduplication within requests.
Fix lint issues (gci imports, staticcheck, errcheck).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update DataSource constants from numeric IDs to names to match remote
branch changes. Replace removed L2Hits/L2Misses field assertions in
EntityMergePath tests with comments explaining event accumulation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… add tests

- Replace 4 assert.Greater with exact assert.Equal (ByteSize=59, count=1)
- Extract populateCachesAfterFetch helper from 4 identical call sites in mergeResult
- Remove unnecessary i := i loop var capture (Go 1.22+)
- Remove orphaned ClearLog() missing GetLog assertion
- Add unit tests for cache_fetch_info.go (was 0% coverage)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cache entries now store entities using original schema field names (normalized),
and denormalize back to query aliases on load. This ensures aliased fields
hit the cache correctly regardless of the alias used in different queries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove duplicate getFetchInfo call in populateL1Cache
- Fix stale loop variable after denormalization in root fetch L2 path
- Create new arrays in normalizeNode/denormalizeNode instead of mutating in-place

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolve 23 merge conflicts integrating master's changes:
- Instance-based Resolver pattern for federation test services
- Singleflight/deduplication for inbound and subgraph requests
- Cost computation system
- RemapVariables, ActualListSizes, and astjson v1.1.0 API updates

Fix cost computation regression: remove fieldPlanners map reset from
EnterDocument to preserve shared reference with CostVisitor.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Add 5 new tests covering alias behavior in L1/L2 caching
- Root field aliases (`tp: topProducts`) verified to share cache with
original field name
- L1+L2 combined layers tested with aliases on entity fields
- Multiple aliases for same entity in single request verified to use L1
dedup
- Analytics snapshot assertions confirm cache keys use original field
names

## Test Cases Added
1. **L2 hit - aliased root field then original**: Alias and no-alias
queries hit same cache
2. **L2 hit - two different root field aliases**: Different aliases for
same root field share cache
3. **L1+L2 combined - alias entity caching**: Both cache layers work
with aliases
4. **L2 analytics - aliased root field**: Full snapshot assertions
verify correct cache key identity
5. **L1 dedup - multiple aliases for same entity field**: Multiple
aliases deduplicate via L1 within request

All tests pass with `-race` flag.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Tests**
* Added comprehensive test coverage for federation caching behavior with
field aliases and non-aliased root fields.
* Tests validate cache key semantics across layers, proper deduplication
within requests, and analytics tracking across different alias
configurations.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
## Summary

Updates test framework and test expectations to handle cache analytics
fields that are now populated during planning.

**Key Changes:**
- Add `clearCacheAnalytics()` mechanism to datasourcetesting framework
to strip `CacheAnalytics` from response nodes by default (tests using
`WithEntityCaching()` opt-in)
- Update entity caching test expectations in datasource federation tests
to match new cache analytics values
- Fix field alias handling in test expectations (`OriginalName`,
`HasAliases`)
- Add comprehensive L2 cache analytics test case to
federation_caching_test.go

**Tests:** All v2 and execution tests pass (282 insertions, 1 deletion
across 4 files).

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
* Added end-to-end cache analytics tests for root-field caching
(argument-driven keys and root-field-only scenarios) and expanded
coverage of L2 hit/miss behavior across query variants.
* Expanded federation tests to validate alias/original-name behavior and
propagation of cache analytics metadata in nested fields.
* Test harness now strips cache analytics by default unless entity
caching is explicitly enabled.

* **Bug Fixes**
  * Standardized a query-missing error message for clearer reporting.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ds (#1415)

## Summary

Adds `TestL1CacheRootFieldNonEntityWithNestedEntities` to validate L1
entity caching when a root field returns a non-entity type (Review)
containing nested entities (User). This complements the existing
`TestL1CacheRootFieldEntityListPopulation` test by proving L1 entity
caching works correctly regardless of the root field's entity status.

## Test Plan

- Added `topReviews` root query to reviews subgraph that returns all
Review objects
- L1 enabled subtest: Verifies sameUserReviewers entity resolution is
completely skipped via L1 cache (1 accounts call)
- L1 disabled subtest: Verifies sameUserReviewers requires a separate
accounts call (2 accounts calls)
- All existing L1 cache tests pass with no regressions

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added a topReviews query to expose review data for new query
scenarios.

* **Tests**
* Added tests validating L1 cache behavior with nested entities and
arena reuse to ensure cache safety.

* **Bug Fixes**
* Ensured loader resources are freed on all paths and prevented stale L1
cache pointers.

* **Chores**
* Bumped a dependency and replaced unsafe byte-to-string conversions
with safer conversions.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
* Added test coverage for L1 cache behavior with nested entity handling
in fetch responses.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Adds `TestL1CacheNestedEntitiesInFetchResponse` to document that when an
entity fetch response contains nested entities (e.g. `User.bestFriend`
returning another `User`), those nested entities are NOT separately
extracted and stored in the L1 cache. A subsequent fetch for a nested
entity will still call the subgraph. The test uses gomock `.Times(1)`
expectations to enforce this behavior precisely.

## Checklist

- [ ] I have discussed my proposed changes in an issue and have received
approval to proceed.
- [ ] I have followed the coding standards of the project.
- [x] Tests or benchmarks have been added or updated.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
## Summary

Add per-field configuration to invalidate (delete) L2 cache entries when
mutations return cached entities. Mutations configured with cache
invalidation will immediately delete stale cache entries instead of
waiting for TTL expiration.

## Changes

- **MutationCacheInvalidationConfiguration**: New config type for
per-mutation-field cache invalidation rules
- **InvalidateCache flag**: Added to `MutationEntityImpactConfig` to
enable/disable deletion at runtime
- **Runtime deletion**: `detectMutationEntityImpact()` now calls
`LoaderCache.Delete()` when configured
- **Configuration wiring**: Config flows through `SubgraphCachingConfig`
→ `FederationMetaData` → plan-time annotation
- **E2E tests**: Comprehensive tests verify cache deletion behavior and
confirm analytics still work

## Test Plan

- ✅ New E2E tests in `TestMutationCacheInvalidationE2E` verify delete
behavior
- ✅ Tests confirm mutations with invalidation config delete entries
- ✅ Tests confirm mutations without invalidation config preserve entries
- ✅ Existing mutation impact tests still pass (analytics unaffected)
- ✅ All existing federation caching tests pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Per-subgraph, per-mutation cache invalidation can be configured so
specific mutations automatically remove related L2 cache entries after
execution.
* Mutation impact metadata now exposes per-field invalidation settings
so runtime can act on configured invalidations independently of
analytics.

* **Tests**
* Added end-to-end tests verifying cache deletion occurs when configured
and that cache remains when not configured.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…ation (#1423)

## Summary

Add `L2CacheKeyInterceptor` function type to `CachingOptions`, enabling
library users (like Cosmo Router) to customize L2 cache keys per-request
without modifying graphql-go-tools internals. The interceptor receives
`SubgraphName` and `CacheName` metadata for conditional customization,
and is applied after the existing subgraph header prefix.

## Key Changes

- Added `L2CacheKeyInterceptor` function type and
`L2CacheKeyInterceptorInfo` struct to `v2/pkg/engine/resolve/context.go`
- Applied interceptor in `prepareCacheKeys()` and
`buildMutationEntityCacheKey()` in
`v2/pkg/engine/resolve/loader_cache.go`
- Added comprehensive test suite with 4 subtests covering
transformation, L1 isolation, metadata passing, and nil behavior

## Test Plan

- All 4 new interceptor tests pass
- All existing L1/L2 cache tests pass (no regressions)
- Race detector passes
- Federation caching tests pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Added optional L2 cache key interception feature, enabling custom
transformation of cache keys during lookups, writes, and deletions.
Interceptor receives contextual metadata including subgraph and cache
names. L1 cache behavior remains unaffected.
* Includes comprehensive test coverage validating interceptor behavior
across multiple caching scenarios.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
## Summary

Add configurable control over whether mutation root fields populate the
L2 cache. Mutations now skip L2 writes by default, with opt-in per-field
configuration via `MutationFieldCacheConfiguration`. Subgraph owners
explicitly enable L2 population for specific mutation fields.

## Changes

- **New type**: `MutationFieldCacheConfiguration` in
`federation_metadata.go` with `EnableEntityL2CachePopulation` flag
- **Plan-time**: `configureFetchCaching` looks up mutation field config
and sets `EnableMutationL2CachePopulation` on `FetchCacheConfiguration`
- **Resolve-time**: `resolveSingle` propagates flag to loader state;
`updateL2Cache` checks flag before writing
- **Config**: `SubgraphCachingConfig` accepts `MutationFieldCaching` for
declaring opt-in behavior per mutation field
- **Default**: Mutations skip L2 writes unless explicitly enabled in
config

## Testing

Added E2E subtest verifying mutations skip L2 writes by default without
configuration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Per-mutation-field opt-in to allow mutations to populate the L2 entity
cache.

* **Improvements**
* Subgraph-level controls so mutations skip L2 writes by default unless
enabled.
* Mutation-level cache flags are honored throughout fetch and
cache-update flows.

* **Tests**
* Added tests covering mutation L2 population across query/mutation/read
scenarios, key-prefixing, and cross-lookup cases.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…SLOs (#1414)

## Summary

Added three new fields to `FetchTimingEvent` (HTTPStatusCode,
ResponseBytes, TTFBMs) and a new `SubgraphMetrics()` query method on
`CacheAnalyticsSnapshot` to enable external SLO computation in the
schema registry. Per-subgraph metrics are aggregated by the new
`SubgraphRequestMetrics` type, excluding cache hits.

## Test plan

- [x] All existing analytics tests pass (resolve: 187+ tests)
- [x] All existing federation tests pass (execution/engine: 1700+ tests)
- [x] 9 new unit tests for SubgraphMetrics() and enriched fields
- [x] go vet passes with no warnings

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Telemetry now captures HTTP status codes, response sizes, and
time-to-first-byte for subgraph fetches.
* Added per-subgraph fetch metrics and an aggregation view that reports
per-fetch durations, counts, error/status breakdowns, and total bytes
for improved SLO and performance analysis.
* **Tests**
* New tests validate per-fetch metrics, exclusion of cache hits, and
correct handling of status/size/TTFB fields.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
## Summary

Move root field datasource splitting from `execution/engine` into
`v2/pkg/engine/plan` and run it automatically in `NewPlanner`. This
ensures all callers (not just `FederationEngineConfigFactory`)
transparently benefit from the caching optimization.

## Changes

- Add `cloneForSplit` method to `dataSourceConfiguration[T]` for generic
cloning without knowing type parameter
- Create `datasource_split.go` with split logic and `dataSourceSplitter`
interface
- Call `SplitDataSourcesByRootFieldCaching` in `NewPlanner` before
duplicate ID check
- Remove `splitDataSourceByRootFieldCaching` from config factory, revert
call site to direct append
- Migrate 8 unit tests to plan package with full snapshot assertions
- Add auto-split verification test in `planner_test.go`
- Enhance E2E testing with TTL tracking and cache log sorting helpers

## Verification

All 8 split logic tests pass. All planner tests pass (6 pre-split + 1
auto-split). All E2E caching tests pass. No regressions after merge with
origin/master.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
* Added comprehensive test coverage for cost estimation and actual cost
computation across many list, nesting, union and fragment scenarios.
* Added tests for root-field caching isolation and planner behavior
under varied caching configurations.
* Enhanced cache tests to record and assert TTL propagation and made
cache log ordering deterministic.
* Refactored test helper infrastructure for a unified, configurable
execution test runner and added small test builders/utilities.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Yury Smolski <140245+ysmolski@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…1426)

## Summary

Subgraphs can now signal L2 cache invalidation by including
`extensions.cacheInvalidation.keys` in their GraphQL responses. This
feature works for both query and mutation responses, supports header
prefix and L2CacheKeyInterceptor transformations, and includes
comprehensive unit (16) and E2E (8) test coverage.

## Implementation

- Added `processExtensionsCacheInvalidation()` to parse invalidation
keys from response extensions
- Integrated into the loader's cache population pipeline
- Runtime configuration via entity cache configs map for per-subgraph,
per-entity-type settings

## Checklist

- [x] I have followed the coding standards of the project
- [x] Tests or benchmarks have been added or updated
- [x] Code review findings addressed (assertions, documentation,
optimizations)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Subgraphs can signal cache invalidations via response extensions for
queries and mutations. Supports per-entity runtime configs, optional
subgraph header prefixes, custom key interception, composite keys, and
per-tenant variations. Invalidation runs before cache population to
avoid stale L1/L2 writes and batches L2 deletions safely.
* **Tests**
* Extensive test suites covering many invalidation scenarios and edge
cases (multiple keys, missing/malformed signals,
header-prefix/interceptor interactions, per-tenant behavior, and L1/L2
interactions).
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…1424)

## Summary

Implement entity field argument-aware caching to prevent cache
collisions when the same entity field is fetched with different
arguments. When `friends(first:5)` and `friends(first:20)` are fetched
for the same entity, they are now stored with distinct suffixed field
names (`friends_xxhAAA` and `friends_xxhBBB`) to avoid stale data.

**Key changes:**
- Add `CacheFieldArg` metadata at plan time to capture field arguments
- Compute `_xxh<16hex>` suffix from resolved argument values at resolve
time
- Apply suffixes during L1/L2 cache storage and validation
- Fix critical `HasAliases` gate bug: normalization now fires for
non-aliased fields with CacheArgs
- Skip CacheArgs on root query fields (their args are already in cache
key)
- Use pooled xxhash and manual hex encoding for performance

All tests pass. No race conditions detected.

## Checklist

- [x] I have discussed my proposed changes in an issue and have received
approval to proceed.
- [x] I have followed the coding standards of the project.
- [x] Tests or benchmarks have been added or updated.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Field-level caching now accounts for field arguments to produce
stable, variant-aware cache entries
* User type adds greeting and customGreeting fields with style and
formatting options

* **Tests**
* Added extensive federation caching tests covering argument variants,
aliases, enums, nested inputs, and raw JSON cases
* Refactored subscription tests to use channel-backed updater with
per-update assertions and improved shutdown timeouts
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…rding detection (#1432)

## Summary

Add analytics events to detect when header forwarding rules don't affect
subgraph responses. `HeaderImpactEvent` tracks `(BaseKey, HeaderHash,
ResponseHash)` tuples, enabling cross-request analysis to identify
unnecessary header prefixes in L2 cache keys.

## Changes

- **New event type**: `HeaderImpactEvent` with collector and snapshot
support
- **Hash capture**: Store header hash in result struct and record events
in `updateL2Cache`
- **E2E tests**: Real HTTP headers with three subtests (shadow mode,
non-shadow mode, no-prefix)
- **Unit tests**: Collector deduplication and edge cases
- **Test conventions**: Documented inline-first approach in
`execution/engine/CLAUDE.md`

## Test Plan

- [x] Unit tests for `HeaderImpactEvent` collector (deduplication, empty
cases)
- [x] E2E tests: shadow mode with different headers producing same
responses
- [x] E2E tests: non-shadow mode with L2 miss/hit distinction
- [x] E2E tests: verify no events when header prefix disabled
- [x] Race detector passes on all tests

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Cache analytics now surface header-impact events, correlating request
header variants with L2 cache reads/writes and response hashes;
header-aware hashing is included in cache write analytics.

* **Tests**
* Added end-to-end tests for header variants (shadow vs non-shadow) and
analytics emission on L2 miss/hit.
* Added unit tests for header-impact event deduplication and
preservation rules plus header-forwarding mocks.

* **Documentation**
* Updated testing conventions with explicit header-handling guidance and
mock header cloning.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
## Summary

Comprehensive documentation for the entity caching system and resolve
package execution engine.

## Changes

- **v2/pkg/engine/resolve/CLAUDE.md** (589 lines) - Full resolve package
reference covering the resolution pipeline (Resolver, Loader,
Resolvable) and entity caching internals. Single file because caching is
deeply embedded in the fetch execution flow.
- **ENTITY_CACHING_INTEGRATION.md** (680 lines) - Complete router
integration guide with all public APIs, configuration options, cache key
formats, invalidation mechanisms, analytics, and working examples.
Another agent can fully integrate entity caching using only this file.
- **CLAUDE.md** (rewritten) - High-level repo overview with package map,
data flow, and links to deep references. Replaced
entity-caching-specific content with repo-wide architecture
documentation.
- **execution/engine/CLAUDE.md** - Deleted. Cache log test rules merged
into resolve/CLAUDE.md testing section.

## Verification

Tests pass: `go test ./v2/pkg/engine/resolve/... -count=1`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Implemented two-level entity caching system supporting per-request
in-memory cache and cross-request external cache with configurable TTL
settings.
* Added cache analytics and monitoring capabilities to track cache
performance.

* **Documentation**
* Added comprehensive entity caching integration guide with setup
instructions and usage examples.
* Updated project documentation structure for improved navigation and
reference.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
Increase read timeout from 1ms to 100ms in subscription handler tests
and fix race condition in AwaitUpdates/AwaitDone test helpers that used
a non-blocking select on a ticker, which could miss the timeout check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add 48+ new unit tests covering critical cache functions
(validateItemHasRequiredData, normalize/denormalize round-trip,
computeArgSuffix, mergeEntityFields, L2 error resilience, mutation L2
skip) that previously only had indirect E2E coverage. Tests cover edge
cases including nullable/non-nullable combinations, nested objects,
arrays, type mismatches, CacheArgs suffix handling, xxhash determinism,
field merging semantics, and error handling. All tests pass with race
detector enabled.

## Checklist

- [x] Tests added for validateItemHasRequiredData (22 subtests)
- [x] Tests added for normalize/denormalize round-trip (8 subtests)  
- [x] Tests added for computeArgSuffix (7 subtests)
- [x] Tests added for mergeEntityFields (6 subtests)
- [x] Tests added for L2 error resilience (3 subtests)
- [x] Tests added for mutation L2 skip (1 subtest)
- [x] All 48+ new subtests pass
- [x] Race detector passes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
* Added L2 cache error-resilience tests (Get falls through to fetch, Set
errors don't break requests, corrupted entries treated as misses,
mutations skip L2 reads) and introduced helpers to synthesize entity
responses for those scenarios.
* Greatly expanded L1 cache test coverage for normalize/denormalize,
alias & cache-arg handling, nullability, nested/array scenarios,
deterministic arg-suffix behavior, and merge cases.
* **Style**
  * Minor formatting cleanup in several subscription-related tests.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
…ache op errors (#1435)

## Summary
- **Negative caching**: cache null entity responses (`NegativeCacheTTL`)
to avoid repeated subgraph lookups for non-existent entities
- **Per-goroutine arenas**: fix thread safety for L2 cache allocations
during Phase 2 parallel execution via `l2ArenaPool`
- **Global cache key prefix**: support schema versioning by prepending a
configurable prefix (`GlobalCacheKeyPrefix`) to all L2 cache keys
- **Cache operation error tracking**: record Get/Set/Delete failures in
analytics (`CacheOperationError`) for operator observability
- **Circuit breaker**: protect cache operations with configurable
failure thresholds
- **Comprehensive tests**: negative cache, mutation cache impact, arena
thread safety (bench + GC), circuit breaker, L2 cache key interceptor

## Test plan
- [ ] `go test ./v2/pkg/engine/resolve/... -v` passes
- [ ] `go test ./execution/engine/... -v` passes
- [ ] New negative cache tests cover null sentinel storage and retrieval
- [ ] Arena thread safety bench + GC tests validate no data races under
parallel load
- [ ] Circuit breaker tests verify open/close state transitions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
  * Negative caching for null/not-found entities with configurable TTL
  * Circuit breaker protection for cache lookups
  * Global cache key prefix for schema-versioned keys
  * Cache operation TTL logging and enhanced error analytics

* **Tests**
* Negative caching, circuit breaker, mutation-impact, key-interceptor,
L1/L2 e2e, and trigger tests
  * Arena thread-safety benchmarks and GC/concurrency tests

* **Documentation**
  * Entity caching acceptance criteria document
  * Caching test sync/update guideline
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
)

## Summary

Add `TestResolveParallel_NoConcurrentArenaRace` to verify that parallel
entity fetches with L2 caching do not race on arena memory. This test
exercises goroutine code paths in `resolveParallel` Phase 2 that
allocate from per-goroutine arenas, catching regressions if someone
accidentally uses the shared `l.jsonArena` from a goroutine.

## Why

The data race described in the arena-data-race bug report was already
mitigated by per-goroutine arenas (commit 1ad5a75). This test ensures
no regression.

## Test Plan

- Run with `-race` flag: `go test -race -run
TestResolveParallel_NoConcurrentArenaRace ./v2/pkg/engine/resolve/...
-v`
- All resolve tests pass with `-race`: `go test -race
./v2/pkg/engine/resolve/... -count=1`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
* Added robust concurrency tests that validate parallel data fetches
with layered caching (L1/L2) to prevent race conditions across cache-hit
and cache-miss scenarios.
* Introduced a deterministic, thread-safe test data provider and
repeated iterations to amplify concurrency exposure and ensure
consistent results under concurrent loads.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…1437)

Update the `Operation` field comment on `CacheOperationError` to
document all four supported values: `get`, `set`, `set_negative`, and
`delete`.

The `set_negative` operation was already implemented in the code
(loader_cache.go:976) but not documented in the struct comment.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant