The `tensor_cache` crate provides LLM response caching through three tiers: an exact cache, a semantic (HNSW-based) cache, and an embedding cache. The tables below summarize benchmark results for each component.
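For orientation, here is a minimal sketch of the two-tier lookup these numbers describe: an exact hash lookup first, then a semantic nearest-neighbour fallback. The types are illustrative, not the crate's actual API, and a brute-force scan stands in for the HNSW index.

```rust
use std::collections::HashMap;

/// Conceptual sketch only; `LookupSketch` and its fields are illustrative,
/// not the crate's real types.
struct LookupSketch {
    exact: HashMap<u64, String>,       // prompt hash -> cached response
    semantic: Vec<(Vec<f32>, String)>, // (embedding, response); an HNSW graph in the real crate
    threshold: f32,                    // minimum similarity for a semantic hit
}

impl LookupSketch {
    fn lookup(&self, prompt_hash: u64, embedding: &[f32]) -> Option<&String> {
        // Tier 1: exact O(1) hash lookup (the ~208 ns hit / ~102 ns miss rows below).
        if let Some(resp) = self.exact.get(&prompt_hash) {
            return Some(resp);
        }
        // Tier 2: semantic search. Brute force here for brevity; O(log n) HNSW in practice.
        self.semantic
            .iter()
            .map(|(e, r)| (cosine_similarity(e, embedding), r))
            .filter(|(sim, _)| *sim >= self.threshold)
            .max_by(|a, b| a.0.partial_cmp(&b.0).unwrap())
            .map(|(_, r)| r)
    }
}

fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}
```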
## Exact Cache (Hash-based, O(1))

| Operation   | Time   |
|-------------|--------|
| lookup_hit  | 208 ns |
| lookup_miss | 102 ns |
## Semantic Cache (HNSW-based, O(log n))

| Operation  | Time  |
|------------|-------|
| lookup_hit | 21 µs |
## Put (Exact + Semantic + HNSW insert)

| Entries | Time  |
|---------|-------|
| 100     | 49 µs |
| 1,000   | 47 µs |
| 10,000  | 53 µs |
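A put writes both tiers in one call: an O(1) hash insert plus an insert into the semantic index. With an O(log n) HNSW insert, that is consistent with the roughly flat put times in the table above. A hypothetical sketch, not the crate's API:

```rust
use std::collections::HashMap;

/// Illustrative types only; the real crate uses an HNSW graph, not a Vec.
struct PutSketch {
    exact: HashMap<u64, String>,
    semantic: Vec<(Vec<f32>, String)>,
}

impl PutSketch {
    fn put(&mut self, prompt_hash: u64, embedding: Vec<f32>, response: String) {
        // Exact tier: O(1) hash insert.
        self.exact.insert(prompt_hash, response.clone());
        // Semantic tier: a plain push here; an O(log n) HNSW insert in the real
        // crate, which keeps the total put cost nearly flat as the cache grows.
        self.semantic.push((embedding, response));
    }
}
```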
## Embedding Cache

| Operation   | Time   |
|-------------|--------|
| lookup_hit  | 230 ns |
| lookup_miss | 110 ns |
## Eviction (batch processing)

| Entries in Cache | Time   |
|------------------|--------|
| 1,000            | 3.3 µs |
| 5,000            | 4.0 µs |
| 10,000           | 8.4 µs |
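A sketch of what batch eviction can look like, assuming a TTL-style policy purely for illustration (the crate's actual policy may differ): expired keys are collected in a single scan and removed together, so cost grows with cache size rather than with per-entry bookkeeping.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Batch eviction sketch. TTL is an assumed policy here, not necessarily the
// crate's: one pass collects the expired keys, a second pass removes them.
fn evict_expired(cache: &mut HashMap<u64, (Instant, String)>, ttl: Duration) {
    let now = Instant::now();
    let expired: Vec<u64> = cache
        .iter()
        .filter(|(_, (inserted, _))| now.duration_since(*inserted) > ttl)
        .map(|(k, _)| *k)
        .collect();
    for key in expired {
        cache.remove(&key);
    }
}
```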
## Distance Metrics (raw computation, 128-d vectors)

| Metric    | Time   | Notes                   |
|-----------|--------|-------------------------|
| Jaccard   | 73 ns  | Fastest, best for sparse |
| Euclidean | 105 ns | Good for spatial data   |
| Cosine    | 186 ns | Default, best for dense |
| Angular   | 193 ns | Alternative to cosine   |
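For reference, the standard formulas behind these metrics look roughly like the snippets below. The crate's own kernels may differ (e.g. SIMD or a different sparse convention); in particular, the Jaccard variant shown here treats vectors as sets of non-zero dimensions, which is an assumption.

```rust
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    1.0 - dot / (na * nb)
}

// Angular distance: the angle between the vectors, normalized to [0, 1].
fn angular(a: &[f32], b: &[f32]) -> f32 {
    let cos = 1.0 - cosine_distance(a, b);
    cos.clamp(-1.0, 1.0).acos() / std::f32::consts::PI
}

// Jaccard distance over the sets of non-zero dimensions (one common convention).
fn jaccard(a: &[f32], b: &[f32]) -> f32 {
    let mut inter = 0usize;
    let mut union = 0usize;
    for (x, y) in a.iter().zip(b) {
        let (xa, yb) = (*x != 0.0, *y != 0.0);
        if xa && yb { inter += 1; }
        if xa || yb { union += 1; }
    }
    if union == 0 { 0.0 } else { 1.0 - inter as f32 / union as f32 }
}
```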
## Semantic Lookup by Metric (1,000 entries)

| Metric    | Time    |
|-----------|---------|
| Jaccard   | 28.6 µs |
| Euclidean | 27.8 µs |
| Cosine    | 28.4 µs |
## Sparse vs Dense (80% sparsity)

| Configuration | Time    | Improvement |
|---------------|---------|-------------|
| Dense lookup  | 28.8 µs | baseline    |
| Sparse lookup | 24.1 µs | 16% faster  |
## Auto-Metric Selection

| Operation          | Time    |
|--------------------|---------|
| Sparsity check     | 0.66 ns |
| Auto-select dense  | 13.4 µs |
| Auto-select sparse | 16.5 µs |
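A sketch of what auto-metric selection can look like: measure the fraction of zero components and switch to a sparse-friendly metric above some cutoff. The 0.5 threshold and the enum are illustrative assumptions, not the crate's documented behavior.

```rust
#[derive(Debug, PartialEq)]
enum Metric {
    Cosine,  // dense default
    Jaccard, // sparse-friendly
}

// Fraction of components that are exactly zero.
fn sparsity(v: &[f32]) -> f32 {
    let zeros = v.iter().filter(|x| **x == 0.0).count();
    zeros as f32 / v.len() as f32
}

fn auto_select(v: &[f32]) -> Metric {
    // 0.5 is an illustrative cutoff, not the crate's documented value.
    if sparsity(v) > 0.5 { Metric::Jaccard } else { Metric::Cosine }
}
```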
## Redis Comparison

| System                  | In-Process | Over TCP |
|-------------------------|------------|----------|
| Redis                   | ~60 ns     | ~143 µs  |
| tensor_cache (exact)    | 208 ns     | ~143 µs* |
| tensor_cache (semantic) | 21 µs      | N/A      |

\*Estimated: network latency dominates (99.9% of the total time).
**Key insight:** For embedded, in-process use (no network), Redis's exact lookup is about 3.5x faster than ours (~60 ns vs 208 ns). Over TCP, the typical deployment, both are network-bound at ~143 µs, so the gap disappears. Our differentiator is semantic search (21 µs), which Redis cannot provide.