perf(auth): collapse N+1 in getComputingUnitAccess into one join#5648
Conversation
…late in issues or prs
Ci/template compliance warning
into a single LEFT JOIN Replace the two-DAO lookup (fetch unit, then fetch all of the user's access rows and filter in memory) with one LEFT JOIN returning the unit owner and the caller's privilege for this cuid. Missing unit now resolves to NONE instead of throwing; the sole caller maps both to FORBIDDEN.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #5648 +/- ##
============================================
- Coverage 52.85% 52.85% -0.01%
Complexity 2620 2620
============================================
Files 1090 1090
Lines 42176 42178 +2
Branches 4531 4530 -1
============================================
Hits 22293 22293
- Misses 18573 18576 +3
+ Partials 1310 1309 -1
*This pull request uses carry forward flags. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
✅ No material benchmark regressions detected🟢 2 better · 🔴 0 worse · ⚪ 13 noise (<±5%) · 0 without baseline
Baseline detailsLatest main
Raw CSVconfig_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,460.19,200,128000,435,0.265,21839.71,30613.21,30613.21
1,100,10,64,20,2334.19,2000,1280000,857,0.523,116870.05,144742.12,144742.12
2,1000,10,64,20,20881.12,20000,12800000,958,0.585,1043537.40,1079882.99,1079882.99 |
Yicong-Huang
left a comment
There was a problem hiding this comment.
The direction looks promising. @Ma77Ball a few changes needed:
- can you do a simple time comparison on previous logic vs current logic and put it into PR description? sometimes more SQL queries do not necessarily mean slower.
- we need test coverage.
- please remove unrelated files.
Yicong-Huang
left a comment
There was a problem hiding this comment.
LGTM, thanks for the benchmark result!
What changes were proposed in this PR?
ComputingUnitAccess.getComputingUnitAccess(fetch the unit bycuid, then fetch all of the user's access rows byuidand filter them in memory) with a singleLEFT JOINofWORKFLOW_COMPUTING_UNITtoCOMPUTING_UNIT_USER_ACCESSon(cuid, uid), returning the owner uid and the caller's privilege in one round-trip.WRITE, a present access row yields its privilege, and a null join (no grant) yieldsNONE.NONEinstead of throwing; the sole caller (AccessControlResource) already maps bothNONEand a thrown exception to403 FORBIDDEN, so the externally observable behavior is unchanged.Performance comparison (previous vs current logic)
Measured with a throwaway benchmark against the same embedded Postgres the unit tests use (
MockTexeraDB), exercising the non-owner path (the only path that previously issued a second query). For each grant count the benchmark seeds a user holding that many grants, then times 2000 calls per implementation after a 200-call warmup. Microseconds per call, lower is better:So the rewrite is roughly 2x faster across the board. The speedup stays flat rather than growing with the grant count, which is itself informative: on this loopback setup the dominant cost is the extra database round-trip (two statements vs one), not the in-memory transfer-and-filter of the user's grant rows. Against a real database over a network, where per-statement latency is higher, halving the round-trips should matter at least as much. This addresses the fair concern that "more SQL queries do not necessarily mean slower": here the previous logic issued fewer total joins but paid for an extra round-trip, and measuring confirms the single join wins.
These numbers come from a local micro-benchmark, not committed to the suite, and are absolute timings on embedded Postgres; treat the ratio as the signal, not the raw microseconds.
Any related issues, documentation, discussions?
Closes: #5645
How was this PR tested?
ComputingUnitAccessSpeccovering the owner (WRITE), shared-grant (granted privilege), no-grant (NONE), and missing-unit (NONE) cases.sbt Auth/compileandsbt Auth/test, expect a clean compile of the rewritten jOOQ query and a passing spec.AccessControlResource: request a computing unit as its owner, expectWRITE; as a user holding a shared grant, expect that grant's privilege; as a user with no grant, or for a non-existentcuid, expect403 FORBIDDEN.NONEto403, so the missing-unit path (previously a thrown NPE, nowNONE) still yields403.Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.8 (Claude Code), in compliance with the ASF Generative Tooling Guidance