test(workflow-operator): pin ConnUtil URL composition without DriverManager#5728
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5728 +/- ##
============================================
- Coverage 53.07% 52.82% -0.26%
+ Complexity 2657 2654 -3
============================================
Files 1094 1093 -1
Lines 42286 42258 -28
Branches 4541 4541
============================================
- Hits 22444 22322 -122
- Misses 18530 18634 +104
+ Partials 1312 1302 -10
*This pull request uses carry forward flags. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
| config | throughput | MB/s | latency | max Δ latest / 7d | |
|---|---|---|---|---|---|
| 🔴 | bs=10 sw=10 sl=64 | 364 | 0.222 | 26,120/42,427/42,427 us | 🔴 +24.1% / 🔴 +21.3% |
| ⚪ | bs=100 sw=10 sl=64 | 801 | 0.489 | 125,129/139,498/139,498 us | ⚪ within ±5% / 🔴 +11.4% |
| ⚪ | bs=1000 sw=10 sl=64 | 923 | 0.563 | 1,083,540/1,116,097/1,116,097 us | ⚪ within ±5% / 🔴 -11.4% |
Baseline details
Latest main 9cede6f from same runner
| config | metric | PR | latest main | 7d avg | Δ latest | Δ 7d |
|---|---|---|---|---|---|---|
| bs=10 sw=10 sl=64 | throughput | 364 tuples/sec | 395 tuples/sec | 410.82 tuples/sec | -7.8% | -11.4% |
| bs=10 sw=10 sl=64 | MB/s | 0.222 MB/s | 0.241 MB/s | 0.251 MB/s | -7.9% | -11.5% |
| bs=10 sw=10 sl=64 | p50 | 26,120 us | 25,568 us | 23,785 us | +2.2% | +9.8% |
| bs=10 sw=10 sl=64 | p95 | 42,427 us | 34,195 us | 34,980 us | +24.1% | +21.3% |
| bs=10 sw=10 sl=64 | p99 | 42,427 us | 34,195 us | 34,980 us | +24.1% | +21.3% |
| bs=100 sw=10 sl=64 | throughput | 801 tuples/sec | 828 tuples/sec | 891.94 tuples/sec | -3.3% | -10.2% |
| bs=100 sw=10 sl=64 | MB/s | 0.489 MB/s | 0.505 MB/s | 0.544 MB/s | -3.2% | -10.2% |
| bs=100 sw=10 sl=64 | p50 | 125,129 us | 119,282 us | 112,277 us | +4.9% | +11.4% |
| bs=100 sw=10 sl=64 | p95 | 139,498 us | 137,379 us | 139,802 us | +1.5% | -0.2% |
| bs=100 sw=10 sl=64 | p99 | 139,498 us | 137,379 us | 139,802 us | +1.5% | -0.2% |
| bs=1000 sw=10 sl=64 | throughput | 923 tuples/sec | 921 tuples/sec | 1,041 tuples/sec | +0.2% | -11.3% |
| bs=1000 sw=10 sl=64 | MB/s | 0.563 MB/s | 0.562 MB/s | 0.635 MB/s | +0.2% | -11.4% |
| bs=1000 sw=10 sl=64 | p50 | 1,083,540 us | 1,080,707 us | 972,714 us | +0.3% | +11.4% |
| bs=1000 sw=10 sl=64 | p95 | 1,116,097 us | 1,153,782 us | 1,023,057 us | -3.3% | +9.1% |
| bs=1000 sw=10 sl=64 | p99 | 1,116,097 us | 1,153,782 us | 1,023,057 us | -3.3% | +9.1% |
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,549.29,200,128000,364,0.222,26119.72,42426.98,42426.98
1,100,10,64,20,2496.37,2000,1280000,801,0.489,125129.28,139497.63,139497.63
2,1000,10,64,20,21673.81,20000,12800000,923,0.563,1083540.37,1116096.64,1116096.64|
/request-review @Yicong-Huang |
There was a problem hiding this comment.
Pull request overview
This PR fixes flaky WorkflowOperator JDBC utility tests by removing reliance on JVM-global java.sql.DriverManager state and instead testing the pure, deterministic URL-building logic used by the connection helpers.
Changes:
- Extracted JDBC URL composition into package-private
buildJdbcUrl(host, port, database)helpers for both PostgreSQL and MySQL connection utils. - Updated
connectimplementations to delegate URL construction to the new helpers (behavior unchanged). - Rewrote
PostgreSQLConnUtilSpecandMySQLConnUtilSpecto assert URL composition directly (no driver deregistration/registration, no shared global state).
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/source/sql/postgresql/PostgreSQLConnUtil.scala | Adds buildJdbcUrl and reuses it from connect to keep URL logic pure/testable. |
| common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/source/sql/mysql/MySQLConnUtil.scala | Adds buildJdbcUrl and reuses it from connect, preserving existing URL/query parameter behavior. |
| common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/source/sql/postgresql/PostgreSQLConnUtilSpec.scala | Removes DriverManager mutation and tests URL construction deterministically via buildJdbcUrl. |
| common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/source/sql/mysql/MySQLConnUtilSpec.scala | Removes capturing-driver approach and asserts MySQL URL + required query parameters via buildJdbcUrl. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Yicong-Huang
left a comment
There was a problem hiding this comment.
for unit test, I can accept that we only test or building the jdbc url, but that's unlikely to break anyway since it is concatenating a string.
But how can we test for real postgres/mysql with parallelism in scala/java test suites? We will have tests in amber-intergration. We are pushing the problem to later which I think we need to find a solution eventually.
I can approve this just to get rid of the flakiness. but let's create an issue to track the underlying problem and find a solution to test.
|
@Yicong-Huang or @aglinxinyuan, please merge when available. |
What changes were proposed in this PR?
PostgreSQLConnUtilSpec/MySQLConnUtilSpecwere flaky: they ran a multi-step sequence against the process-globaljava.sql.DriverManagerregistry (deregister the real driver, register a capturing stub, restore), and the module's tests run un-forked with sbt's defaultparallelExecution = true, so the suites raced any concurrent suite that touched JDBC in the same JVM.buildJdbcUrl(host, port, database)inPostgreSQLConnUtilandMySQLConnUtil; the existingconnectnow calls it, so the public signature and runtime behavior are unchanged.buildJdbcUrl, with noDriverManagermutation and noBeforeAndAfterAll, so the suites are deterministic under parallel execution.setReadOnly/ credential /SQLExceptionassertions: those exercise JDBC plumbing rather than application logic and were what forced the global-registry manipulation behind the flake.Trade-off: the dropped assertions
The previous specs reached
connectby mutating the globalDriverManager; that seam, not the assertions, is what caused the race. The three dropped behaviors break down as:buildJdbcUrlcompositionsetReadOnly(true)connectforwardsusername/passwordstraight toDriverManager; a test mostly asserts the injected seam, not app logic.SQLExceptionpropagationtry/catchinconnect; the test pins "Scala lets exceptions bubble."These three could be restored without reintroducing the race, via a dependency-injection seam (a package-private
openConnection(url, user, pass, opener = DriverManager.getConnection)that tests call with a fake opener plus aProxy-backedConnection): no global registry mutation, so nothing for parallel suites to race on.Note
Deliberately did not add the DI seam here. For these utilities, the seam would be permanent production complexity in service of tests that mostly verify the mock.
Any related issues, documentation, discussions?
Closes: #5727
How was this PR tested?
sbt "WorkflowOperator/testOnly *PostgreSQLConnUtilSpec *MySQLConnUtilSpec"(under JDK 17); expect 12 tests succeeded, 0 failed across 2 suites.sbt scalafmtCheckAll; expect success (no formatting diffs).DriverManagerstate entirely, so there is no longer cross-suite state to race. Reviewers can confirm by grepping the two specs forDriverManagerandBeforeAndAfterAll, both now absent.Was this PR authored or co-authored using generative AI tooling?
Co-authored with Claude Opus 4.8 in compliance with ASF