Skip to content

test(workflow-operator): pin ConnUtil URL composition without DriverManager#5728

Merged
aglinxinyuan merged 3 commits into
apache:mainfrom
Ma77Ball:fix/connutil-spec-driver-race
Jun 16, 2026
Merged

test(workflow-operator): pin ConnUtil URL composition without DriverManager#5728
aglinxinyuan merged 3 commits into
apache:mainfrom
Ma77Ball:fix/connutil-spec-driver-race

Conversation

@Ma77Ball

@Ma77Ball Ma77Ball commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

What changes were proposed in this PR?

  • PostgreSQLConnUtilSpec / MySQLConnUtilSpec were flaky: they ran a multi-step sequence against the process-global java.sql.DriverManager registry (deregister the real driver, register a capturing stub, restore), and the module's tests run un-forked with sbt's default parallelExecution = true, so the suites raced any concurrent suite that touched JDBC in the same JVM.
  • Extracted the JDBC URL composition (the only application-controlled logic) into a pure buildJdbcUrl(host, port, database) in PostgreSQLConnUtil and MySQLConnUtil; the existing connect now calls it, so the public signature and runtime behavior are unchanged.
  • Rewrote both specs to assert URL composition directly on buildJdbcUrl, with no DriverManager mutation and no BeforeAndAfterAll, so the suites are deterministic under parallel execution.
  • Dropped the former setReadOnly / credential / SQLException assertions: those exercise JDBC plumbing rather than application logic and were what forced the global-registry manipulation behind the flake.

Trade-off: the dropped assertions

The previous specs reached connect by mutating the global DriverManager; that seam, not the assertions, is what caused the race. The three dropped behaviors break down as:

Behavior Regression risk Notes
buildJdbcUrl composition Meaningful (kept) The only logic with string composition / ordering that can plausibly regress.
setReadOnly(true) Low A single constant line with no inputs, but a genuine read-only/efficiency contract.
Credential propagation Negligible connect forwards username/password straight to DriverManager; a test mostly asserts the injected seam, not app logic.
SQLException propagation Negligible No try/catch in connect; the test pins "Scala lets exceptions bubble."

These three could be restored without reintroducing the race, via a dependency-injection seam (a package-private openConnection(url, user, pass, opener = DriverManager.getConnection) that tests call with a fake opener plus a Proxy-backed Connection): no global registry mutation, so nothing for parallel suites to race on.

Note

Deliberately did not add the DI seam here. For these utilities, the seam would be permanent production complexity in service of tests that mostly verify the mock.

Any related issues, documentation, discussions?

Closes: #5727

How was this PR tested?

  • Run sbt "WorkflowOperator/testOnly *PostgreSQLConnUtilSpec *MySQLConnUtilSpec" (under JDK 17); expect 12 tests succeeded, 0 failed across 2 suites.
  • Run sbt scalafmtCheckAll; expect success (no formatting diffs).
  • The original race was timing-dependent and not reproducible on demand; the fix removes the shared DriverManager state entirely, so there is no longer cross-suite state to race. Reviewers can confirm by grepping the two specs for DriverManager and BeforeAndAfterAll, both now absent.

Was this PR authored or co-authored using generative AI tooling?

Co-authored with Claude Opus 4.8 in compliance with ASF

@codecov-commenter

codecov-commenter commented Jun 15, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 50.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.82%. Comparing base (9cede6f) to head (a89d339).
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
...mber/operator/source/sql/mysql/MySQLConnUtil.scala 50.00% 1 Missing ⚠️
...tor/source/sql/postgresql/PostgreSQLConnUtil.scala 50.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5728      +/-   ##
============================================
- Coverage     53.07%   52.82%   -0.26%     
+ Complexity     2657     2654       -3     
============================================
  Files          1094     1093       -1     
  Lines         42286    42258      -28     
  Branches       4541     4541              
============================================
- Hits          22444    22322     -122     
- Misses        18530    18634     +104     
+ Partials       1312     1302      -10     
Flag Coverage Δ *Carryforward flag
access-control-service 70.91% <ø> (ø)
agent-service 34.36% <ø> (ø) Carriedforward from 515d372
amber 53.35% <50.00%> (-0.08%) ⬇️
computing-unit-managing-service 1.65% <ø> (ø)
config-service 56.71% <ø> (ø)
file-service 57.06% <ø> (ø)
frontend 47.40% <ø> (-0.53%) ⬇️ Carriedforward from 515d372
pyamber 90.71% <ø> (+0.93%) ⬆️ Carriedforward from 515d372
python 90.73% <ø> (ø) Carriedforward from 515d372
workflow-compiling-service 58.69% <ø> (ø)

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions

github-actions Bot commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

⚠️ Benchmark changes need a look

🟢 0 better · 🔴 4 worse · ⚪ 11 noise (<±5%) · 0 without baseline

Compared against main 9cede6f benchmarked on this same runner, so the delta is largely free of cross-runner hardware noise. The "7d avg" column still reflects the gh-pages dashboard. Treat <±5% as noise unless repeated.

Dashboard · Run

config throughput MB/s latency max Δ latest / 7d
🔴 bs=10 sw=10 sl=64 364 0.222 26,120/42,427/42,427 us 🔴 +24.1% / 🔴 +21.3%
bs=100 sw=10 sl=64 801 0.489 125,129/139,498/139,498 us ⚪ within ±5% / 🔴 +11.4%
bs=1000 sw=10 sl=64 923 0.563 1,083,540/1,116,097/1,116,097 us ⚪ within ±5% / 🔴 -11.4%
Baseline details

Latest main 9cede6f from same runner

config metric PR latest main 7d avg Δ latest Δ 7d
bs=10 sw=10 sl=64 throughput 364 tuples/sec 395 tuples/sec 410.82 tuples/sec -7.8% -11.4%
bs=10 sw=10 sl=64 MB/s 0.222 MB/s 0.241 MB/s 0.251 MB/s -7.9% -11.5%
bs=10 sw=10 sl=64 p50 26,120 us 25,568 us 23,785 us +2.2% +9.8%
bs=10 sw=10 sl=64 p95 42,427 us 34,195 us 34,980 us +24.1% +21.3%
bs=10 sw=10 sl=64 p99 42,427 us 34,195 us 34,980 us +24.1% +21.3%
bs=100 sw=10 sl=64 throughput 801 tuples/sec 828 tuples/sec 891.94 tuples/sec -3.3% -10.2%
bs=100 sw=10 sl=64 MB/s 0.489 MB/s 0.505 MB/s 0.544 MB/s -3.2% -10.2%
bs=100 sw=10 sl=64 p50 125,129 us 119,282 us 112,277 us +4.9% +11.4%
bs=100 sw=10 sl=64 p95 139,498 us 137,379 us 139,802 us +1.5% -0.2%
bs=100 sw=10 sl=64 p99 139,498 us 137,379 us 139,802 us +1.5% -0.2%
bs=1000 sw=10 sl=64 throughput 923 tuples/sec 921 tuples/sec 1,041 tuples/sec +0.2% -11.3%
bs=1000 sw=10 sl=64 MB/s 0.563 MB/s 0.562 MB/s 0.635 MB/s +0.2% -11.4%
bs=1000 sw=10 sl=64 p50 1,083,540 us 1,080,707 us 972,714 us +0.3% +11.4%
bs=1000 sw=10 sl=64 p95 1,116,097 us 1,153,782 us 1,023,057 us -3.3% +9.1%
bs=1000 sw=10 sl=64 p99 1,116,097 us 1,153,782 us 1,023,057 us -3.3% +9.1%
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,549.29,200,128000,364,0.222,26119.72,42426.98,42426.98
1,100,10,64,20,2496.37,2000,1280000,801,0.489,125129.28,139497.63,139497.63
2,1000,10,64,20,21673.81,20000,12800000,923,0.563,1083540.37,1116096.64,1116096.64

@Ma77Ball

Copy link
Copy Markdown
Contributor Author

/request-review @Yicong-Huang

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes flaky WorkflowOperator JDBC utility tests by removing reliance on JVM-global java.sql.DriverManager state and instead testing the pure, deterministic URL-building logic used by the connection helpers.

Changes:

  • Extracted JDBC URL composition into package-private buildJdbcUrl(host, port, database) helpers for both PostgreSQL and MySQL connection utils.
  • Updated connect implementations to delegate URL construction to the new helpers (behavior unchanged).
  • Rewrote PostgreSQLConnUtilSpec and MySQLConnUtilSpec to assert URL composition directly (no driver deregistration/registration, no shared global state).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/source/sql/postgresql/PostgreSQLConnUtil.scala Adds buildJdbcUrl and reuses it from connect to keep URL logic pure/testable.
common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/source/sql/mysql/MySQLConnUtil.scala Adds buildJdbcUrl and reuses it from connect, preserving existing URL/query parameter behavior.
common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/source/sql/postgresql/PostgreSQLConnUtilSpec.scala Removes DriverManager mutation and tests URL construction deterministically via buildJdbcUrl.
common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/source/sql/mysql/MySQLConnUtilSpec.scala Removes capturing-driver approach and asserts MySQL URL + required query parameters via buildJdbcUrl.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@Yicong-Huang Yicong-Huang left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for unit test, I can accept that we only test or building the jdbc url, but that's unlikely to break anyway since it is concatenating a string.

But how can we test for real postgres/mysql with parallelism in scala/java test suites? We will have tests in amber-intergration. We are pushing the problem to later which I think we need to find a solution eventually.

I can approve this just to get rid of the flakiness. but let's create an issue to track the underlying problem and find a solution to test.

@Ma77Ball

Ma77Ball commented Jun 16, 2026

Copy link
Copy Markdown
Contributor Author

@Yicong-Huang or @aglinxinyuan, please merge when available.

@aglinxinyuan aglinxinyuan added this pull request to the merge queue Jun 16, 2026
Merged via the queue into apache:main with commit 990b7cb Jun 16, 2026
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Flaky test: PostgreSQLConnUtilSpec races on global DriverManager state under parallel suite execution

5 participants