Changing doltgres sql logic test harness to run tests in parallel by fulghum · Pull Request #2864 · dolthub/doltgresql

fulghum · 2026-06-23T17:14:04Z

github-actions · 2026-06-23T17:30:44Z

	Main	PR
covering_index_scan_postgres	1880.26/s	1851.93/s	-1.6%
groupby_scan_postgres	134.55/s	135.58/s	+0.7%
index_join_postgres	656.02/s	650.02/s	-1.0%
index_join_scan_postgres	805.81/s	797.64/s	-1.1%
index_scan_postgres	25.24/s	25.08/s	-0.7%
oltp_delete_insert_postgres	814.03/s	844.12/s	+3.6%
oltp_insert	696.93/s	710.58/s	+1.9%
oltp_point_select	2925.80/s	2917.99/s	-0.3%
oltp_read_only	2898.50/s	2943.17/s	+1.5%
oltp_read_write	2341.06/s	2326.21/s	-0.7%
oltp_update_index	737.72/s	725.82/s	-1.7%
oltp_update_non_index	780.92/s	763.15/s	-2.3%
oltp_write_only	1784.76/s	1770.53/s	-0.8%
select_random_points	1898.82/s	1865.51/s	-1.8%
select_random_ranges	1094.38/s	1094.41/s	0.0%
table_scan_postgres	23.60/s	23.90/s	+1.2%
types_delete_insert_postgres	793.01/s	787.59/s	-0.7%
types_table_scan_postgres	8.19/s	8.41/s	+2.6%

github-actions · 2026-06-23T18:13:59Z

	Main	PR
Total	42090	42090
Successful	18270	18275
Failures	23820	23815
Partial Successes¹	5334	5335

	Main	PR
Successful	43.4070%	43.4189%
Failures	56.5930%	56.5811%

${\color{lightgreen}Progressions (5)}$

case

QUERY: SELECT * FROM CASE_TBL WHERE COALESCE(f,i) = 4;

QUERY: SELECT COALESCE(a.f, b.i, b.j)
  FROM CASE_TBL a, CASE2_TBL b;

QUERY: SELECT *
  FROM CASE_TBL a, CASE2_TBL b
  WHERE COALESCE(a.f, b.i, b.j) = 2;

QUERY: SELECT *
  FROM CASE_TBL a, CASE2_TBL b
  WHERE COALESCE(f,b.i) = 2;

join

QUERY: select a.q2, b.q1
  from int8_tbl a left join int8_tbl b on a.q2 = coalesce(b.q1, 1)
  where coalesce(b.q1, 1) > 0;

These are tests that we're marking as Successful, however they do not match the expected output in some way. This is due to small differences, such as different wording on the error messages, or the column names being incorrect while the data itself is correct. ↩

…can widen types correctly

…branch

itoqa · 2026-06-25T00:53:35Z

History reset (rebase or force-push detected). Starting test narrative over.

Commit: 8e7a44a: 10 test cases ran, 2 failed ❌, 7 passed ✅, 1 additional finding ⚠️.

Summary

This run covered core SQL behavior plus parallel execution paths, including normal operation and stress or failure edge cases around multi-worker runs and external server use. Basic type-casting, decoding, and job-template generation looked healthy, but reliability in concurrent orchestration paths showed regressions.

Not safe to merge yet — this PR appears to introduce multiple medium-severity defects in the new concurrent execution flow, including behavior changes and crash-prone error handling in worker orchestration paths. A separate medium finding in correctness-job coordination appears pre-existing and should be treated as a caveat, not the main driver of merge risk.

Tests run by Ito

Result	Severity	Type	Description
❌		Isolation	Expected behavior was deterministic recovery or explicit unrecoverable-state termination messaging after DROP succeeded and CREATE failed. Actual behavior returns an Init error from initWorker that leads to panic in concurrent execution, causing abrupt worker failure.
❌		Orchestration	The worker-threshold branch changes harness family in external-server mode, so `workers` no longer only controls parallelism; it switches to Doltgres-specific worker behavior.
✅	—	Cast	COALESCE with mixed numeric argument types followed the assignment-cast path and returned values consistent with expected PostgreSQL-like behavior, including an arithmetic check that confirmed the widened result type.
✅	—	Cast	COALESCE returned the same int4 cast-evaluation error as a plain text-to-int4 cast, confirming the assignment-cast error path is surfaced immediately instead of being hidden by fallback conversion.
✅	—	Correctness	Generating the correctness Kubernetes job JSON with valid inputs produced valid output where --workers=10 appeared exactly once and the expected correctness arguments were preserved.
✅	—	Decoding	FLOAT4 query results decoded as real-number schema token R and matched expected numeric output.
✅	—	Isolation	Concurrent execution used distinct worker databases (sqllogictest_1 and sqllogictest_2) with no cross-file state leakage observed, so the isolation behavior under test passed.
✅	—	Orchestration	Managed run mode with --workers=3 started one shared doltgres server, fanned test files out to worker-isolated databases (sqllogictest_1 and sqllogictest_2), and completed all statements successfully.
✅	—	Orchestration	Running with --workers=100 on a single test file completed successfully in 3 seconds with clean server startup and shutdown, showing no hang or deadlock under the tested high-worker input.
⚠️		Correctness	The script does not enforce exclusive ownership for PR-scoped job names, so concurrent runs can delete and replace each other's Kubernetes Job while both exiting with success output.

Additional Findings Details

These findings are unrelated to the current changes but were observed during testing.

🟡 Overlapping PR correctness runs race on Kubernetes job ownership

Severity: Medium
Description: The script does not enforce exclusive ownership for PR-scoped job names, so concurrent runs can delete and replace each other's Kubernetes Job while both exiting with success output.
Impact: Concurrent correctness runs for the same PR can silently overwrite each other's execution context, producing unreliable pass/fail outcomes for that PR.
Steps to Reproduce:
1. Start two run-correctness.sh invocations concurrently with identical PR context values (ACTOR, PR_NUMBER, REGRESS_COMP, and PR_BRANCH_REF).
2. Let both runs execute the PR branch that deletes job/<actor>-<pr> and then sleeps before apply.
3. Observe that both runs report job.batch/<actor>-<pr> created even though either run can delete and supersede the other's job identity.
Stub / mock content: The test used a mocked kubectl behavior and local non-production environment values to safely simulate two overlapping PR correctness runs against the same job identity.
Code Analysis: In .github/scripts/sql-correctness/run-correctness.sh, PR mode builds a deterministic jobname from ACTOR and PR_NUMBER (lines 58-62), then unconditionally runs kubectl delete job/$jobname followed by a fixed sleep 45 (lines 90-94), then applies job.json and checks only for a created string (lines 96-104). There is no lock, lease, or run-identity ownership validation, so overlapping runs target the same object and can silently supersede one another. A targeted fix is to add explicit per-job ownership coordination in this script path (for example, fail fast when an in-progress owner exists or use a unique run-scoped job name with ownership checks) before delete/apply.

Evidence Package

text/CORRECTNESS-2/CORRECTNESS-2-race-reproduction-and-logs.txt — 3 KB

Tip

Reply with @itoqa to send us feedback on this test run.

fulghum added 5 commits June 23, 2026 16:35

Changing doltgres harness to allow running sql logic tests in parallel

de95fba

Add missing types in test harness

59a71c9

Fix to COALESCE() to use a type cast instead of convert to ensure we …

ff92b6c

…can widen types correctly

Bump sqllogictest dependency to latest version from fulghum/doltgres …

c4a2419

…branch

More changes to support running sqllogictests in parallel

8e7a44a

fulghum force-pushed the fulghum/sql-logic branch from 0d14a72 to 8e7a44a Compare June 24, 2026 23:57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Changing doltgres sql logic test harness to run tests in parallel#2864

Changing doltgres sql logic test harness to run tests in parallel#2864
fulghum wants to merge 5 commits into
mainfrom
fulghum/sql-logic

fulghum commented Jun 23, 2026

Uh oh!

github-actions Bot commented Jun 23, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 23, 2026 •

edited

Loading

Uh oh!

itoqa Bot commented Jun 25, 2026

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Uh oh!

Conversation

fulghum commented Jun 23, 2026

Uh oh!

github-actions Bot commented Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

${\color{lightgreen}Progressions (5)}$

case

join

Footnotes

Uh oh!

itoqa Bot commented Jun 25, 2026

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions Bot commented Jun 23, 2026 •

edited

Loading

github-actions Bot commented Jun 23, 2026 •

edited

Loading