Skip to content

Changing doltgres sql logic test harness to run tests in parallel#2864

Open
fulghum wants to merge 5 commits into
mainfrom
fulghum/sql-logic
Open

Changing doltgres sql logic test harness to run tests in parallel#2864
fulghum wants to merge 5 commits into
mainfrom
fulghum/sql-logic

Conversation

@fulghum

@fulghum fulghum commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

@github-actions

github-actions Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor
Main PR
covering_index_scan_postgres 1880.26/s 1851.93/s -1.6%
groupby_scan_postgres 134.55/s 135.58/s +0.7%
index_join_postgres 656.02/s 650.02/s -1.0%
index_join_scan_postgres 805.81/s 797.64/s -1.1%
index_scan_postgres 25.24/s 25.08/s -0.7%
oltp_delete_insert_postgres 814.03/s 844.12/s +3.6%
oltp_insert 696.93/s 710.58/s +1.9%
oltp_point_select 2925.80/s 2917.99/s -0.3%
oltp_read_only 2898.50/s 2943.17/s +1.5%
oltp_read_write 2341.06/s 2326.21/s -0.7%
oltp_update_index 737.72/s 725.82/s -1.7%
oltp_update_non_index 780.92/s 763.15/s -2.3%
oltp_write_only 1784.76/s 1770.53/s -0.8%
select_random_points 1898.82/s 1865.51/s -1.8%
select_random_ranges 1094.38/s 1094.41/s 0.0%
table_scan_postgres 23.60/s 23.90/s +1.2%
types_delete_insert_postgres 793.01/s 787.59/s -0.7%
types_table_scan_postgres 8.19/s 8.41/s +2.6%

@github-actions

github-actions Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor
Main PR
Total 42090 42090
Successful 18270 18275
Failures 23820 23815
Partial Successes1 5334 5335
Main PR
Successful 43.4070% 43.4189%
Failures 56.5930% 56.5811%

${\color{lightgreen}Progressions (5)}$

case

QUERY: SELECT * FROM CASE_TBL WHERE COALESCE(f,i) = 4;
QUERY: SELECT COALESCE(a.f, b.i, b.j)
  FROM CASE_TBL a, CASE2_TBL b;
QUERY: SELECT *
  FROM CASE_TBL a, CASE2_TBL b
  WHERE COALESCE(a.f, b.i, b.j) = 2;
QUERY: SELECT *
  FROM CASE_TBL a, CASE2_TBL b
  WHERE COALESCE(f,b.i) = 2;

join

QUERY: select a.q2, b.q1
  from int8_tbl a left join int8_tbl b on a.q2 = coalesce(b.q1, 1)
  where coalesce(b.q1, 1) > 0;

Footnotes

  1. These are tests that we're marking as Successful, however they do not match the expected output in some way. This is due to small differences, such as different wording on the error messages, or the column names being incorrect while the data itself is correct.

@fulghum fulghum force-pushed the fulghum/sql-logic branch from 0d14a72 to 8e7a44a Compare June 24, 2026 23:57
@dolthub dolthub deleted a comment from itoqa Bot Jun 25, 2026
@dolthub dolthub deleted a comment from itoqa Bot Jun 25, 2026
@dolthub dolthub deleted a comment from itoqa Bot Jun 25, 2026
@dolthub dolthub deleted a comment from itoqa Bot Jun 25, 2026
@itoqa

itoqa Bot commented Jun 25, 2026

Copy link
Copy Markdown

Ito QA test results

History reset (rebase or force-push detected). Starting test narrative over.

Commit: 8e7a44a: 10 test cases ran, 2 failed ❌, 7 passed ✅, 1 additional finding ⚠️.

Summary

This run covered core SQL behavior plus parallel execution paths, including normal operation and stress or failure edge cases around multi-worker runs and external server use. Basic type-casting, decoding, and job-template generation looked healthy, but reliability in concurrent orchestration paths showed regressions.

Not safe to merge yet — this PR appears to introduce multiple medium-severity defects in the new concurrent execution flow, including behavior changes and crash-prone error handling in worker orchestration paths. A separate medium finding in correctness-job coordination appears pre-existing and should be treated as a caveat, not the main driver of merge risk.

Tests run by Ito

View full run

Result Severity Type Description
Medium severity Isolation Expected behavior was deterministic recovery or explicit unrecoverable-state termination messaging after DROP succeeded and CREATE failed. Actual behavior returns an Init error from initWorker that leads to panic in concurrent execution, causing abrupt worker failure.
Medium severity Orchestration The worker-threshold branch changes harness family in external-server mode, so workers no longer only controls parallelism; it switches to Doltgres-specific worker behavior.
Cast COALESCE with mixed numeric argument types followed the assignment-cast path and returned values consistent with expected PostgreSQL-like behavior, including an arithmetic check that confirmed the widened result type.
Cast COALESCE returned the same int4 cast-evaluation error as a plain text-to-int4 cast, confirming the assignment-cast error path is surfaced immediately instead of being hidden by fallback conversion.
Correctness Generating the correctness Kubernetes job JSON with valid inputs produced valid output where --workers=10 appeared exactly once and the expected correctness arguments were preserved.
Decoding FLOAT4 query results decoded as real-number schema token R and matched expected numeric output.
Isolation Concurrent execution used distinct worker databases (sqllogictest_1 and sqllogictest_2) with no cross-file state leakage observed, so the isolation behavior under test passed.
Orchestration Managed run mode with --workers=3 started one shared doltgres server, fanned test files out to worker-isolated databases (sqllogictest_1 and sqllogictest_2), and completed all statements successfully.
Orchestration Running with --workers=100 on a single test file completed successfully in 3 seconds with clean server startup and shutdown, showing no hang or deadlock under the tested high-worker input.
⚠️ Medium severity Correctness The script does not enforce exclusive ownership for PR-scoped job names, so concurrent runs can delete and replace each other's Kubernetes Job while both exiting with success output.
Additional Findings Details

These findings are unrelated to the current changes but were observed during testing.

🟡 Overlapping PR correctness runs race on Kubernetes job ownership
  • Severity: Medium Medium severity
  • Description: The script does not enforce exclusive ownership for PR-scoped job names, so concurrent runs can delete and replace each other's Kubernetes Job while both exiting with success output.
  • Impact: Concurrent correctness runs for the same PR can silently overwrite each other's execution context, producing unreliable pass/fail outcomes for that PR.
  • Steps to Reproduce:
    1. Start two run-correctness.sh invocations concurrently with identical PR context values (ACTOR, PR_NUMBER, REGRESS_COMP, and PR_BRANCH_REF).
    2. Let both runs execute the PR branch that deletes job/<actor>-<pr> and then sleeps before apply.
    3. Observe that both runs report job.batch/<actor>-<pr> created even though either run can delete and supersede the other's job identity.
  • Stub / mock content: The test used a mocked kubectl behavior and local non-production environment values to safely simulate two overlapping PR correctness runs against the same job identity.
  • Code Analysis: In .github/scripts/sql-correctness/run-correctness.sh, PR mode builds a deterministic jobname from ACTOR and PR_NUMBER (lines 58-62), then unconditionally runs kubectl delete job/$jobname followed by a fixed sleep 45 (lines 90-94), then applies job.json and checks only for a created string (lines 96-104). There is no lock, lease, or run-identity ownership validation, so overlapping runs target the same object and can silently supersede one another. A targeted fix is to add explicit per-job ownership coordination in this script path (for example, fail fast when an in-progress owner exists or use a unique run-scoped job name with ownership checks) before delete/apply.
Evidence Package

Tip

Reply with @itoqa to send us feedback on this test run.

Comment thread testing/logictest/harness/doltgres_server_harness.go
Comment thread testing/logictest/main.go
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant