Skip to content

fix RETURNING, Statement::reset() and Program::abort()#5478

Merged
jussisaurio merged 11 commits intomainfrom
fix-returning-again
Feb 23, 2026
Merged

fix RETURNING, Statement::reset() and Program::abort()#5478
jussisaurio merged 11 commits intomainfrom
fix-returning-again

Conversation

@jussisaurio
Copy link
Collaborator

@jussisaurio jussisaurio commented Feb 20, 2026

Closes #4388
Closes #5490

Problem 1 - RETURNING implemented wrong

We were not executing RETURNING the same way as SQLite. Our way was:

INSERT ROW
RETURN ROW TO CALLER
INSERT ROW
RETURN ROW TO CALLER
...
COMMIT

SQlite does:

INSERT ROW
BUFFER RETURNING RESULT ROW TO TEMP TABLE
INSERT ROW
BUFFER RETURNING RESULT ROW TO TEMP TABLE
...
RETURN ROW FROM TEMP TABLE TO CALLER
RETURN ROW FROM TEMP TABLE TO CALLER
...
COMMIT

This is a problem because a naive implementation of our approach on main would result in this:

INSERT ROW
RETURN ROW TO CALLER
CALLER READS ROW AND DROPS/ABANDONS STATEMENT

Where not all INSERTs happened, so the statement would roll back. Think from user POV: you just executed INSERT INTO ... RETURNING ... and got a result row back, decided not to step through the rest and moved on. Now your tx is rolled back and the inserts didn't happen. Bad.

Our current workaround on main

On main, if there is a DML operation in progress, on Statement drop, we do a best-effort synchronous completion of the entire statement so that the DML effects of the statement persist.

  1. This is not what SQLite does; instead SQLite immediately calls Halt and commits or rollbacks based on the rc (result code) on the VM.
  2. @pereman2 thinks our run-statement-to-completion behavior is the cause of the bug Autocommit INSERT Returns Busy Error But Data Is Committed #5422 for MVCC.

Fix

  1. Align RETURNING bytecode with SQLite; buffer to temp table and return from there.
  2. Set a special has_returned_row flag indicating that we've emitted a ResultRow to caller.
  3. In reset_internal if there's a DML statement (change_cnt_on) and we've emitted a row to caller, this signifies that all the DML is done and we now invoke Halt synchronously to commit the changes. In all other cases we roll back.
  4. Consistent EXPLAIN naming for ephemeral tables

Problem 2 - Statement::reset() was infallible and not handling cleanup properly

Statement::reset() was infallible, although it can error in several points - errors were just logged. This can mask errors and also leave them unhandled, which can result in e.g. lock leaks.

Fix

  • Make reset() fallible; call reset_best_effort() from Statement::Drop - which does abort cleanup on panic unwind to ensure resources are released.

Problem 3 - Program::abort() could leave subjournal in use on error

This a problem because it can cause persistent Busy on future connections since they can't use the subjournal for statement subtransactions.

Fix

Capture error in Program::abort() and return it at the end, but release subjournal in all cases.

Tests

  • Some synchronous integration tests to verify behavior with RETURNING
  • Simulator tests using MemorySimIO for fault injection, to verify that abandoning statement after seeing StepResult::IO rolls back unless RETURNING scanback had already started.
  • Simulator tests to verify that reset() errors do not cause resource leaks (unfreed WAL locks / subjournal use) during panics or errors propagated from reset()

@LeMikaelF
Copy link
Collaborator

Small regression in this PR: changes() now counts rows in double:

CREATE TABLE t1(a INTEGER PRIMARY KEY, b TEXT);

-- Without RETURNING: correct
INSERT INTO t1 VALUES(1, 'a');
SELECT changes();  -- Returns: 1 ✓

-- With RETURNING: doubled
INSERT INTO t1 VALUES(10, 'r') RETURNING *;
SELECT changes();  -- Returns: 2  (expected: 1)

INSERT INTO t1 VALUES(11, 's'), (12, 't') RETURNING *;
SELECT changes();  -- Returns: 4  (expected: 2)

UPDATE t1 SET b = 'y' WHERE a >= 10 RETURNING *;
SELECT changes();  -- Returns: 6  (expected: 3)

DELETE FROM t1 WHERE a >= 10 RETURNING *;
SELECT changes();  -- Returns: 6  (expected: 3)

break;
}
}
Err(e) => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this mean that commits can fail silently?

Copy link
Collaborator Author

@jussisaurio jussisaurio Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - I made reset() fallible in 6847bf2 and added reset_best_effort() for Drop

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had to amend this a bit in 98db50b and c0ac8bc to ensure resources dont leak

41 Insert 0 24 25 0 intkey=r[25] data=r[24]
42 Goto 0 43 0 0
43 Goto 0 44 0 0
44 Rewind 0 50 0 0 Rewind t2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bug in the comment here, this is the ephemeral table's cursor, but the comment says t2

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed this and made ephemeral table namings consistent in EXPLAIN

@jussisaurio
Copy link
Collaborator Author

Small regression in this PR: changes() now counts rows in double:

CREATE TABLE t1(a INTEGER PRIMARY KEY, b TEXT);

-- Without RETURNING: correct
INSERT INTO t1 VALUES(1, 'a');
SELECT changes();  -- Returns: 1 ✓

-- With RETURNING: doubled
INSERT INTO t1 VALUES(10, 'r') RETURNING *;
SELECT changes();  -- Returns: 2  (expected: 1)

INSERT INTO t1 VALUES(11, 's'), (12, 't') RETURNING *;
SELECT changes();  -- Returns: 4  (expected: 2)

UPDATE t1 SET b = 'y' WHERE a >= 10 RETURNING *;
SELECT changes();  -- Returns: 6  (expected: 3)

DELETE FROM t1 WHERE a >= 10 RETURNING *;
SELECT changes();  -- Returns: 6  (expected: 3)

fixed this - ephemeral inserts for returning were not marked with the ephemeral flag. added test.

Problem: Statement::reset_internal skipped abort/rollback cleanup while the thread was panicking. During unwind, Drop called reset_best_effort(), but reset_internal intentionally avoided cleanup paths, so connection transaction state could remain write-open after a panic.

Why this is a problem: a panic in user code while a statement is running can leave transactional state and locks behind, blocking subsequent writes on the same connection and violating best-effort cleanup guarantees for Drop.

Fix: execute reset cleanup paths regardless of std::thread::panicking(), and make reset_best_effort() panic-safe with catch_unwind so Drop never introduces a second panic. Errors are still logged, but cleanup is attempted during unwind.
@jussisaurio jussisaurio changed the title fix: buffer RETURNING results into temp table and scan back & remove confusing statement reset behavior fix RETURNING && stmt.reset() Feb 23, 2026
Problem: reset/drop cleanup had two correctness holes. First, reset_internal aborted pending completion groups directly, which can panic (group callback invoked before all children finish). Second, abort(None, ...) could skip end_statement paths and leave subjournal ownership marked in-use, causing sticky Busy behavior for later statements.

Why this is a problem: abandoning statements during IO or error handling can leave internal transactional resources in a bad state, trigger panics in cleanup, and block subsequent statements despite best-effort reset/drop semantics.

Fix: drain pending statement IO via wait() during reset instead of force-aborting the group completion; always finalize subjournal ownership in Program::abort; clear uses_subjournal in ProgramState::reset as a defensive invariant; and add simulator regressions for cross-connection reset errors, panic-drop cleanup, and subjournal release on reset/drop.
@jussisaurio jussisaurio changed the title fix RETURNING && stmt.reset() fix RETURNING, Statement::reset() and Program::abort() Feb 23, 2026
let get_table_or_index_name = |cursor_id: usize| {
let mut ephemeral_cursors = HashSet::new();
let mut changed = true;
while changed {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be simplified, if I understand correctly it's meant to prevent adding non-ephemeral cursors from OpenDup to ephemeral_cursors, but OpenDup can only ever be used with ephemeral tables, so line 38 can be done inconditionally, and the while loop can be removed.

e,
"Error committing during statement reset",
);
break;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this could be simplified by breaking a Result<()> out of the loop (break e)

@codspeed-hq
Copy link

codspeed-hq bot commented Feb 23, 2026

Merging this PR will improve performance by 12.68%

⚡ 3 improved benchmarks
✅ 276 untouched benchmarks
⏩ 105 skipped benchmarks1

Performance Changes

Mode Benchmark BASE HEAD Efficiency
Simulation cast_float_to_integer 2.4 µs 2.1 µs +12.68%
Simulation cast_text_to_integer 2.8 µs 2.5 µs +10.62%
Simulation rtrim_spaces 1.1 µs 1 µs +8.42%

Comparing fix-returning-again (d5a271f) with main (470dfb4)

Open in CodSpeed

Footnotes

  1. 105 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@jussisaurio jussisaurio merged commit 6d67663 into main Feb 23, 2026
88 checks passed
@jussisaurio jussisaurio deleted the fix-returning-again branch February 23, 2026 17:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

last_insert_rowid() returns 1 after INSERT with RETURNING RETURNING clause should write everything before starting to yield rows to caller

2 participants