Fix statement close behavior for direct results in SEA#1400
Open
gopalldb wants to merge 18 commits intodatabricks:mainfrom
Open
Fix statement close behavior for direct results in SEA#1400gopalldb wants to merge 18 commits intodatabricks:mainfrom
gopalldb wants to merge 18 commits intodatabricks:mainfrom
Conversation
- Drop Windows from cache warmer matrix — Windows runners in
databricks-protected-runner-group lack bash (command not found)
- Remove runner.os from cache key — Maven JARs/POMs are platform-
independent, so one cache entry serves both Linux and Windows
- Cache key is now: maven-deps-{hash(pom.xml)}
Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
Co-authored-by: Isaac
Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
For direct results in SEA, the server returns CLOSED state with inline results. Previously, markAsClosed() set isClosed=true on the JDBC Statement, preventing re-execution and getResultSet() calls. Per JDBC spec, a Statement should remain open until the user explicitly closes it. Changes: - Replace markAsClosed() with markDirectResultsReceived() which sets a new directResultsReceived flag WITHOUT closing the JDBC Statement - Statement remains open for re-execution after direct results - Re-execution implicitly closes the previous ResultSet (per JDBC spec) - close() skips server-side closeStatement when directResultsReceived (server already cleaned up the operation handle) - getExecutionResult() returns cached result for direct results instead of making an RPC that would fail with "not found" - directResultsReceived resets on each new execution Signed-off-by: Gopal Lal <gopal.lal@databricks.com> Co-authored-by: Isaac Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
Fix corner cases and add missing tests: 1. executeAsync() now resets directResultsReceived and closes previous ResultSet before async execution (was missing, causing stale flag) 2. cancel() skips server RPC when directResultsReceived (server already closed the operation, cancel would fail with "not found") 3. getExecutionResult() throws clear error when directResultsReceived but resultSet is null (defensive, unlikely but possible) 4. Added test: getExecutionResult returns cached result for direct results without making an RPC 5. Added test: cancel after direct results is a no-op 6. Added test: re-execution explicitly closes previous ResultSet Signed-off-by: Gopal Lal <gopal.lal@databricks.com> Co-authored-by: Isaac Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
Thrift also receives direct results when TSparkGetDirectResults is set. After the server returns inline results, the operation handle is done — subsequent closeOperation RPCs would fail with "invalid handle". Changes: - DatabricksThriftAccessor.execute(): call markDirectResultsReceived() when direct results are detected (hasResultDataInDirectResults) - Safe instanceof check before casting to DatabricksStatement (test mocks use java.sql.Statement interface) - DatabricksStatement: executeAsync resets directResultsReceived and closes previous ResultSet; cancel() skips RPC for direct results; getExecutionResult() throws clear error if result is null Note: Thrift executeAsync does NOT request direct results (uses runAsync=true), so the async path is unaffected. Signed-off-by: Gopal Lal <gopal.lal@databricks.com> Co-authored-by: Isaac Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
Even with waitTimeout=0s, the server may complete instantly and return CLOSED state with inline results in executeStatementAsync. Added the same markDirectResultsReceived guard to the SEA async path so that getExecutionResult() returns the cached result instead of making an RPC that fails with "not found". Also includes the Thrift direct results fix from the previous commit. Signed-off-by: Gopal Lal <gopal.lal@databricks.com> Co-authored-by: Isaac Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
Signed-off-by: Gopal Lal <gopal.lal@databricks.com> Co-authored-by: Isaac Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
SEA async execution (waitTimeout=0s) never returns direct results — the server always returns PENDING/RUNNING state. Removed the unnecessary CLOSED state check and markDirectResultsReceived call from executeStatementAsync in DatabricksSdkClient. The reset of directResultsReceived in DatabricksStatement.executeAsync() is still needed to clear a stale flag from a previous sync execution. Signed-off-by: Gopal Lal <gopal.lal@databricks.com> Co-authored-by: Isaac Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
For direct results in SEA, the server returns CLOSED state with inline results. Previously,
markAsClosed()setisClosed=trueon the JDBC Statement, preventing re-execution andgetResultSet()calls. Per JDBC spec, a Statement should remain open until the user explicitly callsclose().Problem
executeQuery()thengetResultSet()threw "Statement is closed"getExecutionResult()(async path) made an RPC that returned "not found" since the server already closed the operationFix
Replace
markAsClosed()withmarkDirectResultsReceived()which sets a newdirectResultsReceivedflag WITHOUT closing the JDBC Statement:close()skips server-sidecloseStatementwhendirectResultsReceived(server already cleaned up)getExecutionResult()returns cached result for direct results instead of an RPCTest plan
DatabricksStatementTest-- 85 tests pass (3 updated for new behavior)DatabricksSdkClientTest-- 28 tests pass (1 updated for method rename)