[Refactor] Add ExecExpr typed expression hierarchy for physical plan layer#71251
[Refactor] Add ExecExpr typed expression hierarchy for physical plan layer#71251HangyuanLiu wants to merge 4 commits intoStarRocks:mainfrom
Conversation
…layer Introduce a typed, immutable ExecExpr hierarchy that serves as the physical plan's expression representation, decoupling it from the AST Expr classes. This is a pure addition with no changes to existing code paths. New components: - ExecExpr base class and 28 concrete types (ExecSlotRef, ExecFunctionCall, ExecLiteral, ExecCast, ExecBinaryPredicate, etc.) - ExecExprVisitor for type-safe traversal - ExecExprSerializer for ExecExpr → TExpr Thrift serialization - ExecExprExplain for EXPLAIN formatting - ScalarOperatorToExecExpr converter (optimizer ScalarOperator → ExecExpr) - ThriftEnumConverter for keys type / opcode conversions - ExprOpcodeRegistry for centralized TExprOpcode mappings Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 61b83df019
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| public boolean isNullable() { | ||
| return children.get(0).isNullable(); | ||
| } |
There was a problem hiding this comment.
Preserve cast nullability for non-compatible type conversions
ExecCast.isNullable() currently returns only the child nullability, but casts like VARCHAR -> INT can yield NULL even when the input expression is non-null (e.g., invalid numeric text). This marks those casts as non-null in serialized TExprNode.is_nullable, which can cause downstream plan/execution logic to skip required null handling and produce incorrect behavior.
Useful? React with 👍 / 👎.
| public boolean isSelfMonotonic() { | ||
| return true; |
There was a problem hiding this comment.
Mark cast nodes as non-monotonic unless proven safe
ExecCast.isSelfMonotonic() unconditionally returns true, but cast monotonicity is not guaranteed (for example, narrowing casts can overflow to NULL, which breaks monotonic order). Because ExecExprSerializer propagates this flag to TExprNode.is_monotonic, this can enable incorrect predicate/zone-map style pruning decisions and lead to wrong query results.
Useful? React with 👍 / 👎.
| public static boolean isBoundByTupleIds(ExecExpr expr, List<TupleId> tids) { | ||
| if (expr instanceof ExecSlotRef) { | ||
| ExecSlotRef slotRef = (ExecSlotRef) expr; | ||
| return tids.contains(slotRef.getTupleId()); |
There was a problem hiding this comment.
Handle lambda slot refs without tuple parents in bound checks
ExecExprUtils.isBoundByTupleIds() assumes every ExecSlotRef has a tuple parent and calls getTupleId() directly. However, ScalarOperatorToExecExpr.visitLambdaFunctionOperator() creates lambda argument slots from new SlotDescriptor(...) without assigning a parent tuple, so bound checks on lambda-containing expression trees can throw NullPointerException instead of returning a boolean.
Useful? React with 👍 / 👎.
108 tests covering: - ExecExpr type construction and properties (SlotRef, Literal, FunctionCall, Cast, BinaryPredicate) - ExecExprVisitor dispatch correctness - ExecExprSerializer: ExecExpr → TExpr serialization - ExecExprExplain: EXPLAIN and verbose formatting - ThriftEnumConverter: KeysType and opcode conversions - ExprOpcodeRegistry: opcode lookups for all operator types Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Increase FE incremental coverage from 26% to target by adding: - 126 new tests in ExecExprTest covering all 20 previously-uncovered ExecExpr subclasses (CompoundPredicate, LikePredicate, InPredicate, IsNullPredicate, MatchExpr, Arithmetic, CaseWhen, BetweenPredicate, ArrayExpr, ArraySlice, MapExpr, Clone, DictMapping, DictQuery, DictionaryGet, Subfield, LambdaFunction, PlaceHolder, InformationFunction) and ExecExprUtils - 40 new tests in ScalarOperatorToExecExprTest covering all visitor methods (ColumnRef, Constant, Call, Cast, BinaryPredicate, CompoundPredicate, InPredicate, IsNull, Like, CaseWhen, Array, Map, Lambda, Subfield, Between, buildIgnoreSlot) Total: 274 tests (234 ExecExprTest + 40 ScalarOperatorToExecExprTest) Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Add 135 more tests (369 total in ExecExprTest) targeting specific coverage gaps reported by CI: - ExecExprExplain: verbose/toSql/explain for ALL 22 ExecExpr types - ExecCollectionElement: construction, toThrift, visitor, serialize - ThriftEnumConverter: joinOperatorToThrift, assertionToThrift - ExecExprSerializer: serialize all remaining types (Compound, In, IsNull, Like, Arithmetic, CaseWhen, Match, Array, Map, Clone, Subfield, Lambda, PlaceHolder, InfoFunc, DictionaryGet, DictQuery) - ExecLiteral: BigInt, Float, Double, Decimal, Date, DateTime types - ExecExprVisitor: default fallback for all 22 visitor methods - ExecSlotRef: setNullable with descriptor, CHAR→VARCHAR conversion Total: 409 tests (369 ExecExprTest + 40 ScalarOperatorToExecExprTest) Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
[Java-Extensions Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
[FE Incremental Coverage Report]✅ pass : 1240 / 1452 (85.40%) file detail
|
[BE Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
|



Why I'm doing:
The current FE uses AST
Exprclasses for both parsing/analysis and physical plan execution. This tight coupling prevents moving expression classes (e.g.,FunctionCallExpr,SlotRef) to thefe-parsermodule and makes the planner depend on mutable AST state. A separate, typed, immutable expression hierarchy for the physical plan layer is needed.What I'm doing:
Introduce the
ExecExprhierarchy — a typed, immutable expression representation for the physical plan layer, fully decoupled from ASTExpr. This is a pure addition: 33 new files, zero modifications to existing code.New components:
ExecExprbase class + 28 concrete types (ExecSlotRef,ExecFunctionCall,ExecLiteral,ExecCast,ExecBinaryPredicate,ExecCompoundPredicate, etc.)ExecExprVisitor— type-safe visitor for traversalExecExprSerializer— serializesExecExpr→ ThriftTExprExecExprExplain— EXPLAIN formatting forExecExprScalarOperatorToExecExpr— converts optimizerScalarOperator→ExecExprThriftEnumConverter— centralized keys-type and opcode Thrift conversionsExprOpcodeRegistry— maps AST expression metadata toTExprOpcodeWhat type of PR is this:
Does this PR entail a change in behavior?
Checklist:
Bugfix cherry-pick branch check: