Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces the 'Timeline Fallback' feature, which allows timeline-related functions (such as last, first, and diff) and window functions to automatically use the first available TIMESTAMP column when the primary key timestamp is missing from the input schema. The changes include a detailed design document, the addition of a new parameter validation attribute for general TIMESTAMP columns, and the update of the 'elapsed' function to utilize this fallback logic. Comprehensive test cases and CI task updates are also included. Feedback focuses on missing implementation files mentioned in the design document, a discrepancy between the documentation and the actual code implementation for parameter checking, and the use of a potentially misleading error code for non-primary timestamp validation.
There was a problem hiding this comment.
Pull request overview
This PR implements “timeline fallback” behavior so timeline/window functions can operate on subqueries or UNION ALL results that don’t expose the primary timestamp column, by falling back to the first available TIMESTAMP column and by treating ORDER BY <timestamp-col> as establishing a valid global timeline.
Changes:
- Parser: add fallback resolution for
_rowtson temp tables and acceptORDER BYon non-primaryTIMESTAMPcolumns as a global timeline (including set operators). - Function validation: relax
elapsed()parameter requirement from “primary timestamp” to “any TIMESTAMP column” via a new parameter attribute. - Tests/docs: add a regression test case (+ CI task entry) and a design/requirements doc for the feature.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| test/ci/cases.task | Adds the new pytest case to CI execution. |
| test/cases/11-Functions/04-Timeseries/test_fun_ts_timeline_fallback.py | New pytest wrapper that runs the .in/.ans golden test. |
| test/cases/11-Functions/04-Timeseries/in/test_timeline_fallback.in | SQL inputs covering subquery/UNION/window scenarios for fallback behavior. |
| test/cases/11-Functions/04-Timeseries/ans/test_timeline_fallback.ans | Expected CLI outputs/errors for the new regression coverage. |
| source/libs/parser/src/parTranslater.c | Implements fallback _rowts resolution and treats timestamp ORDER BY as global timeline in select/set-operator cases; loosens SESSION column validation. |
| source/libs/function/src/builtins.c | Adds TS-column validator and applies it to elapsed() parameter validation. |
| source/libs/function/inc/functionMgtInt.h | Introduces FUNC_PARAM_MUST_BE_TS_COLUMN attribute constant. |
| docs/design/timeline-fallback-requirement.md | New design/requirements document describing behavior and coverage. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
8e9dd52 to
7ac0afc
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| .validDataType = FUNC_PARAM_SUPPORT_TIMESTAMP_TYPE, | ||
| .validNodeType = FUNC_PARAM_SUPPORT_COLUMN_NODE, | ||
| .paramAttribute = FUNC_PARAM_MUST_BE_PRIMTS, | ||
| .paramAttribute = FUNC_PARAM_MUST_BE_TS_COLUMN, |
There was a problem hiding this comment.
elapsed now uses FUNC_PARAM_MUST_BE_TS_COLUMN, but the validation path still maps failures to the “primary timestamp” error text (via TSDB_CODE_FUNC_FUNTION_PARA_PRIMTS). Please ensure the user-facing error for elapsed(<non-timestamp>) is accurate (timestamp column required) after this attribute change.
| .paramAttribute = FUNC_PARAM_MUST_BE_TS_COLUMN, | |
| .paramAttribute = FUNC_PARAM_NO_SPECIFIC_ATTRIBUTE, |
| """ | ||
|
|
||
| import os | ||
| from new_test_framework.utils import tdLog, tdSql, tdCom |
There was a problem hiding this comment.
tdSql is imported but not used in this test module. Please remove the unused import to avoid unnecessary dependencies and keep test modules consistent/clean.
| from new_test_framework.utils import tdLog, tdSql, tdCom | |
| from new_test_framework.utils import tdLog, tdCom |
parser: 无主键时回退到第一个时间戳列(findAndSetTempTableColumn) parser: ORDER BY 非主键 TIMESTAMP 列可建立有效时间线 parser: SESSION 窗口支持非主键 TIMESTAMP 列 builtins: 新增 FUNC_PARAM_MUST_BE_TS_COLUMN/checkTSColumn() builtins: elapsed() 改用宽松 TS 列检查, _irate_partial 保持严格 test: 64 条 SQL 回归测试, 覆盖 A-G 七类场景(TS-5791)
c242f1c to
8501775
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| -- ========================================================== | ||
| -- Primary Key Behavior Matrix Test | ||
| -- Tests original primary key behavior under: | ||
| -- 正序(ascending), 倒序(descending), 乱序(random), 重复值(duplicates) | ||
| -- NULL degraded timestamp | ||
| -- Covers all functions in the 4.4 matrix | ||
| -- ========================================================== | ||
|
|
||
| -- ---- Setup ---- | ||
| DROP DATABASE IF EXISTS test_pk_matrix; | ||
| CREATE DATABASE test_pk_matrix; | ||
| USE test_pk_matrix; | ||
|
|
||
| -- Base table: 5 rows ascending | ||
| CREATE TABLE t_base (ts TIMESTAMP, val INT, st NCHAR(10)); | ||
| INSERT INTO t_base VALUES ('2024-01-01 00:00:01', 10, 'a') ('2024-01-01 00:00:02', 20, 'a') ('2024-01-01 00:00:03', 35, 'b') ('2024-01-01 00:00:04', 40, 'b') ('2024-01-01 00:00:05', 50, 'a'); | ||
|
|
||
| -- Table with NULL values for fill_forward test | ||
| CREATE TABLE t_fill (ts TIMESTAMP, val INT); | ||
| INSERT INTO t_fill VALUES ('2024-01-01 00:00:01', 10) ('2024-01-01 00:00:02', NULL) ('2024-01-01 00:00:03', NULL) ('2024-01-01 00:00:04', 40) ('2024-01-01 00:00:05', NULL); | ||
|
|
||
| -- Composite primary key table (duplicate ts) |
There was a problem hiding this comment.
test_primary_key_matrix.in uses ---prefixed lines as comments, but the TAOS CLI/test runner appears to execute them as SQL and emits DB error: Incomplete SQL statement, which then has to be baked into the .ans. This makes the expected output noisy and can mask real failures. Prefer the comment style used by other .in files in this suite (e.g. ## ...) or remove these lines, then regenerate the .ans accordingly.
| -- ========================================================== | |
| -- Primary Key Behavior Matrix Test | |
| -- Tests original primary key behavior under: | |
| -- 正序(ascending), 倒序(descending), 乱序(random), 重复值(duplicates) | |
| -- NULL degraded timestamp | |
| -- Covers all functions in the 4.4 matrix | |
| -- ========================================================== | |
| -- ---- Setup ---- | |
| DROP DATABASE IF EXISTS test_pk_matrix; | |
| CREATE DATABASE test_pk_matrix; | |
| USE test_pk_matrix; | |
| -- Base table: 5 rows ascending | |
| CREATE TABLE t_base (ts TIMESTAMP, val INT, st NCHAR(10)); | |
| INSERT INTO t_base VALUES ('2024-01-01 00:00:01', 10, 'a') ('2024-01-01 00:00:02', 20, 'a') ('2024-01-01 00:00:03', 35, 'b') ('2024-01-01 00:00:04', 40, 'b') ('2024-01-01 00:00:05', 50, 'a'); | |
| -- Table with NULL values for fill_forward test | |
| CREATE TABLE t_fill (ts TIMESTAMP, val INT); | |
| INSERT INTO t_fill VALUES ('2024-01-01 00:00:01', 10) ('2024-01-01 00:00:02', NULL) ('2024-01-01 00:00:03', NULL) ('2024-01-01 00:00:04', 40) ('2024-01-01 00:00:05', NULL); | |
| -- Composite primary key table (duplicate ts) | |
| ## ========================================================== | |
| ## Primary Key Behavior Matrix Test | |
| ## Tests original primary key behavior under: | |
| ## 正序(ascending), 倒序(descending), 乱序(random), 重复值(duplicates) | |
| ## NULL degraded timestamp | |
| ## Covers all functions in the 4.4 matrix | |
| ## ========================================================== | |
| ## ---- Setup ---- | |
| DROP DATABASE IF EXISTS test_pk_matrix; | |
| CREATE DATABASE test_pk_matrix; | |
| USE test_pk_matrix; | |
| ## Base table: 5 rows ascending | |
| CREATE TABLE t_base (ts TIMESTAMP, val INT, st NCHAR(10)); | |
| INSERT INTO t_base VALUES ('2024-01-01 00:00:01', 10, 'a') ('2024-01-01 00:00:02', 20, 'a') ('2024-01-01 00:00:03', 35, 'b') ('2024-01-01 00:00:04', 40, 'b') ('2024-01-01 00:00:05', 50, 'a'); | |
| ## Table with NULL values for fill_forward test | |
| CREATE TABLE t_fill (ts TIMESTAMP, val INT); | |
| INSERT INTO t_fill VALUES ('2024-01-01 00:00:01', 10) ('2024-01-01 00:00:02', NULL) ('2024-01-01 00:00:03', NULL) ('2024-01-01 00:00:04', 40) ('2024-01-01 00:00:05', NULL); | |
| ## Composite primary key table (duplicate ts) |
| // No timestamp column available: return the first non-NULL row in input order. | ||
| // Once found, skip all subsequent blocks. | ||
| if (pInput->pPTS == NULL) { | ||
| if (pResInfo->numOfRes > 0) { | ||
| return TSDB_CODE_SUCCESS; | ||
| } | ||
| for (int32_t i = pInput->startRowIndex; i < pInput->startRowIndex + pInput->numOfRows; ++i) { | ||
| if (pInputCol->hasNull && colDataIsNull(pInputCol, pInput->totalRows, i, pColAgg)) { | ||
| continue; | ||
| } | ||
| numOfElems++; | ||
| char* data = colDataGetData(pInputCol, i); | ||
| int32_t code = doSaveCurrentVal(pCtx, i, INT64_MAX, NULL, pInputCol->info.type, data); | ||
| if (code != TSDB_CODE_SUCCESS) { |
There was a problem hiding this comment.
In the no-timestamp fallback path, doSaveCurrentVal is called with currentTs=INT64_MAX. During partial-result merge, SFirstLastRes.ts is used to decide which side wins; using a constant makes the final first() result depend on merge order across threads/vnodes rather than true input order. Consider carrying a deterministic scan-order key (e.g., global row sequence) for the no-TS mode, or ensure merge logic preserves the earliest encountered row across partials.
| // No timestamp column available: return the last non-NULL row in input order. | ||
| if (pInput->pPTS == NULL) { | ||
| for (int32_t i = pInput->numOfRows + pInput->startRowIndex - 1; i >= pInput->startRowIndex; --i) { | ||
| if (pInputCol->hasNull && colDataIsNull(pInputCol, pInput->totalRows, i, pColAgg)) { | ||
| continue; | ||
| } | ||
| numOfElems++; | ||
| char* data = colDataGetData(pInputCol, i); | ||
| int32_t code = doSaveCurrentVal(pCtx, i, INT64_MIN, NULL, type, data); | ||
| if (code != TSDB_CODE_SUCCESS) { |
There was a problem hiding this comment.
In the no-timestamp fallback path, doSaveCurrentVal is called with currentTs=INT64_MIN. Since SFirstLastRes.ts participates in partial-result merge, using a constant can make the final last() depend on reducer merge order (nondeterministic) rather than true input order. Consider tracking a deterministic scan-order key for no-TS mode or adjusting merge semantics so the latest encountered row is preserved consistently across partials.
- last/first/last_row work without timestamp column - Allow TIMESTAMP expressions as fallback timeline - Fix first() no-TS multi-block result overwrite - Expand test coverage to 230+ cases (NULL, reverse, random order, PARTITION BY, supertable scenarios)
Scenarios 3 (ORDER BY val) and 5 (ORDER BY t2) behave differently between 3.0 and feature branch, so they belong in degraded_timeline. PK baseline now only has scenarios 1/2/4 which are version-stable.
- Added back test_timeline_fallback method (was on remote but missing locally) - Removed test_primary_key_matrix.in/.ans (superseded by pk_baseline + degraded_timeline) - All 3 test methods pass: timeline_fallback, pk_baseline, degraded_timeline
- derivative/diff/twa/irate: skip duplicate ts, error on disorder - lag/lead/fill_forward/SESSION: monotonicity-based disorder detection - SESSION: skip monotonicity check for NULL/epoch-0 timestamps - interp: enable on degraded timeline via parser time range fallback - Sync all .ans files with corrected behaviors
8501775 to
d44acb8
Compare
Description
Issue(s)
Checklist
Please check the items in the checklist if applicable.