
branch-4.1: pick #61696 #61621 #61222 #62596 #62707

Open
Yukang-Lian wants to merge 4 commits into apache:branch-4.1 from
Yukang-Lian:lianyukang/pick-4.1-61696-61621-61222-62596

Conversation

@Yukang-Lian
Collaborator

Cherry-pick the following PRs to branch-4.1:

Conflicts resolved for #61696 (schema table enum renumbering on branch-4.1; be_compaction_tasks uses id 67) and #61621 (AuditLogHelper context + QueueToken import).

… HTTP API (apache#61696)

Introduce `CompactionTaskTracker` to provide **full lifecycle
observability** for base/cumulative/full compaction tasks. This PR adds
two query channels:

1. **`information_schema.be_compaction_tasks` system table** (38
columns) — query via SQL across all BEs
2. **`GET /api/compaction/profile` HTTP API** — query via JSON on a
single BE, with filtering support

Currently, compaction execution metrics (input/output data sizes, row
counts, merge latency, etc.) are lost once the compaction object is
destructed. Operators have no way to inspect historical compaction
performance — only the current running status is available. Users also
cannot see pending or running compaction task details through SQL.

This PR solves both problems with a unified `CompactionTaskTracker`
mechanism that tracks compaction tasks across their full lifecycle:
**PENDING → RUNNING → FINISHED/FAILED**.

```
                         ┌── Push ──┐
  Compaction entry points ──→ CompactionTaskTracker (in-memory) ←── Push ── Compaction base class
   (register_task)            │  _active_tasks (PENDING/RUNNING)          (complete/fail)
                              │  _completed_tasks (FINISHED/FAILED)
                              │
               ┌──── Pull ────┴──── Pull ────┐
               ↓                              ↓
   SchemaCompactionTasksScanner     CompactionProfileAction
    get_all_tasks() snapshot         get_completed_tasks() filtered
               ↓                              ↓
   FE fan-out to all BEs → SQL      JSON HTTP Response
```

- **Push-based collection**: Compaction entry points and execution layer
push task info at lifecycle boundaries
- **Pull-based querying**: System table scanner and HTTP action pull
snapshots with read locks, never blocking compaction
- **Pure in-memory storage**: Two containers, no persistence. Data
clears on BE restart.
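The push/pull split above can be sketched as follows. This is an illustrative Python sketch, not the actual C++ `CompactionTaskTracker`; all names here mirror the description (`register_task`, `get_all_tasks`) but the shapes are assumptions.

```python
import threading
import time
from collections import deque

class TaskTrackerSketch:
    """Illustrative sketch of the push/pull pattern described above.

    _active holds PENDING/RUNNING tasks keyed by compaction id;
    _completed is a bounded FIFO of FINISHED/FAILED records.
    """

    def __init__(self, max_records=10000):
        self._lock = threading.Lock()
        self._active = {}                          # compaction_id -> task dict
        self._completed = deque(maxlen=max_records)

    # -- push side: called by compaction entry points / execution layer --
    def register_task(self, cid, tablet_id):
        with self._lock:
            self._active[cid] = {"compaction_id": cid, "tablet_id": tablet_id,
                                 "status": "PENDING",
                                 "scheduled_time": time.time()}

    def start_task(self, cid):
        with self._lock:
            self._active[cid].update(status="RUNNING", start_time=time.time())

    def complete_task(self, cid, success=True, msg=""):
        with self._lock:
            task = self._active.pop(cid)
            task.update(status="FINISHED" if success else "FAILED",
                        end_time=time.time(), status_msg=msg)
            self._completed.append(task)           # oldest record evicted at cap

    # -- pull side: scanner / HTTP action take a snapshot under the lock --
    def get_all_tasks(self):
        with self._lock:
            return list(self._active.values()) + list(self._completed)
```

Because `get_all_tasks()` copies under the lock and returns, a slow SQL scan or HTTP response never holds the lock while serializing, which is the "never blocking compaction" property claimed above.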

---

| Column | Type | Description |
|--------|------|-------------|
| BACKEND_ID | BIGINT | BE node ID |
| COMPACTION_ID | BIGINT | Unique task ID |
| TABLE_ID | BIGINT | Table ID |
| PARTITION_ID | BIGINT | Partition ID |
| TABLET_ID | BIGINT | Tablet ID |
| COMPACTION_TYPE | VARCHAR | `base` / `cumulative` / `full` |
| STATUS | VARCHAR | `PENDING` / `RUNNING` / `FINISHED` / `FAILED` |
| TRIGGER_METHOD | VARCHAR | `AUTO` / `MANUAL` / `LOAD_TRIGGERED` |
| COMPACTION_SCORE | BIGINT | Tablet compaction score at registration time |
| SCHEDULED_TIME | DATETIME | Task registration time |
| START_TIME | DATETIME | Execution start time (NULL when PENDING) |
| END_TIME | DATETIME | Execution end time (NULL when not completed) |
| ELAPSED_TIME_MS | BIGINT | RUNNING: real-time `now - start_time`; FINISHED/FAILED: `end - start` |
| INPUT_ROWSETS_COUNT | BIGINT | Number of input rowsets |
| INPUT_ROW_NUM | BIGINT | Number of input rows |
| INPUT_DATA_SIZE | BIGINT | Input data size (bytes) |
| INPUT_INDEX_SIZE | BIGINT | Input index size (bytes) |
| INPUT_TOTAL_SIZE | BIGINT | Input total size (data + index) |
| INPUT_SEGMENTS_NUM | BIGINT | Number of input segments |
| INPUT_VERSION_RANGE | VARCHAR | Input version range, e.g., `[0-5]` |
| MERGED_ROWS | BIGINT | Merged row count |
| FILTERED_ROWS | BIGINT | Filtered row count |
| OUTPUT_ROWS | BIGINT | Merger output rows (0 for ordered compaction) |
| OUTPUT_ROW_NUM | BIGINT | Output rowset row count |
| OUTPUT_DATA_SIZE | BIGINT | Output data size (bytes) |
| OUTPUT_INDEX_SIZE | BIGINT | Output index size (bytes) |
| OUTPUT_TOTAL_SIZE | BIGINT | Output total size (data + index) |
| OUTPUT_SEGMENTS_NUM | BIGINT | Number of output segments |
| OUTPUT_VERSION | VARCHAR | Output version range |
| MERGE_LATENCY_MS | BIGINT | Merge phase latency in ms (0 for ordered compaction) |
| BYTES_READ_FROM_LOCAL | BIGINT | Bytes read from local storage |
| BYTES_READ_FROM_REMOTE | BIGINT | Bytes read from remote storage |
| PEAK_MEMORY_BYTES | BIGINT | Peak memory usage (bytes) |
| IS_VERTICAL | BOOLEAN | Whether vertical compaction was used |
| PERMITS | BIGINT | Compaction permits consumed |
| VERTICAL_TOTAL_GROUPS | BIGINT | Vertical compaction total column groups (0 if horizontal) |
| VERTICAL_COMPLETED_GROUPS | BIGINT | Completed column groups (real-time during RUNNING) |
| STATUS_MSG | VARCHAR | Error message (empty on success) |

**1. View running compaction tasks**

```sql
SELECT BACKEND_ID, TABLET_ID, COMPACTION_TYPE, ELAPSED_TIME_MS, INPUT_DATA_SIZE
FROM information_schema.be_compaction_tasks
WHERE STATUS = 'RUNNING' ORDER BY ELAPSED_TIME_MS DESC;
```

```
+------------+-----------+-----------------+-----------------+-----------------+
| BACKEND_ID | TABLET_ID | COMPACTION_TYPE | ELAPSED_TIME_MS | INPUT_DATA_SIZE |
+------------+-----------+-----------------+-----------------+-----------------+
|      10001 |    123456 | base            |           35210 |       524288000 |
|      10002 |    789012 | cumulative      |            1250 |        10485760 |
+------------+-----------+-----------------+-----------------+-----------------+
```

**2. Track vertical compaction progress**

```sql
SELECT TABLET_ID, VERTICAL_COMPLETED_GROUPS AS done, VERTICAL_TOTAL_GROUPS AS total,
       CONCAT(ROUND(VERTICAL_COMPLETED_GROUPS * 100.0 / NULLIF(VERTICAL_TOTAL_GROUPS, 0), 1), '%') AS progress
FROM information_schema.be_compaction_tasks
WHERE STATUS = 'RUNNING' AND IS_VERTICAL = true;
```

```
+-----------+------+-------+----------+
| TABLET_ID | done | total | progress |
+-----------+------+-------+----------+
|    123456 |    2 |     6 | 33.3%    |
|    234567 |    5 |     8 | 62.5%    |
+-----------+------+-------+----------+
```

**3. Find the slowest finished compactions**

```sql
SELECT TABLET_ID, COMPACTION_TYPE, ELAPSED_TIME_MS, INPUT_DATA_SIZE, OUTPUT_DATA_SIZE
FROM information_schema.be_compaction_tasks
WHERE STATUS = 'FINISHED' ORDER BY ELAPSED_TIME_MS DESC LIMIT 3;
```

```
+-----------+-----------------+-----------------+-----------------+------------------+
| TABLET_ID | COMPACTION_TYPE | ELAPSED_TIME_MS | INPUT_DATA_SIZE | OUTPUT_DATA_SIZE |
+-----------+-----------------+-----------------+-----------------+------------------+
|    123456 | base            |           42000 |       524288000 |        210000000 |
|    345678 | cumulative      |            8500 |        30000000 |         15000000 |
|    456789 | full            |            5200 |        20000000 |         12000000 |
+-----------+-----------------+-----------------+-----------------+------------------+
```

**4. Compaction count by trigger method**

```sql
SELECT TRIGGER_METHOD, COUNT(*) AS cnt
FROM information_schema.be_compaction_tasks
WHERE STATUS = 'FINISHED' GROUP BY TRIGGER_METHOD;
```

```
+----------------+-----+
| TRIGGER_METHOD | cnt |
+----------------+-----+
| AUTO           | 847 |
| MANUAL         |  12 |
| LOAD_TRIGGERED |  53 |
+----------------+-----+
```

---

The `GET /api/compaction/profile` endpoint queries completed compaction profiles on a **single BE** with optional filtering. All parameters can be combined (AND logic).

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `tablet_id` | int64 | No | Filter by tablet ID |
| `top_n` | int64 | No | Return most recent N records |
| `compact_type` | string | No | Filter by type: `base` / `cumulative` / `full` |
| `success` | string | No | Filter by result: `true` / `false` |

```json
{
  "status": "Success",
  "compaction_profiles": [
    {
      "compaction_id": 487,
      "compaction_type": "cumulative",
      "tablet_id": 12345,
      "table_id": 67890,
      "partition_id": 11111,
      "trigger_method": "AUTO",
      "compaction_score": 10,
      "scheduled_time": "2025-07-15 14:02:30",
      "start_time": "2025-07-15 14:02:31",
      "end_time": "2025-07-15 14:02:31",
      "cost_time_ms": 236,
      "success": true,
      "input_rowsets_count": 5,
      "input_row_num": 52000,
      "input_data_size": 10706329,
      "input_index_size": 204800,
      "input_total_size": 10911129,
      "input_segments_num": 5,
      "input_version_range": "[12-16]",
      "merged_rows": 1200,
      "filtered_rows": 50,
      "output_rows": 50750,
      "output_row_num": 50750,
      "output_data_size": 5033164,
      "output_index_size": 102400,
      "output_total_size": 5135564,
      "output_segments_num": 1,
      "output_version": "[12-16]",
      "merge_latency_ms": 180,
      "bytes_read_from_local": 10706329,
      "bytes_read_from_remote": 0,
      "peak_memory_bytes": 33554432,
      "is_vertical": true,
      "permits": 10706329,
      "vertical_total_groups": 4,
      "vertical_completed_groups": 4
    }
  ]
}
```

Failed compactions additionally include `"status_msg": "error message"`.
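The AND combination of the filters can be sketched as below. This is illustrative Python, not the BE implementation; `filter_profiles` and the assumption that records arrive newest-last are hypothetical.

```python
def filter_profiles(profiles, tablet_id=None, top_n=None,
                    compact_type=None, success=None):
    """Apply all supplied filters with AND logic, then keep the
    most recent top_n records (records assumed newest-last)."""
    out = [p for p in profiles
           if (tablet_id is None or p["tablet_id"] == tablet_id)
           and (compact_type is None or p["compaction_type"] == compact_type)
           and (success is None or p["success"] == success)]
    if top_n is not None:
        out = out[-top_n:]   # most recent N
    return out
```

An omitted parameter simply disables its clause, which is why every combination in the curl examples below is valid.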

```bash
curl "http://BE_HOST:BE_HTTP_PORT/api/compaction/profile?top_n=10"

curl "http://BE_HOST:BE_HTTP_PORT/api/compaction/profile?tablet_id=12345&top_n=5"

curl "http://BE_HOST:BE_HTTP_PORT/api/compaction/profile?compact_type=base&success=false"

curl "http://BE_HOST:BE_HTTP_PORT/api/compaction/profile?tablet_id=12345&top_n=3&compact_type=cumulative&success=false"
```

---

| Config | Type | Default | Description |
|--------|------|---------|-------------|
| `enable_compaction_task_tracker` | mutable bool | `true` | Master switch. When disabled, all push operations are no-ops and queries return empty. |
| `compaction_task_tracker_max_records` | mutable int32 | `10000` | Max completed task records kept in memory (~500 bytes/record, default ~5MB). |
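The ~5MB figure follows directly from the per-record estimate; a quick check (the ~500 bytes/record number is the estimate stated above, not a measured value):

```python
record_bytes = 500           # approximate size of one completed-task record
max_records = 10000          # compaction_task_tracker_max_records default
total_mb = record_bytes * max_records / (1024 * 1024)
print(f"~{total_mb:.1f} MB")  # ~4.8 MB, i.e. roughly the ~5MB stated
```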

---

- **Tracked**: base / cumulative / full compaction (local + cloud mode)
- **Not tracked**: single_replica, cold_data, index_change compaction
(low frequency, limited diagnostic value)
- **Trigger methods**: AUTO (background scheduling), MANUAL (HTTP API),
LOAD_TRIGGERED (load-time cumulative)

---

- [x] 15 unit tests (`CompactionTaskTrackerTest.*`) covering full
lifecycle, failure paths, TRY_LOCK_FAILED cleanup, vertical progress,
concurrent safety, config switch, filters
- [x] Regression test: `test_be_compaction_tasks` (system table)
- [x] Regression test: `test_compaction_profile_action` (HTTP API)
…dit log disabled (apache#61621)

- When `enable_prepared_stmt_audit_log = false` (default), `auditAfterExec()` is entirely skipped for prepared statements. Since QPS metric counting lives inside `AuditLogHelper.logAuditLog()`, these metrics are also lost for prepared statement SELECT queries.
- Add `AuditLogHelper.updateMetrics()` to count metrics independently of audit log writing, and call it in the `else` branch when audit log is disabled.
- This decouples the metric counting from audit log writing without modifying the existing `logAuditLogImpl()` logic.
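The decoupling can be sketched as below. This is illustrative Python; the real change lives in the Java `AuditLogHelper` and `MysqlConnectProcessor`, and `write_audit_entry` is a hypothetical stand-in for the audit-log write.

```python
metrics = {"query_total": 0, "query_err": 0}

def write_audit_entry(stmt):
    pass  # stand-in for the actual audit-log write

def update_metrics(is_query, ok):
    """Standalone metric counting, mirroring updateMetrics():
    runs whether or not the audit log is written."""
    if not is_query:
        return
    metrics["query_total"] += 1
    if not ok:
        metrics["query_err"] += 1

def log_audit_log(stmt, is_query, ok):
    update_metrics(is_query, ok)
    write_audit_entry(stmt)

def handle_execute(stmt, is_query, ok, audit_enabled):
    # Before the fix, metrics were only counted inside log_audit_log(),
    # so disabling the audit log also dropped the QPS metrics.
    if audit_enabled:
        log_audit_log(stmt, is_query, ok)
    else:
        update_metrics(is_query, ok)   # the new else branch
```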

- **AuditLogHelper.java**: Add `updateMetrics()` + `updateMetricsImpl()` methods for standalone metric counting
- **MysqlConnectProcessor.java**: Add `else` branch in `handleExecute()` to call `updateMetrics()` when audit log is disabled.
- **AuditLogHelperTest.java**: Unit tests covering OK query, ERR query, non-query (INSERT), and debug mode short-circuit scenarios.

- [x] Unit tests pass (`AuditLogHelperTest` - 4 cases)
…ool instead of detached thread (apache#61222)

Previously, manually triggering full compaction on a single tablet via
HTTP API would create a detached thread for each request, which lacks
concurrency control, deduplication, and metrics tracking.

This PR changes the single-tablet full compaction path to use
`submit_compaction_task()`, submitting to the base compaction thread
pool — consistent with the table-level path and the cloud engine. Base
and cumulative compaction remain unchanged.

Added `force` HTTP parameter to skip permit limiter when submitting full
compaction, allowing compaction to proceed even when resources are
constrained.

### Usage

```bash
# single tablet, default (with permit limiter)
curl -X POST "http://be_host:http_port/api/compaction/run?tablet_id=12345&compact_type=full"

# single tablet, force (skip permit limiter)
curl -X POST "http://be_host:http_port/api/compaction/run?tablet_id=12345&compact_type=full&force=true"

# table level
curl -X POST "http://be_host:http_port/api/compaction/run?table_id=67890&compact_type=full"

# table level, force
curl -X POST "http://be_host:http_port/api/compaction/run?table_id=67890&compact_type=full&force=true"
```
… it in cloud mode (apache#62596)

## Summary

Unify the `ADMIN COMPACT TABLE ... WHERE type = '...'` SQL entry point so that BASE, CUMULATIVE, and FULL all work on both the local and cloud (storage-compute split) engines. Before this change, FULL was rejected by the FE and cloud mode was blocked outright, forcing users to curl each BE's HTTP API to run manual compaction.

## What changes

### FE

- `AdminCompactTableCommand`: add `FULL` to `CompactionType`, relax `analyzeWhere`, map `FULL -> "full"` in `getCompactionType`, and drop the cloud-mode gate so the command dispatches via the normal Agent RPC path.
- `CloudEnv.compactTable`: override `Env.compactTable` to walk partitions/indices/tablets, resolve the primary BE via `Replica.getBackendId()`, skip tablets whose compute group has no available BE, and raise `DdlException` if nothing was dispatched.

### BE

- `submit_table_compaction_callback`: strictly match `base`/`cumulative`/`full` instead of silently downgrading any non-`"base"` value to `CUMULATIVE_COMPACTION`. FULL goes through `submit_compaction_task(force=false, eager=true, trigger_method=MANUAL)` after setting `last_full_compaction_schedule_time`, matching the HTTP API default.
- `cloud_submit_table_compaction_callback`: new callback mirroring the local one but calling `CloudStorageEngine::submit_compaction_task`; honors the cloud HTTP convention that base/cumu need `sync_delete_bitmap` while full does not.
- `agent_server::cloud_start_workers`: register the previously stubbed-out `SUBMIT_TABLE_COMPACTION` worker (drops the `// TODO(plat1ko)` comment).
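The strict matching versus the old silent downgrade can be sketched as below (illustrative Python, not the BE C++; function names are made up for this sketch):

```python
def parse_compaction_type(value):
    """New behavior: strictly match base/cumulative/full and reject
    anything else instead of silently downgrading."""
    allowed = {"base", "cumulative", "full"}
    if value not in allowed:
        raise ValueError(f"invalid compaction type: {value!r}")
    return value

def parse_compaction_type_old(value):
    # Old behavior: anything that was not "base" became cumulative,
    # so a "full" request silently ran a cumulative compaction.
    return "base" if value == "base" else "cumulative"
```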

### Thrift / wire compatibility

`TCompactionReq.type` remains `optional string`, so no IDL change. A new FE against an old BE would see `"full"` silently downgraded (local) or dropped (cloud), so the release order is **all BEs upgraded first, then FE**. FE rollback can lead BE since the FE simply stops sending `"full"`.

## Test plan

- [x] `mvn -pl fe-core -am compile` passes.
- [x] New regression suite `regression-test/suites/compaction/test_admin_compact_table.groovy` covers base / cumulative / full end-to-end on the local engine, plus negative cases (unknown type, missing WHERE).
- [x] New regression suite `regression-test/suites/cloud_p0/compaction/test_cloud_admin_compact_table.groovy` covers the same matrix on the cloud engine.
- [x] Design document and code reviewed independently by Codex (two rounds).
@Yukang-Lian Yukang-Lian requested a review from yiguolei as a code owner April 22, 2026 08:46
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Yukang-Lian
Collaborator Author

run buildall

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

| Category | Coverage |
|----------|----------|
| Function Coverage | 78.52% (1798/2290) |
| Line Coverage | 64.32% (32369/50323) |
| Region Coverage | 65.19% (16300/25003) |
| Branch Coverage | 55.75% (8721/15644) |

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 77.99% (124/159) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 82.26% (983/1195) 🎉

Increment coverage report
Complete coverage report

| Category | Coverage |
|----------|----------|
| Function Coverage | 71.52% (26299/36770) |
| Line Coverage | 54.48% (278906/511947) |
| Region Coverage | 51.79% (231668/447297) |
| Branch Coverage | 53.15% (100109/188359) |
