Skip to content

[fix](mv) Invalidate rewrite cache on constraint changes#62530

Open
foxtail463 wants to merge 1 commit intoapache:masterfrom
foxtail463:mv_constraint_rewrite_cache_invalidate
Open

[fix](mv) Invalidate rewrite cache on constraint changes#62530
foxtail463 wants to merge 1 commit intoapache:masterfrom
foxtail463:mv_constraint_rewrite_cache_invalidate

Conversation

@foxtail463
Copy link
Copy Markdown
Contributor

What problem does this PR solve?

Problem Summary:

Constraint changes can change MV rewrite eligibility, especially for PK/FK/UK-based reasoning.

Example:

-- MV
SELECT o.o_orderkey
FROM orders o
INNER JOIN lineitem l
  ON o.o_orderkey = l.l_orderkey

Without the relevant PK/FK metadata, the rewrite may fail.

After:

ALTER TABLE lineitem ADD CONSTRAINT lineitem_pk PRIMARY KEY (l_orderkey);
ALTER TABLE orders ADD CONSTRAINT orders_fk
    FOREIGN KEY (o_orderkey) REFERENCES lineitem(l_orderkey);

the same query may become rewritable because the optimizer can prove the join relationship through constraints.

After dropping either the FK or the referenced PK, that reasoning is no longer valid, so the rewrite should stop again.

This patch invalidates dependent MTMV rewrite caches after ADD/DROP CONSTRAINT. The invalidation is driven by the analyzed table name and affected base-table infos, so it does not depend on rewrite cache contents that may have been built from old constraint metadata.

It also prevents an in-flight rewrite cache built before the constraint change from being published after invalidation, avoiding stale PK/FK/UK metadata from being reused by later rewrites.

@foxtail463
Copy link
Copy Markdown
Contributor Author

/review

@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking finding:

  1. fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/DropConstraintCommand.java and fe/fe-core/src/main/java/org/apache/doris/persist/EditLog.java
    The DROP CONSTRAINT flow removes the constraint from ConstraintManager before invalidating dependent MTMV rewrite caches. MTMV.getOrGenerateCache() returns any existing cache immediately, so a concurrent planner in that window can still reuse a cache built with the old PK/FK/UK metadata and accept a rewrite that should already be disallowed. This affects both the live DDL path and journal replay.

Critical checkpoint conclusions:

  • Goal of the task: Partially accomplished. The PR correctly identifies that constraint changes must invalidate dependent MTMV rewrite caches, and it adds coverage for cache rebuild/in-flight publication, but DROP CONSTRAINT still leaves a stale-cache visibility window.
  • Small, clear, focused change: Yes. The patch is scoped to cache invalidation around constraint changes plus tests.
  • Concurrency: The new generation counter in MTMV is a good fix for in-flight cache publication, but cache invalidation is not atomic with the metadata change on the drop path, so concurrent query planning can still observe stale rewrite metadata. I did not find a new lock-order or deadlock issue in the added code.
  • Lifecycle / static initialization: No special lifecycle or static-init issue found in the touched code.
  • Configuration changes: None.
  • Incompatible changes / compatibility: No FE-BE protocol or storage-format compatibility issue introduced. Edit-log write/replay coverage exists, but replay inherits the same stale-cache ordering bug.
  • Parallel code paths: The live command path and replay path were both updated, and both contain the same ordering problem.
  • Special conditional checks: The new null/unknown-constraint checks are reasonable.
  • Test coverage: Improved with FE unit tests and a regression test, but there is still no test covering the concurrent DROP CONSTRAINT stale-cache window.
  • Test result modifications: Added regression case looks aligned with MTMV behavior expectations; I did not execute the suite in this runner.
  • Observability: No additional observability appears necessary for this change.
  • Transaction / persistence: Metadata write and replay paths are both handled, but replay currently does not preserve the intended invalidation semantics under stale cached MTMVs.
  • Data writes / atomicity: This is metadata-only, but the cache invalidation and constraint-state transition are not atomically visible to readers on DROP CONSTRAINT.
  • FE/BE variable passing: Not applicable.
  • Performance: No obvious new hot-path performance issue.
  • Other issues: None beyond the blocking correctness issue above.

Comment thread fe/fe-core/src/main/java/org/apache/doris/persist/EditLog.java
@foxtail463
Copy link
Copy Markdown
Contributor Author

run buildall

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 80.49% (99/123) 🎉
Increment coverage report
Complete coverage report

@foxtail463
Copy link
Copy Markdown
Contributor Author

run feut
run compile
run external

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants