Run a mirror when its configuration is updated#1247
Conversation
Motivation: See line#1183 for details. Modifications: - Add `remoteRevision`, `remotePath`, `localRevision` and `localPath` fields to `.mirror_state.json` to capture the configuration used when a mirror is run. - Move `RepositoryUri` to public API and expose it in `Mirror.java` to allow access to the full remote URL in mirror tasks. - Set the depth of Git fetch to 2 to read parent commit. - This is required to detect changes made on the remote in local-to-remote mode. Result: - Git mirroring now runs correctly when the configuration changes or the mirrored content is manually overridden. - Closes line#1183
📝 WalkthroughWalkthroughUnifies remote identification into RepositoryUri, expands MirrorState to track per-path remote/local revisions and configHash, refactors Git mirror decision and state logic (fetch depth, previous-commit handling, RUN/SKIP/COMPARE decisions), and updates providers and tests to the new API and per-path expectations. Changes
Sequence Diagram(s)sequenceDiagram
participant Scheduler as Scheduler/Invoker
participant Mirror as AbstractGitMirror
participant Local as LocalRepo
participant Remote as Remote Git/Repo
participant State as mirror_state.json
Scheduler->>Mirror: trigger mirrorLocalToRemote()
activate Mirror
Mirror->>Local: read head commit and parent (fetch depth=2)
Mirror->>Remote: fetch head commit (depth=2)
Mirror->>Mirror: localCurrentMirrorState(mirror_state.json)
Mirror->>Mirror: remoteCurrentMirrorState(remote tree)
Mirror->>Mirror: shouldRunLocalToRemote(oldState, localHead, previousRemoteCommit)
alt Decision = RUN / COMPARE_AND_RUN
Mirror->>Remote: push per-path changes
Mirror->>State: write updated mirror_state.json (per-path + configHash)
Mirror-->>Scheduler: mirror completed (updated)
else Decision = SKIP
Mirror-->>Scheduler: skip — up-to-date
end
deactivate Mirror
sequenceDiagram
participant Scheduler as Scheduler/Invoker
participant Mirror as AbstractGitMirror
participant Remote as Remote Git/Repo
participant Local as LocalRepo
participant State as mirror_state.json
Scheduler->>Mirror: trigger mirrorRemoteToLocal()
activate Mirror
Mirror->>Remote: fetch head commit (depth=2)
Mirror->>Mirror: remoteCurrentMirrorState(remote tree)
Mirror->>Local: read local mirror_state.json
Mirror->>Mirror: shouldRunRemoteToLocal(oldState, previousLocalHead, remoteCommit)
alt Decision = RUN / COMPARE_AND_RUN
Mirror->>Local: apply remote per-path changes
Mirror->>State: write updated mirror_state.json (per-path + configHash)
Mirror-->>Scheduler: mirror completed (updated)
else Decision = SKIP
Mirror-->>Scheduler: skip — up-to-date
end
deactivate Mirror
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| * | ||
| * @deprecated Use {@link #remoteRevision()} or {@link #localRevision()} instead. | ||
| */ | ||
| @Deprecated |
There was a problem hiding this comment.
nit; the proposed logic always gets the previous revision of the target (local for remote2local, remote for local2remote).
It might be easier to understand if a separate previousTargetRevision() method is added.
| } | ||
| } | ||
|
|
||
| boolean shouldRunMirror(@Nullable MirrorState oldMirrorState, Revision localHead, |
There was a problem hiding this comment.
Question) If there are multiple mirrors configured for a single repository, even if a file in directory A is changed will all mirrors be triggered due to also looking at the target revision?
I wonder if adding a hash of the mirror configuration for each mirror to MirrorState is a simpler/safer approach
There was a problem hiding this comment.
This is an important point. I hadn't considered that multiple mirrors can be configured for a single repository.
if adding a hash of the mirror configuration for each mirror to MirrorState is a simpler/safer approach
A mirror configuration hash alone can't tell us whether we should run the mirror when there has been a manual push to the local repo. However, it would be worth adding the hash to detect any changes in mirror configuration.
Although, the logic is complex, the following approach seems to be the most appropriate under the Central Dogma revision model which does not provide the HEAD revision for a directory or a file.
Proposal: For remote-to-local mirroring, it should be safe to rely on the local HEAD when the local path is the root (/). If the local path is non-root (e.g, /subdir) and previousLocalRevision + 1 doesn't match the current HEAD, we could run repository.diff(previousLocalRevision, headRevision, "/subdir") to check whether there are actual changes happened before running a mirror task.
There was a problem hiding this comment.
if adding a hash of the mirror configuration for each mirror to MirrorState is a simpler/safer approach
I interpreted the original issue as mirror doesn't run properly when the path (MirrorState) changes. I didn't think it was necessary that the revision of the target is considered for mirroring.
Whatever approach is taken, I'm fine as long as it's consistent for both directions.
There was a problem hiding this comment.
I made new changes to properly overwrite a manual push in the next mirror job. PTAL.
| return MirrorDecision.RUN; | ||
| } | ||
|
|
||
| if (!localHead.text().equals(previousLocalRevision)) { |
There was a problem hiding this comment.
Question) I may have not understood as the logic is complex, but is localHead ever equal to previousLocalRevision given that
localHeadis the currentHEADrevisionpreviousLocalRevisionis from theMirrorState, which means it is the previous local revision?
There was a problem hiding this comment.
Local-to-remote mirroring pushes a new commit that includes mirror_state.json to the remote repository which does not increase the local head revision. If no commits are pushed to either the local or remote repository, previousLocalRevision equals localHead in next mirroring jobs.
| @JsonCreator | ||
| MirrorState(@JsonProperty(value = "sourceRevision", required = true) String sourceRevision) { | ||
| public MirrorState(@JsonProperty(value = "sourceRevision", required = true) String sourceRevision, | ||
| @JsonProperty("remoteRevision") @Nullable String remoteRevision, |
There was a problem hiding this comment.
Understood that:
- For
localToRemoteremoteRevisionis the revision before the mirror push (so the previous version)localRevisionis the previous revision before mirror push
- For
remoteToLocalremoteRevisionis the current revision (since no push is done to remote)localRevisionis the previous revision before mirror push
It might be worth documenting this somewhere since I don't think this is well expressed in the currently proposed code.
There was a problem hiding this comment.
remoteRevision and localRevision represent the revisions at the time a mirroring job updates a change.
- For local-to-remove, a new commit is pushed to the remote repository.
remoteRevisionindicates the previous head revision (e.g, commit sha236)localRevisionstill equals to the local HEAD.
- For remote-to-local, a new commit is pushed to the local repository.
localRevisionindicates the previous head revision (1234)remoteRevisionstill equals to the remote HEAD.
I will document this behavior in the source code.
...or-git/src/main/java/com/linecorp/centraldogma/server/internal/mirror/AbstractGitMirror.java
Outdated
Show resolved
Hide resolved
1eb6abb to
14625d4
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server-mirror-git/src/main/java/com/linecorp/centraldogma/server/internal/mirror/AbstractGitMirror.java (1)
705-792:⚠️ Potential issue | 🟠 MajorEmpty-content updates can be missed in compare mode.
hasChangesis set only whencontentLength > 0. If a file changes to empty content (length 0),applyPathEditreturns0, sohasChangesstays false andCOMPARE_AND_RUNmay incorrectly skip despite real changes. Use a sentinel for “no change” or a boolean.🐛 Proposed fix (distinguish “no change” vs empty content)
- final long contentLength = applyPathEdit(dirCache, inserter, pathString, entry, oldContent); - if (contentLength > 0) { - hasChanges = true; - } - numBytes += contentLength; + final long contentLength = applyPathEdit(dirCache, inserter, pathString, entry, oldContent); + if (contentLength >= 0) { + hasChanges = true; + numBytes += contentLength; + }- final long contentLength = applyPathEdit(dirCache, inserter, convertedPath, value, null); - if (contentLength > 0) { - hasChanges = true; - } - numBytes += contentLength; + final long contentLength = applyPathEdit(dirCache, inserter, convertedPath, value, null); + if (contentLength >= 0) { + hasChanges = true; + numBytes += contentLength; + }- return 0; + return -1; // No change
🤖 Fix all issues with AI agents
In
`@server-mirror-git/src/main/java/com/linecorp/centraldogma/server/internal/mirror/AbstractGitMirror.java`:
- Around line 262-268: The current logic leaves previousCommitId null for
orphan/root HEADs, causing downstream change-detection to ignore remote manual
pushes; modify the block in AbstractGitMirror around RevCommit headCommit so
that if headCommit.getParentCount() == 0 you set previousCommitId =
headCommit.getId() (i.e., treat the HEAD commit as the baseline for comparison)
instead of leaving it null; ensure you still call revWalk.parseHeaders(...)
appropriately for non-orphan parents and only use headCommit.getId() as the
fallback baseline used by the change-detection logic that reads
previousCommitId.
🧹 Nitpick comments (2)
server-mirror-git/src/main/java/com/linecorp/centraldogma/server/internal/mirror/AbstractGitMirror.java (2)
903-905: Consider usingremoteUri()in the exception message for full context.
This keeps error reporting consistent with other updated messages.
313-318: Consolidate config hash computation with inheritedhashString()method.
Line 313 duplicates the hashing logic already defined in the parent class, which also provides field-level caching for efficiency.♻️ Suggested change
- final String configHash = Hashing.sha256().hashString(toString(), UTF_8).toString(); + final String configHash = hashString();
Motivation:
See #1183 for details.
Modifications:
remoteRevision,remotePath,localRevisionandlocalPathfields to.mirror_state.jsonto capture the configuration used when a mirror is run.RepositoryUrito public API and expose it inMirror.javato allow access to the full remote URL in mirror tasks.Result: