Commit 43313a1

Merge pull request #14 from felix-yuxiang/master
update the objectives and place it in a better spot
2 parents efaedf8 + 5822038 commit 43313a1

1 file changed: +2 -4 lines changed

_posts/2025-08-18-diff-distill.md

Lines changed: 2 additions & 4 deletions
@@ -94,10 +94,6 @@ We provide some popular instances <d-footnote>We ignore the diffusion models wit
94 94

95 95
The simplest form of conditional probability path is $$\mathbf{x}_t = (1-t)\mathbf{x}_0 + t\mathbf{x}_1$$ with the corresponding default conditional velocity field OT target $$v(\mathbf{x}_t, t \vert \mathbf{x}_0)=\mathbb{E}[\dot{\mathbf{x}}_t\vert \mathbf{x}_0]=\mathbf{x}_1- \mathbf{x}_0.$$
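As a minimal sketch of the linear path above, the interpolant and its constant conditional velocity target can be written in NumPy. The function names `linear_path` and `conditional_velocity_target` and the toy Gaussian samples are assumptions for illustration only; they are not taken from the post.

```python
import numpy as np

def linear_path(x0, x1, t):
    """Point on the linear conditional path x_t = (1 - t) * x0 + t * x1."""
    # Reshape t to (batch, 1, ..., 1) so it broadcasts over the feature axes.
    t = np.asarray(t, dtype=x0.dtype).reshape(-1, *([1] * (x0.ndim - 1)))
    return (1.0 - t) * x0 + t * x1

def conditional_velocity_target(x0, x1):
    """Time derivative of the linear path, which is constant in t: x1 - x0."""
    return x1 - x0

# Toy endpoints (hypothetical data, only to exercise the functions).
rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 2))
x1 = rng.standard_normal((4, 2))
t = rng.uniform(size=4)

x_t = linear_path(x0, x1, t)
v_target = conditional_velocity_target(x0, x1)

# Central finite difference of the path in t should match the target.
eps = 1e-4
fd = (linear_path(x0, x1, t + eps) - linear_path(x0, x1, t - eps)) / (2 * eps)
```

The finite-difference check at the end is only a sanity test that the target is the time derivative of the path; it plays no role in training.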
96 96

97-
Borrowed from this [slide](https://rectifiedflow.github.io/assets/slides/icml_07_distillation.pdf) at ICML2025, the objectives of ODE distillation have been categorized into three cases, i.e., (a) **forward loss**, (b) **backward loss** and (c) **self-consistency loss**.
98-
99-
100-
101 97
<span style="color: blue; font-weight: bold;">Training</span>: Since minimizing the conditional Flow Matching (FM) loss is equivalent to minimizing the marginal FM loss<d-cite key="lipman_flow_2023"></d-cite>, the optimization problem becomes
102 98

103 99
$$
@@ -118,6 +114,8 @@ At its core, ODE distillation boils down to how to strategically construct the t
118 114

119 115
In the context of distillation, the forward direction $$s<t$$ is typically taken as the target. Yet, the other direction can also carry meaningful structure. Notice that in DDIM<d-cite key="song2020denoising"></d-cite> sampling, the conditional probability path is traversed twice. In our flow map formulation, this can be replaced with the flow maps $$f_{\tau_i\to 0}(\mathbf{x}_{\tau_i}, \tau_i, 0), f_{0\to \tau_{i-1}}(\mathbf{x}_0, 0, \tau_{i-1})$$ where $$0<\tau_{i-1}<\tau_i<1$$. Intuitively, the flow map $$f_{t\to s}(\mathbf{x}_t, t, s)$$ represents a direct mapping, i.e., a **displacement field**, whereas $$F_{t\to s}(\mathbf{x}_t, t, s)$$ measures the increment, which corresponds to a **velocity field**.
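The two-stage traversal can be illustrated on a single conditional path. The sketch below assumes the parameterization $$f_{t\to s}(\mathbf{x}_t, t, s) = \mathbf{x}_t + (s-t)F_{t\to s}(\mathbf{x}_t, t, s)$$, which is not stated in this hunk, and uses a hypothetical oracle `avg_velocity` that knows both endpoints of a straight path (so the velocity is constant). It is a toy check, not a learned or marginal flow map.

```python
import numpy as np

def avg_velocity(x_t, t, s, x0, x1):
    # Hypothetical oracle for F_{t->s}: on a straight conditional path
    # the velocity is constant, so its average over [t, s] is x1 - x0.
    return x1 - x0

def flow_map(x_t, t, s, x0, x1):
    # Assumed parameterization: displacement = start + (s - t) * increment.
    return x_t + (s - t) * avg_velocity(x_t, t, s, x0, x1)

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 2))
x1 = rng.standard_normal((4, 2))

tau_i, tau_prev = 0.8, 0.3  # 0 < tau_{i-1} < tau_i < 1
x_tau_i = (1 - tau_i) * x0 + tau_i * x1

# Two-stage traversal: tau_i -> 0, then 0 -> tau_{i-1}.
x_at_0 = flow_map(x_tau_i, tau_i, 0.0, x0, x1)
x_tau_prev = flow_map(x_at_0, 0.0, tau_prev, x0, x1)

# Direct jump tau_i -> tau_{i-1} for comparison.
direct = flow_map(x_tau_i, tau_i, tau_prev, x0, x1)
```

Under these assumptions the composed maps and the direct jump land on the same point of the path, which is the semigroup property a flow map is expected to satisfy.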
120 116

117+
Our unified framework closely resembles the flow map<d-cite key="boffi2025build"></d-cite>, which transports points along solution trajectories of a probability flow ODE system. We provide some new insights into how this framework connects with many popular distillation methods. Based on this [slide](https://rectifiedflow.github.io/assets/slides/icml_07_distillation.pdf), the objectives of ODE trajectory distillation can be categorized into three cases: (a) **forward loss**, (b) **backward loss**, and (c) **self-consistency loss**. In the context of self-distilling a flow map model $$f_{t\to s}(\mathbf{x}_t, t, s)$$ from scratch<d-cite key="boffi2025build"></d-cite>, these objectives correspond to equivalent formulations under different names: (a) **Lagrangian Map Distillation loss**, (b) **Eulerian Map Distillation loss**, and (c) **Progressive self-distillation loss**.
118+
121 119
### MeanFlow
122 120

123 121
MeanFlow<d-cite key="geng2025mean"></d-cite> can be trained from scratch or distilled from a pretrained FM model. The conditional probability path is defined as the linear interpolation between noise and data $$\mathbf{x}_t = (1-t)\mathbf{x}_0 + t\mathbf{x}_1$$ with the corresponding default conditional velocity field OT target $$v(\mathbf{x}_t, t \vert \mathbf{x}_0)=\mathbf{x}_1- \mathbf{x}_0.$$ The main contribution consists of identifying and defining an **average velocity field** which coincides with our flow map as

0 commit comments