Neural ODEs — the continuous-time viewpoint where a neural network parameterizes a vector field and data evolves through an ordinary differential equation (Chen et al., 2018).
Continuous normalizing flows — scalable continuous-time generative flows that connect latent variables to data through learned dynamics (Grathwohl et al., 2019).
Flow matching — the modern training objective that learns a transport vector field directly from simple reference paths between noise and data (Lipman et al., 2023).
A flow-matching model learns a continuous vector field that transports samples from a simple base distribution, usually Gaussian noise, to the target data distribution. Instead of adding noise and then reversing it as in diffusion models, flow matching learns the velocity field of a probability flow.
This perspective is useful because it turns generative modeling into a transport problem. If we know how samples should move at every time \(t \in [0, 1]\), then generation becomes a matter of integrating an ordinary differential equation from noise to data.
15.1 The basic idea
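Suppose \(\mathbf{x}_0 \sim p_0\) is a simple reference sample and \(\mathbf{x}_1 \sim p_1\) is a target data sample. We define an interpolation path between them, for example the straight-line path

\[
\mathbf{x}_t = (1 - t)\,\mathbf{x}_0 + t\,\mathbf{x}_1, \qquad t \in [0, 1],
\]

whose velocity is constant along the path: \(\mathrm{d}\mathbf{x}_t / \mathrm{d}t = \mathbf{x}_1 - \mathbf{x}_0\).

Flow matching trains a neural network \(\mathbf{v}_\theta(\mathbf{x}, t)\) to predict that velocity from points sampled along the path. A simple objective is

\[
\mathcal{L}(\theta) = \mathbb{E}_{t,\,\mathbf{x}_0,\,\mathbf{x}_1} \left\| \mathbf{v}_\theta(\mathbf{x}_t, t) - (\mathbf{x}_1 - \mathbf{x}_0) \right\|^2 .
\]

At sampling time we draw \(\mathbf{x}_0 \sim p_0\) and integrate \(\mathrm{d}\mathbf{x} / \mathrm{d}t = \mathbf{v}_\theta(\mathbf{x}, t)\) from \(t = 0\) to \(t = 1\). The learned dynamics push the base noise distribution toward the data distribution.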
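To make the objective concrete, the following is a minimal sketch rather than the chapter's training code. It assumes the straight-line path \(\mathbf{x}_t = (1 - t)\,\mathbf{x}_0 + t\,\mathbf{x}_1\) with regression target \(\mathbf{x}_1 - \mathbf{x}_0\), a hypothetical one-dimensional Gaussian target, and a toy velocity model that is linear in the features \([1, x, t, xt]\) in place of a neural network. The names `features`, `fm_loss`, and the target parameters are illustrative assumptions.

```julia
using LinearAlgebra, Random, Statistics

rng = MersenneTwister(42)
n = 4096

# Hypothetical 1-D stand-ins: Gaussian base samples and a shifted, narrower target.
x0 = randn(rng, n)
x1 = 2.0 .+ 0.5 .* randn(rng, n)
t = rand(rng, n)                        # one random time per (x0, x1) pair

# Point on the straight-line path and the velocity the model should predict there.
xt = (1 .- t) .* x0 .+ t .* x1
u = x1 .- x0

# Toy velocity model, linear in the features [1, x, t, x*t] (stands in for a network).
features(x, t) = hcat(ones(length(x)), x, t, x .* t)
Φ = features(xt, t)
θ = Φ \ u                               # least-squares minimizer of the loss below

# Mean squared error between predicted and target velocities.
fm_loss(pred, target) = mean(abs2, pred .- target)

loss_zero = fm_loss(zeros(n), u)        # field that never moves a sample
loss_const = fm_loss(fill(2.0, n), u)   # constant field equal to E[x1 - x0]
loss_fit = fm_loss(Φ * θ, u)            # fitted feature model

println((loss_zero = loss_zero, loss_const = loss_const, loss_fit = loss_fit))
```

Even the fitted model does not drive the loss to zero. Many different \((\mathbf{x}_0, \mathbf{x}_1)\) pairs pass through the same point \((\mathbf{x}_t, t)\), so the best any model can do is predict their average velocity.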
15.2 Why flow matching is interesting
Compared with diffusion models, flow matching often gives a cleaner conceptual picture for inverse problems and conditional generation:
the model is a deterministic transport map rather than a stochastic reverse Markov chain,
sampling can use standard ODE solvers,
and the conditioning logic fits naturally with transport from a prior toward an observation-consistent posterior.
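As an illustration of the ODE-solver point, the sketch below compares forward Euler with Heun's second-order method on a velocity field whose exact flow is known. It assumes independent \(x_0 \sim \mathcal{N}(0, 1)\) and \(x_1 \sim \mathcal{N}(\mu, s^2)\) joined by straight lines, for which the marginal velocity is \(v(x, t) = \mu + (\sigma_t' / \sigma_t)(x - t\mu)\) with \(\sigma_t^2 = (1 - t)^2 + t^2 s^2\), and the exact endpoint is \(\mu + s\,x_0\). The function names and parameter values are hypothetical, and a trained network would take the place of `gaussian_velocity` in practice.

```julia
using Random

# Closed-form marginal velocity for the Gaussian-to-Gaussian case described above.
# μ and s are the (illustrative) target mean and standard deviation.
gaussian_velocity(μ, s) =
    (x, t) -> μ + (t * s^2 - (1 - t)) / ((1 - t)^2 + t^2 * s^2) * (x - t * μ)

# Forward Euler: one velocity evaluation per step, first-order accurate.
function integrate_euler(v, x0, n_steps)
    x = copy(x0)
    dt = 1 / n_steps
    for k in 0:n_steps-1
        x .+= dt .* v.(x, k * dt)
    end
    return x
end

# Heun's method: two velocity evaluations per step, second-order accurate.
function integrate_heun(v, x0, n_steps)
    x = copy(x0)
    dt = 1 / n_steps
    for k in 0:n_steps-1
        t = k * dt
        k1 = v.(x, t)                        # slope at the start of the step
        k2 = v.(x .+ dt .* k1, t + dt)       # slope at the Euler-predicted end point
        x .+= (dt / 2) .* (k1 .+ k2)         # average the two slopes
    end
    return x
end

μ, s = 2.0, 0.5
v = gaussian_velocity(μ, s)
x0 = randn(MersenneTwister(0), 1000)
exact = μ .+ s .* x0                         # exact endpoint of this particular flow

err_euler = maximum(abs.(integrate_euler(v, x0, 50) .- exact))
err_heun = maximum(abs.(integrate_heun(v, x0, 50) .- exact))
println("max endpoint error: Euler = ", err_euler, ", Heun = ", err_heun)
```

Heun's method costs twice as many velocity evaluations per step, so the fair comparison in practice is accuracy per network call rather than accuracy per step.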
In practice, diffusion and flow matching are closely related. Both learn time-dependent transformations from simple noise to complex data. The difference is mostly in whether the learned process is framed as denoising a stochastic corruption or integrating a deterministic flow.
15.3 Code example: transporting Gaussian noise into a bimodal porosity prior
We reuse the same synthetic porosity distribution as in the diffusion chapter, but now train a velocity network directly. The input is a point on the interpolation path together with the time \(t\), and the output is the velocity that should move that point toward the target distribution.
Epoch 1 flow-matching loss = 2.732322
Epoch 100 flow-matching loss = 1.153513
Epoch 200 flow-matching loss = 1.656030
Epoch 300 flow-matching loss = 1.500055
Epoch 400 flow-matching loss = 1.196840
```julia
# Sample by integrating the learned ODE from t = 0 to t = 1.
# velocity_net, ps, st, rng, μ_data, σ_data, and x1_data come from the training code.
function velocity_prediction(x, t, ps)
    t_feature = fill(Float32(t), 1, size(x, 2))   # broadcast the scalar time across the batch
    inputs = vcat(x, t_feature)                   # stack state and time as network input
    v̂, _ = velocity_net(inputs, ps, st)
    return v̂
end

n_samples = 600
x = randn(rng, Float32, 1, n_samples)             # start from Gaussian noise
n_solver_steps = 60
dt = 1.0f0 / n_solver_steps

# Fixed-step forward Euler integration
for step in 0:n_solver_steps-1
    t = step * dt
    x .+= dt .* velocity_prediction(x, t, ps)
end

# Undo the normalization and clip to the physical porosity range
generated_porosity = clamp.(x .* σ_data .+ μ_data, 0.0f0, 0.4f0)

fig = Figure(size = (620, 320))
ax1 = Axis(fig[1, 1], title = "Training data", xlabel = "Porosity", ylabel = "Count")
hist!(ax1, vec(x1_data), bins = 35, color = (:black, 0.55))
ax2 = Axis(fig[1, 2], title = "Flow-matching samples", xlabel = "Porosity", ylabel = "Count")
hist!(ax2, vec(generated_porosity), bins = 35, color = (:seagreen, 0.65))
Label(fig[0, :], "Flow matching: transport from Gaussian noise to porosity prior", fontsize = 16)
fig
```
The histogram of generated samples should again approximate the two porosity modes, but the sampling mechanism is different: there is no reverse noise-removal chain. We simply integrate the learned velocity field from noise to data, here with fixed-step Euler updates.
15.4 When to use flow matching
Flow matching is especially appealing when:
You want a continuous-time transport view of generation.
You expect to reuse the model inside conditioning, inversion, or data-assimilation workflows.
You find deterministic ODE-based sampling easier to reason about than stochastic reverse diffusion.
You care about faster generation, because well-trained flows can often be sampled with fewer solver steps than diffusion models need denoising steps.
The main tradeoff is that the model must learn a good global velocity field. If that field is poor, ODE integration can drift into unrealistic regions of state space.
15.5 Geoscience milestones
Flow matching is newer to the geosciences than either GANs or diffusion models, and there is not yet a canonical Earth-science reference. The closest pointers are the broader reviews of machine learning in geoscience by Bergen et al. (2019) and Dramsch (2020), which frame the uncertainty-aware, inverse-problem-driven perspective in which flow matching is starting to find a place.
Bergen, K. J., Johnson, P. A., Hoop, M. V. de, & Beroza, G. C. (2019). Machine learning for data-driven discovery in solid earth geoscience. Science, 363(6433). https://doi.org/10.1126/science.aau0323
Chen, R. T. Q., Rubanova, Y., Bettencourt, J., & Duvenaud, D. (2018). Neural ordinary differential equations. Advances in Neural Information Processing Systems, 31.
Grathwohl, W., Chen, R. T. Q., Bettencourt, J., Sutskever, I., & Duvenaud, D. (2019). FFJORD: Free-form continuous dynamics for scalable reversible generative models. Proceedings of the International Conference on Learning Representations (ICLR).
Lipman, Y., Chen, R. T. Q., Ben-Hamu, H., Nickel, M., & Le, M. (2023). Flow matching for generative modeling. Proceedings of the International Conference on Learning Representations (ICLR).