using NeuralPDE, Lux, ModelingToolkit, Optimization, OptimizationOptimisers,
OptimizationOptimJL, Random, Printf, CairoMakie
import ModelingToolkit: Interval
rng = Xoshiro(42)
α = 0.01
u_exact(x, t) = exp(-α * π^2 * t) * sin(π * x)16 Physics-Informed Neural Networks
- Early neural-network PDE solvers — using neural networks to solve ODEs and PDEs by embedding the equation in the loss function (Lagaris et al., 1998).
- Physics-Informed Neural Networks (PINNs) — the foundational framework that popularized training neural networks with PDE residuals as soft constraints (Raissi et al., 2019).
- PINN review — comprehensive review of physics-informed machine learning covering PINNs, operator learning, and hybrid methods (Karniadakis et al., 2021).
- DeepXDE — a widely used library for PINNs that formalized many practical training strategies (Lu et al., 2021).
- Gradient pathologies — understanding and mitigating the training difficulties unique to PINNs (Wang et al., 2021).
- Neural ODEs — treating the forward pass of a neural network as an ODE, connecting deep learning and differential equations (Chen et al., 2018).
- Universal differential equations — embedding neural networks inside differential equations to learn unknown physics (Rackauckas et al., 2020).
In the previous part of this book we treated neural networks as function approximators: given input–output pairs, the network learns to map one to the other. Physics-informed neural networks (PINNs) take a fundamentally different approach. Instead of (or in addition to) learning from data, the network is trained to satisfy a known physical law — typically a partial differential equation (PDE). The physics itself becomes part of the loss function.
16.1 The idea
Suppose we want to solve the PDE:
\[ \mathcal{N}[u](\mathbf{x}, t) = 0, \qquad \mathbf{x} \in \Omega, \quad t \in [0, T] \]
with boundary conditions \(\mathcal{B}[u] = 0\) on \(\partial\Omega\) and initial conditions \(u(\mathbf{x}, 0) = u_0(\mathbf{x})\).
A PINN approximates the solution \(u(\mathbf{x}, t)\) with a neural network \(u_\theta(\mathbf{x}, t)\), where \(\theta\) denotes the trainable parameters. The network takes coordinates \((\mathbf{x}, t)\) as input and outputs the predicted solution value.
The key insight is that because the network is differentiable, we can compute \(\frac{\partial u_\theta}{\partial t}\), \(\frac{\partial^2 u_\theta}{\partial x^2}\), etc. using automatic differentiation. We then define the PDE residual:
\[ r_\theta(\mathbf{x}, t) = \mathcal{N}[u_\theta](\mathbf{x}, t) \]
and train the network to make this residual zero everywhere.
16.2 The PINN loss function
The total loss combines three named components:
\[ \mathcal{L}(\theta) = \lambda_r \mathcal{L}_r(\theta) + \lambda_b \mathcal{L}_{\mathrm{bc}}(\theta) + \lambda_0 \mathcal{L}_0(\theta) \]
with
\[ \mathcal{L}_r(\theta) = \frac{1}{N_r}\sum_{i=1}^{N_r} |r_\theta(\mathbf{x}_i^r, t_i^r)|^2 \]
\[ \mathcal{L}_{\mathrm{bc}}(\theta) = \frac{1}{N_b}\sum_{j=1}^{N_b} |\mathcal{B}[u_\theta](\mathbf{x}_j^b, t_j^b)|^2 \]
\[ \mathcal{L}_0(\theta) = \frac{1}{N_0}\sum_{k=1}^{N_0} |u_\theta(\mathbf{x}_k^0, 0) - u_0(\mathbf{x}_k^0)|^2 \]
where:
- \(\{(\mathbf{x}_i^r, t_i^r)\}\) are collocation points sampled inside the domain where the PDE must hold.
- \(\{(\mathbf{x}_j^b, t_j^b)\}\) are points on the boundary.
- \(\{(\mathbf{x}_k^0)\}\) are points at \(t=0\).
- \(\lambda_r, \lambda_b, \lambda_0\) are weighting factors.
If observational data is also available, a fourth term \(\mathcal{L}_{\mathrm{data}}\) can be added, making the PINN a hybrid physics-data model.
16.3 NeuralPDE.jl: a symbolic interface for PINNs
Writing the residual loss by hand — sampling collocation points, computing derivatives with Zygote, and balancing the weight factors — is instructive but quickly becomes tedious for non-trivial PDEs. The Julia package NeuralPDE.jl automates this. The user states the PDE, boundary conditions, and domain symbolically through ModelingToolkit.jl; NeuralPDE.jl then discretizes the residual on collocation points, wires up automatic differentiation, and returns a standard Optimization.jl problem.
The high-level workflow is:
- Declare independent variables, dependent variables, and differential operators.
- Write the PDE and the initial/boundary conditions as symbolic equations.
- Define the domain as a set of intervals.
- Build a
Luxchain for \(u_\theta\) and pass it toPhysicsInformedNN. - Call
discretizeto obtain anOptimizationProblem. - Solve it with any
Optimization.jloptimizer (e.g.Adam, thenBFGS).
16.4 Code example: solving the 1D heat equation
We solve the heat equation \(\frac{\partial u}{\partial t} = \alpha \frac{\partial^2 u}{\partial x^2}\) on \(x \in [0, 1]\), \(t \in [0, 1]\) with initial condition \(u(x, 0) = \sin(\pi x)\) and boundary conditions \(u(0, t) = u(1, t) = 0\). The exact solution is \(u(x,t) = e^{-\alpha \pi^2 t} \sin(\pi x)\).
#-----symbolic PDE description------
@parameters x t
@variables u(..)
Dt = Differential(t)
Dxx = Differential(x)^2
eq = Dt(u(x, t)) ~ α * Dxx(u(x, t))
bcs = [u(x, 0) ~ sin(π * x),
u(0, t) ~ 0.0,
u(1, t) ~ 0.0]
domains = [x ∈ Interval(0.0, 1.0),
t ∈ Interval(0.0, 1.0)]
@named pdesys = PDESystem(eq, bcs, domains, [x, t], [u(x, t)])#-----PINN architecture------
chain = Chain(Dense(2, 32, tanh),
Dense(32, 32, tanh),
Dense(32, 32, tanh),
Dense(32, 1))
#-----collocation strategy------
strategy = GridTraining(0.05)
discretization = PhysicsInformedNN(chain, strategy)
prob = discretize(pdesys, discretization)#-----training: Adam (warm start) then BFGS (sharpen)------
iter = Ref(0)
callback = function (p, l)
iter[] += 1
(iter[] == 1 || iter[] % 200 == 0) && @printf "iter %5d loss = %.3e\n" iter[] l
return false
end
res = Optimization.solve(prob, Adam(1f-2); callback = callback, maxiters = 1500)
res = Optimization.solve(remake(prob, u0 = res.u), BFGS();
callback = callback, maxiters = 500)
phi = discretization.phi#-----evaluate the PINN solution on a grid------
nx, nt = 80, 80
xs = collect(range(0, 1, length = nx))
ts = collect(range(0, 1, length = nt))
U_pinn = [first(phi([xi, tj], res.u)) for xi in xs, tj in ts]
U_true = [u_exact(xi, tj) for xi in xs, tj in ts]
fig = Figure(size = (820, 280))
ax1 = Axis(fig[1, 1], title = "PINN solution", xlabel = "x", ylabel = "t")
ax2 = Axis(fig[1, 2], title = "Exact solution", xlabel = "x", ylabel = "t")
ax3 = Axis(fig[1, 3], title = "Absolute error", xlabel = "x", ylabel = "t")
heatmap!(ax1, xs, ts, U_pinn, colormap = :viridis)
heatmap!(ax2, xs, ts, U_true, colormap = :viridis)
heatmap!(ax3, xs, ts, abs.(U_pinn .- U_true), colormap = :hot)
figA few observations worth keeping in mind when comparing this to a hand-written PINN:
- No explicit residual code.
NeuralPDE.jlreads the symbolic equation and constructs the loss for us. Switching from the heat equation to a wave equation is a one-line change ineq. - Strategy is configurable.
GridTraining(0.05)evaluates the residual on a fixed grid. Alternatives includeStochasticTraining,QuasiRandomTraining, andQuadratureTraining— each with different bias/variance trade-offs for the residual estimator. - Two-stage optimization. It is common to use
Adamfirst (cheap, noisy gradients) and finish withBFGS(sharper convergence near the minimum). Theremakecall hands the Adam solution to BFGS as a warm start.
16.5 PINN strengths and limitations
Strengths:
- Mesh-free — no need to build a computational mesh. The network is evaluated at arbitrary points.
- Inverse problems — if some physical parameters are unknown, they can be made trainable alongside the network weights. The PINN then simultaneously solves the PDE and infers the unknown parameters.
- Data assimilation — observational data can be incorporated by adding a data-fitting term to the loss, making PINNs natural for problems where sparse data and physics must both be honored.
Limitations:
- Training difficulty — PINNs can be hard to train, especially for stiff PDEs, multi-scale problems, or high-frequency solutions. Gradient pathologies and loss balancing are active research challenges (Wang et al., 2021).
- Computational cost — computing second-order derivatives through automatic differentiation is expensive. For problems where a traditional numerical solver is efficient, PINNs may not be competitive.
- Generalization — a trained PINN solves one specific instance of a PDE (one set of boundary/initial conditions). To solve the same PDE with different conditions, you must retrain. This is where neural operators (next chapter) offer an advantage.
16.6 Geoscience milestones
- Seismic travel times via the eikonal equation — Smith et al. (2021) (EikoNet) and Waheed et al. (2021) (PINNeik) are the canonical references for embedding the eikonal equation in a PINN to compute travel times in heterogeneous 3D velocity models.
- Subsurface flow with physics-constrained deep learning — Zhu et al. (2019) established physics-constrained deep learning as a high-dimensional surrogate-modelling and uncertainty-quantification tool for subsurface transport.
- Physics-informed ML for geophysics — review — Alkhalifah et al. (2022) surveys physics-informed approaches across seismic imaging, inversion, and interpretation.