Last time, we had an introductory look at optimal transport (OT): the definition and computation of the Wasserstein distance.

A few years ago, as part of a project, I wrote a solver for the variational formulation of mean-field games presented below. This eventually led to a small C++ toy library, which might or might not be usable.

Mean-Field Games

The dynamics are defined by
\begin{equation}
dX_t = \alpha_t\,dt + \sigma\,dW_t
\end{equation}
where $(\alpha_t)_t$ is the control and $W$ is a Brownian motion; typically, $W$ is an $\RR^d$-valued Brownian motion with independent components.

The agent solves the stochastic control problem
\begin{equation}
\inf_{(\alpha_t)_t} J(\alpha) = \EE\left[ \int_0^T \frac{1}{2}|\alpha_t|^2 + f(X_t, \rho_t)\,dt + g(X_T, \rho_T) \right]
\end{equation}
where $\rho_t = \mathcal{L}(X_t)$ is the marginal distribution of the random variable $X_t$.
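Before turning to the PDE system, here is a minimal sketch of what this objective means computationally: a Monte-Carlo / Euler-Maruyama estimate of $J(\alpha)$ for a 1-D particle system, where the empirical particle cloud stands in for $\rho_t$. The function name, the vectorised callables `alpha(t, x)`, `f(x, cloud)`, `g(x, cloud)` and all parameters are hypothetical choices of mine, not anything from the library mentioned above.

```python
import numpy as np

def estimate_cost(alpha, f, g, x0, sigma, T, n_steps, n_paths, seed=0):
    """Monte-Carlo / Euler-Maruyama estimate of J(alpha) for 1-D particles,
    using the empirical particle cloud as a stand-in for the law rho_t."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(n_paths, float(x0))          # all particles start at x0
    cost = np.zeros(n_paths)
    for k in range(n_steps):
        a = alpha(k * dt, x)                 # feedback control, vectorised over particles
        cloud = x                            # empirical approximation of rho_t
        cost += (0.5 * a**2 + f(x, cloud)) * dt
        x = x + a * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
    return np.mean(cost + g(x, x))           # terminal cost; the cloud approximates rho_T
```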

The optimality conditions for this stochastic control problem were established by Lasry and Lions: they are known as the mean-field game equations, a system of coupled Fokker-Planck and Hamilton-Jacobi-Bellman equations:
\begin{align}
\partial_t \rho &= \frac{\sigma^2}{2}\Delta \rho - \mathrm{div}(\rho\nabla u) \\
-\partial_t u - \frac{\sigma^2}{2}\Delta u + \frac{1}{2}|\nabla u|^2 &= f(x, \rho) \\
u(T,x) &= g(x, \rho_T)
\end{align}
These equations are nonlinear due to the control cost term $|\nabla u|^2$ in the second, Hamilton-Jacobi-Bellman PDE; the first PDE is a linear Fokker-Planck equation. The optimal control is given in feedback form as $\alpha^*_t = \nabla_x u(t, X_t)$. Numerical methods for these equations can involve a fair bit of finite-difference discretisation, Krylov subspace methods...
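As an illustration of the finite-difference angle, here is a minimal sketch of one explicit time step for the Fokker-Planck equation above, assuming a 1-D periodic grid and centred differences. Real solvers tend to use implicit or upwinded schemes (hence the Krylov subspace methods), so treat this as a toy discretisation subject to a CFL-type restriction on `dt`.

```python
import numpy as np

def fokker_planck_step(rho, u, dx, dt, sigma):
    """One explicit finite-difference step of the Fokker-Planck equation
        d_t rho = (sigma^2 / 2) * Laplacian(rho) - div(rho * grad(u))
    on a 1-D periodic grid, using centred differences."""
    drift = (np.roll(u, -1) - np.roll(u, 1)) / (2 * dx)           # grad(u)
    flux = rho * drift                                            # rho * grad(u)
    div_flux = (np.roll(flux, -1) - np.roll(flux, 1)) / (2 * dx)  # div(rho * grad(u))
    lap = (np.roll(rho, -1) - 2 * rho + np.roll(rho, 1)) / dx**2  # 3-point Laplacian
    return rho + dt * (0.5 * sigma**2 * lap - div_flux)
```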

Here, we lay out a solution method for this particular case of MFG (some might say the control Hamiltonian has a rather specific form...) using optimal transport theory.

A variational approach

Define $F(\rho) = \int f(x, \rho)\,d\rho(x)$ and similarly $G(\rho) = \int g(x, \rho)\,d\rho(x)$. The variational approach treats the path of probability distributions $(\rho_t)_t$ as a problem variable, and optimises directly over the feedback control $v(t,x) = \nabla_x u(t,x)$, which we recall is the drift coefficient of the underlying SDE. That formulation reads
\begin{equation}
\begin{split}
&\inf_{\rho,v} \int_0^T \left[ \int \frac{1}{2}|v(t,x)|_2^2\,d\rho_t(x) + F(\rho_t) \right] dt + G(\rho_T) \\
&\suchthat \partial_t\rho = \frac{\sigma^2}{2}\Delta\rho - \mathrm{div}(\rho v)
\end{split}
\end{equation}
The optimality conditions of this problem should recover the HJB equation from before. There are several ways of solving this problem directly by casting it as a nonlinear program: the transcription step discretises the continuous space (using a grid, a triangulated mesh, a PL manifold...), and discretises the integral and constraint in time. The resulting finite-dimensional program is a traditional NLP, but with the structure of a discrete-time optimal control problem (DTOCP). Benamou and Carlier introduced an augmented Lagrangian method for this problem in [1].
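To give a feel for the transcription step, here is a sketch of the discretised objective on a 1-D grid, written in the momentum variable $m = \rho v$; this change of variables makes the kinetic term jointly convex in $(\rho, m)$, which is the key to convex formulations like the one used in [1]. The discrete continuity constraint is not shown (it would enter the NLP as equality constraints), and the function name and the callables `F`, `G` are hypothetical.

```python
import numpy as np

def discretised_objective(rho, m, dx, dt, F, G):
    """Crude transcription of the variational objective
        int_0^T [ int 0.5 |v|^2 drho_t + F(rho_t) ] dt + G(rho_T)
    on a 1-D grid, in the momentum variable m = rho * v.
    rho, m: arrays of shape (N + 1, nx); F, G: callables on grid densities."""
    eps = 1e-12                                             # avoid division by zero where rho = 0
    kinetic = 0.5 * np.sum(m**2 / (rho + eps)) * dx * dt    # int int 0.5 * m^2 / rho dx dt
    running = sum(F(rho[i]) for i in range(rho.shape[0])) * dt  # crude Riemann sum in time
    return kinetic + running + G(rho[-1])
```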

[2] introduces a Lagrangian variational formulation of the mean-field game, which reads
\begin{equation}
\begin{split}
&\inf_{Q, (\mu_t)_t} \mathcal{K}(Q) + \int_0^T F(\mu_t)\,dt + G(\mu_T) \\
&\suchthat P_tQ = \mu_t
\end{split}
\end{equation}
where the unknown $Q$ is a measure over the space of trajectories $\mathbb{X}$, and $P_t$ denotes the marginal operator at time $t$, the pushforward of $Q$ under the evaluation map $\delta_t: \omega \longmapsto \omega(t)$.¹ The important part is the kinetic energy term
\begin{equation}
\mathcal{K}(Q) = \int_{\mathbb{X}} \int_{[0,T]} \frac{1}{2} |\dot{\omega}(t)|^2\,dt\,dQ(\omega).
\end{equation}
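If $Q$ is approximated by an empirical measure over sampled, time-discretised trajectories, the kinetic energy has a direct estimator. Here is a small sketch under that assumption (the array layout and function name are mine):

```python
import numpy as np

def kinetic_energy(paths, dt):
    """Estimate K(Q) = E_Q [ int_0^T 0.5 * |w'(t)|^2 dt ] for an empirical
    path measure: paths has shape (n_paths, N + 1, d), and finite
    differences of the sampled paths stand in for the time derivative."""
    vel = np.diff(paths, axis=1) / dt                   # velocities, shape (n_paths, N, d)
    per_path = 0.5 * np.sum(vel**2, axis=(1, 2)) * dt   # int 0.5 * |w'|^2 dt for each path
    return per_path.mean()                              # average over the path measure
```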

Now, you may ask where the dynamics went. It's subtle, but they're not far: they are baked into the optimisation objective over path distributions $Q$.

The result is that the above problem is an optimal transport problem! Granted, it has an uncountably infinite number of marginals, but we can still discretise it. We consider $N + 1$ time steps $(t_0, \ldots, t_N)$ (for convenience, with fixed step size $\Delta t$), and corresponding marginals $(\mu_0, \ldots, \mu_N)$.

Then, the discrete-time counterpart is a multimarginal optimal transport problem
\begin{equation}
\inf_{Q, (\mu_i)} \mathcal{K}(Q) + \sum_{i=0}^N F(\mu_i) + G(\mu_N) \quad \suchthat P_{t_i}Q = \mu_i.
\end{equation}
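For a piecewise-linear path through grid points $(x_{i_0}, \ldots, x_{i_N})$, the kinetic energy is $\sum_{k} |x_{i_{k+1}} - x_{i_k}|^2 / (2\Delta t)$, so the discrete problem has a chain-structured multimarginal cost. Here is a toy sketch that builds this cost tensor explicitly on a tiny 1-D grid; in practice the tensor is never formed, and one would instead exploit the chain structure (e.g. with entropic regularisation and Sinkhorn-type iterations).

```python
import itertools
import numpy as np

def multimarginal_cost_tensor(grid, N, dt):
    """Cost tensor c[i_0, ..., i_N] = sum_k |x_{i_{k+1}} - x_{i_k}|^2 / (2 dt)
    on a 1-D grid: the discrete counterpart of the kinetic energy K(Q) for
    piecewise-linear paths through the grid points. Only viable for tiny
    problems, since the tensor has len(grid) ** (N + 1) entries."""
    grid = np.asarray(grid, dtype=float)
    n = len(grid)
    cost = np.zeros((n,) * (N + 1))
    for idx in itertools.product(range(n), repeat=N + 1):
        xs = grid[list(idx)]                              # the path's grid points
        cost[idx] = np.sum(np.diff(xs) ** 2) / (2 * dt)   # summed squared displacements
    return cost
```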


¹ Meaning $P_t$ takes a measure $Q$ over the space of paths and maps it to the marginal distribution of $\omega(t)$ for $\omega \sim Q$.