
Hamiltonian Monte Carlo

The Hamiltonian Monte Carlo algorithm is a Markov chain Monte Carlo method for obtaining a sequence of random samples whose distribution converges to a target probability distribution that is difficult to sample directly. This sequence can be used to estimate integrals of the target distribution, such as expected values and moments.

Algorithm
Suppose the target distribution to sample is f(\mathbf{x}) for \mathbf{x} \in \mathbb{R}^d (d \ge 1) and a chain of samples \mathbf{X}_0,\mathbf{X}_1,\mathbf{X}_2,\ldots is required.

Hamilton's equations are

:\frac{\text{d} x_i}{\text{d} t}=\frac{\partial H}{\partial p_i} \quad \text{and} \quad \frac{\text{d} p_i}{\text{d} t}=-\frac{\partial H}{\partial x_i}

where x_i and p_i are the ith components of the position and momentum vectors respectively and H is the Hamiltonian. Let M be a symmetric, positive-definite mass matrix; then the Hamiltonian is

:H(\mathbf{x},\mathbf{p})=U(\mathbf{x}) + \dfrac{1}{2}\mathbf{p}^\text{T} M^{-1}\mathbf{p}

where U(\mathbf{x}) is the potential energy. For a target distribution f, the potential energy is

:U(\mathbf{x}) = -\ln f(\mathbf{x}),

which comes from the Boltzmann factor. Note that the Hamiltonian H is dimensionless in this formulation because the exponential probability weight \exp\left(-H\right) has to be well defined. For example, in simulations at finite temperature T, the factor k_{\text{B}}T (with the Boltzmann constant k_{\text{B}}) is absorbed directly into U and M.

The algorithm requires a positive integer for the number of leapfrog steps L and a positive number for the step size \Delta t. Suppose the chain is at \mathbf{X}_n=\mathbf{x}_n. Let \mathbf{x}_n(0) = \mathbf{x}_n. First, a random Gaussian momentum \mathbf{p}_n(0) is drawn from \text{N}\left(\mathbf{0},M\right). Next, the particle runs under Hamiltonian dynamics for time L\Delta t; this is done by solving Hamilton's equations numerically with the leapfrog algorithm.
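As a concrete illustration (not part of the original text), for a one-dimensional standard normal target the potential energy U(\mathbf{x}) = -\ln f(\mathbf{x}) reduces to a harmonic-oscillator potential:

```python
import numpy as np

# Illustration: 1-D standard normal target, f(x) ∝ exp(-x**2 / 2)
# (the function names here are our own, chosen for this sketch).
def f(x):
    return np.exp(-0.5 * x**2)   # unnormalized target density

def U(x):
    return -np.log(f(x))         # U(x) = -ln f(x) = x**2 / 2

def grad_U(x):
    return x                     # dU/dx = x for this target
```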
The position and momentum vectors after time \Delta t using the leapfrog algorithm are:

:\mathbf{p}_n\left(t+\dfrac{\Delta t}{2}\right) = \mathbf{p}_n(t) - \dfrac{\Delta t}{2}\nabla \left.U(\mathbf{x})\right|_{\mathbf{x} = \mathbf{x}_n(t)}
:\mathbf{x}_n(t+\Delta t) = \mathbf{x}_n(t)+\Delta t\, M^{-1}\mathbf{p}_n\left(t+\dfrac{\Delta t}{2}\right)
:\mathbf{p}_n(t+\Delta t) = \mathbf{p}_n\left(t+\dfrac{\Delta t}{2}\right) -\dfrac{\Delta t}{2}\nabla \left.U(\mathbf{x})\right|_{\mathbf{x} = \mathbf{x}_n(t+\Delta t)}

These equations are applied to \mathbf{x}_n(0) and \mathbf{p}_n(0) L times to obtain \mathbf{x}_n(L\Delta t) and \mathbf{p}_n(L\Delta t).

The leapfrog algorithm is an approximate solution to the motion of non-interacting classical particles. If the solution were exact, it would never change the initial randomly generated energy distribution, since energy is conserved for each particle in a classical potential energy field. To reach a thermodynamic equilibrium distribution, the particles must interact with, for example, a surrounding heat bath, so that the entire system can take on different energies with probabilities given by the Boltzmann distribution. One way to move the system towards equilibrium is to change the state of the particles using the Metropolis–Hastings algorithm: first one applies the leapfrog steps, then a Metropolis–Hastings accept/reject step. The transition from \mathbf{X}_n = \mathbf{x}_n to \mathbf{X}_{n+1} is

:\mathbf{X}_{n+1}\mid\mathbf{X}_{n}=\mathbf{x}_n = \begin{cases} \mathbf{x}_{n}(L\Delta t) & \text{with probability } \alpha\left(\mathbf{x}_{n}(0),\mathbf{x}_{n}(L\Delta t)\right) \\ \mathbf{x}_n(0) & \text{otherwise} \end{cases}

where

:\alpha\left(\mathbf{x}_{n}(0),\mathbf{x}_{n}(L\Delta t)\right)= \min\left( 1, \dfrac{ \exp\left[ -H\left( \mathbf{x}_n(L\Delta t),\mathbf{p}_n(L\Delta t) \right) \right] }{ \exp\left[ -H\left( \mathbf{x}_n(0),\mathbf{p}_n(0) \right) \right] } \right).
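The three leapfrog updates can be sketched as a short routine; this is a minimal illustration with a scalar (diagonal) inverse mass matrix, not a production integrator:

```python
import numpy as np

def leapfrog(x, p, grad_U, dt, L, M_inv=1.0):
    """Apply L leapfrog steps to position x and momentum p.

    grad_U : gradient of the potential energy U
    dt     : step size (Delta t)
    L      : number of leapfrog steps
    M_inv  : inverse mass matrix (scalar here for simplicity)
    """
    x = np.asarray(x, dtype=float).copy()
    p = np.asarray(p, dtype=float).copy()
    for _ in range(L):
        p = p - 0.5 * dt * grad_U(x)   # half step in momentum
        x = x + dt * M_inv * p         # full step in position
        p = p - 0.5 * dt * grad_U(x)   # half step in momentum
    return x, p
```

For the harmonic potential U(x) = x^2/2, the exact dynamics are x(t) = cos(t) starting from x = 1, p = 0, and leapfrog approximately conserves the Hamiltonian, which is easy to verify numerically.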
A full update consists of first randomly sampling the momenta \mathbf{p} (independently of the previous iterations), then integrating the equations of motion (e.g. with leapfrog), and finally obtaining the new configuration from the Metropolis–Hastings accept/reject step. This updating mechanism is repeated to obtain \mathbf{X}_{n+1},\mathbf{X}_{n+2},\mathbf{X}_{n+3},\ldots.
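Putting the pieces together, a full HMC update (momentum resampling, leapfrog integration, Metropolis–Hastings accept/reject) can be sketched as follows for a one-dimensional standard normal target. This is a minimal sketch with a unit mass matrix and untuned \Delta t and L, not a reference implementation:

```python
import numpy as np

def hmc(n_samples, x0, U, grad_U, dt=0.2, L=20, seed=0):
    """Minimal 1-D HMC sampler with unit mass matrix (M = 1)."""
    rng = np.random.default_rng(seed)
    x = float(x0)
    samples = np.empty(n_samples)
    for n in range(n_samples):
        p0 = rng.normal()                  # draw momentum ~ N(0, M)
        x_new, p_new = x, p0
        for _ in range(L):                 # leapfrog integration
            p_new -= 0.5 * dt * grad_U(x_new)
            x_new += dt * p_new
            p_new -= 0.5 * dt * grad_U(x_new)
        H0 = U(x) + 0.5 * p0 ** 2          # Hamiltonian before
        H1 = U(x_new) + 0.5 * p_new ** 2   # Hamiltonian after
        # Metropolis–Hastings accept/reject
        if rng.random() < min(1.0, np.exp(H0 - H1)):
            x = x_new
        samples[n] = x
    return samples

# Sample from a standard normal target: U(x) = x**2 / 2
draws = hmc(2000, 0.0, lambda x: 0.5 * x ** 2, lambda x: x)
```

With a well-behaved target like this, the empirical mean and standard deviation of the draws approach 0 and 1.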
No U-Turn Sampler
The No U-Turn Sampler (NUTS) is an extension of Hamiltonian Monte Carlo that controls the number of leapfrog steps L automatically. Tuning L is critical. For example, in the one-dimensional \text{N}(0,1/\sqrt{k}) case (standard deviation 1/\sqrt{k}), the potential is U(x)=kx^2/2, which corresponds to the potential of a simple harmonic oscillator. If L is too large, the particle oscillates and wastes computational time; if L is too small, the particle behaves like a random walk.

Loosely, NUTS runs the Hamiltonian dynamics both forwards and backwards in time at random until a U-turn condition is satisfied. When that happens, a random point from the path is chosen as the MCMC sample and the process is repeated from that new point.

In detail, a binary tree is constructed to trace the path of the leapfrog steps. To produce an MCMC sample, an iterative procedure is conducted. First, a slice variable

:U_n\sim\text{Uniform}\left(0,\exp\left(-H[\mathbf{x}_n(0),\mathbf{p}_n(0)]\right)\right)

is sampled. Let \mathbf{x}_n^+ and \mathbf{p}_n^+ be the position and momentum of the forward particle, and \mathbf{x}_n^- and \mathbf{p}_n^- those of the backward particle. In each iteration, the binary tree selects uniformly at random whether to move the forward particle forwards in time or the backward particle backwards in time, and the number of leapfrog steps doubles with each iteration. For example, in the first iteration the forward particle moves forwards in time using 1 leapfrog step; in the next iteration the backward particle moves backwards in time using 2 leapfrog steps.

The iterative procedure continues until the U-turn condition is met, that is

:(\mathbf{x}_n^+ - \mathbf{x}_n^-)\cdot \mathbf{p}_n^- < 0 \quad \text{or} \quad (\mathbf{x}_n^+ - \mathbf{x}_n^-)\cdot \mathbf{p}_n^+ < 0,

or when the Hamiltonian becomes inaccurate,

:\exp\left[-H(\mathbf{x}_n^{+},\mathbf{p}_n^{+})+\delta\right] < U_n

or

:\exp\left[-H(\mathbf{x}_n^{-},\mathbf{p}_n^{-})+\delta\right] < U_n,

where, for example, \delta = 1000.
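The U-turn condition, (\mathbf{x}_n^+ - \mathbf{x}_n^-)\cdot \mathbf{p}_n^- < 0 or (\mathbf{x}_n^+ - \mathbf{x}_n^-)\cdot \mathbf{p}_n^+ < 0, amounts to checking whether either endpoint's momentum points back toward the other endpoint. A minimal sketch (the function name is our own):

```python
import numpy as np

def u_turn(x_plus, x_minus, p_plus, p_minus):
    """True when the trajectory endpoints have begun moving toward each
    other, i.e. (x+ - x-)·p- < 0 or (x+ - x-)·p+ < 0."""
    dx = np.asarray(x_plus) - np.asarray(x_minus)
    return bool(np.dot(dx, p_minus) < 0 or np.dot(dx, p_plus) < 0)

# Endpoints still moving apart: no U-turn yet
print(u_turn([2.0, 0.0], [-2.0, 0.0], [1.0, 0.0], [1.0, 0.0]))   # False
# Forward momentum now points back toward the backward endpoint
print(u_turn([2.0, 0.0], [-2.0, 0.0], [-1.0, 0.0], [1.0, 0.0]))  # True
```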
Once the U-turn condition is met, the next MCMC sample \mathbf{x}_{n+1} is obtained by sampling uniformly from the leapfrog path traced out by the binary tree, \{\mathbf{x}_n^-,\ldots,\mathbf{x}_n(-\Delta t),\mathbf{x}_n(0),\mathbf{x}_n(\Delta t),\ldots,\mathbf{x}_n^+\}, restricted to the points which satisfy

:U_n < \exp\left[-H(\mathbf{x},\mathbf{p})\right].

This is usually satisfied if the remaining HMC parameters are sensible.