
Itô's lemma

In mathematics, Itô's lemma or Itô's formula is an identity used in Itô calculus to find the differential of a time-dependent function of a stochastic process. It serves as the stochastic calculus counterpart of the chain rule. It can be heuristically derived by forming the Taylor series expansion of the function up to its second derivatives and retaining terms up to first order in the time increment and second order in the Wiener process increment. The lemma is widely employed in mathematical finance, and its best known application is in the derivation of the Black–Scholes equation for option values.

Motivation
Suppose we are given the stochastic differential equation dX_t = \mu_t\ dt + \sigma_t\ dB_t, where B_t is a Wiener process and the functions \mu_t, \sigma_t are deterministic (not stochastic) functions of time. In general, it is not possible to write a solution X_t directly in terms of B_t. However, we can formally write an integral solution X_t = \int_0^t \mu_s\ ds + \int_0^t \sigma_s\ dB_s. This expression lets us easily read off the mean and variance of X_t (which, being Gaussian, is determined entirely by these two moments). First, notice that every \mathrm{d}B_t individually has mean 0, so the expected value of X_t is simply the integral of the drift function: \mathrm E[X_t]=\int_0^t \mu_s\ ds. Similarly, because the dB_s terms have variance ds and no correlation with one another, the variance of X_t is simply the integral of the variance of each infinitesimal step in the random walk: \mathrm{Var}[X_t] = \int_0^t\sigma_s^2\ ds. However, sometimes we are faced with a stochastic differential equation for a more complex process Y_t, in which the process appears on both sides of the differential equation. That is, say dY_t = a_1(Y_t,t) \ dt + a_2(Y_t,t)\ dB_t, for some functions a_1 and a_2. In this case, we cannot immediately write a formal solution as we did for the simpler case above. Instead, we hope to write the process Y_t as a function of a simpler process X_t taking the form above. That is, we want to identify three functions f(t,x), \mu_t, and \sigma_t, such that Y_t=f(t, X_t) and dX_t = \mu_t\ dt + \sigma_t\ dB_t. In practice, Itô's lemma is used to find this transformation. Finally, once we have transformed the problem into the simpler type of problem, we can determine the mean and higher moments of the process.
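The mean and variance formulas above are easy to check numerically. The following sketch (using only Python's standard library; the coefficients \mu_t = t and \sigma_t = 1 are an arbitrary illustration, not from the text) simulates the SDE with an Euler scheme and compares the sample moments with \int_0^T \mu_s\,ds = T^2/2 and \int_0^T \sigma_s^2\,ds = T:

```python
import math
import random

rng = random.Random(0)

# Euler scheme for dX_t = mu_t dt + sigma_t dB_t with the (illustrative)
# deterministic coefficients mu_t = t and sigma_t = 1.
T, n_steps, n_paths = 1.0, 100, 5000
dt = T / n_steps

def simulate_path():
    x = 0.0
    for k in range(n_steps):
        mu_t, sigma_t = k * dt, 1.0
        x += mu_t * dt + sigma_t * rng.gauss(0.0, math.sqrt(dt))
    return x

samples = [simulate_path() for _ in range(n_paths)]
mean = sum(samples) / n_paths
var = sum((s - mean) ** 2 for s in samples) / n_paths

# Theory: E[X_T] = T²/2 = 0.5 and Var[X_T] = T = 1 (up to Monte Carlo noise)
print(mean, var)
```

The sample mean and variance agree with the integrals of the drift and squared diffusion coefficients up to Monte Carlo and discretisation error.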
Derivation
We derive Itô's lemma by expanding a Taylor series and applying the rules of stochastic calculus. Suppose X_t is an Itô drift-diffusion process that satisfies the stochastic differential equation dX_t= \mu_t \, dt + \sigma_t \, dB_t, where B_t is a Wiener process. If f(t,x) is a twice-differentiable scalar function, its expansion in a Taylor series is \begin{align} \frac{\Delta f(t)}{dt}dt &= f(t+dt, x) - f(t,x) \\ &= \frac{\partial f}{\partial t}\,dt + \frac{1}{2}\frac{\partial^2 f}{\partial t^2}\,(dt)^2 + \cdots \\[1ex] \frac{\Delta f(x)}{dx}dx &= f(t, x+dx) - f(t,x) \\ &= \frac{\partial f}{\partial x}\,dx + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\,(dx)^2 + \cdots \end{align} Then use the total derivative and the definition of the partial derivative f_y=\lim_{dy\to0}\frac{\Delta f(y)}{dy}: \begin{align} df &= f_t dt + f_x dx \\[1ex] &= \lim_{dx \to 0 \atop dt \to 0} \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\,dx + \frac{1}{2} \left(\frac{\partial^2 f}{\partial t^2}\,(dt)^2 + \frac{\partial^2 f}{\partial x^2}\,(dx)^2\right) + \cdots . \end{align} Substituting x=X_t and therefore dx=dX_t=\mu_t\,dt + \sigma_t\,dB_t, we get \begin{align} df = \lim_{dB_t \to 0 \atop dt \to 0} \; & \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x} \left(\mu_t\,dt + \sigma_t\,dB_t\right) \\ &+ \frac{1}{2} \left[ \frac{\partial^2 f}{\partial t^2}\,{\left(dt\right)}^2 + \frac{\partial^2 f}{\partial x^2} \left (\mu_t^2\,{\left(dt\right)}^2 + 2 \mu_t \sigma_t \, dt \, dB_t + \sigma_t^2 \, {\left(dB_t\right)}^2 \right ) \right] + \cdots. \end{align} In the limit dt\to0, the terms (dt)^2 and dt\,dB_t tend to zero faster than dt. 
(dB_t)^2 is O(dt) (due to the quadratic variation of a Wiener process, which says B_t^2=O(t)), so setting the (dt)^2, dt\,dB_t and higher-order terms to zero and substituting dt for (dB_t)^2, and then collecting the dt terms, we obtain df = \lim_{dt\to0}\left(\frac{\partial f}{\partial t} + \mu_t\frac{\partial f}{\partial x} + \frac{\sigma_t^2}{2}\frac{\partial^2 f}{\partial x^2}\right)dt + \sigma_t\frac{\partial f}{\partial x}\,dB_t as required. Alternatively, df = \lim_{dt\to0}\left(\frac{\partial f}{\partial t} + \frac{\sigma_t^2}{2}\frac{\partial^2 f}{\partial x^2}\right)dt + \frac{\partial f}{\partial x}\,dX_t
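The resulting formula can be checked pathwise on a simple example. The sketch below (an illustration, not part of the derivation) takes f(x) = x^2 and dX_t = dB_t, so the lemma predicts df = dt + 2X_t\,dB_t; accumulating these increments along a simulated Brownian path should reproduce f(X_T) - f(X_0):

```python
import math
import random

rng = random.Random(1)

# f(x) = x², dX_t = dB_t (mu = 0, sigma = 1), so Itô's lemma gives
# df = (f_t + mu f_x + sigma²/2 f_xx) dt + sigma f_x dB = dt + 2 X dB.
T, n = 1.0, 100_000
dt = T / n

x, ito_sum = 0.0, 0.0
for _ in range(n):
    dB = rng.gauss(0.0, math.sqrt(dt))
    ito_sum += dt + 2.0 * x * dB   # increments predicted by the lemma
    x += dB                        # advance the Brownian path

# Pathwise, f(X_T) - f(X_0) = X_T² should match the accumulated increments
print(abs(x ** 2 - ito_sum))
```

The discrepancy comes only from the fluctuation of \sum (dB)^2 around T, which shrinks as the step size decreases.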
Geometric intuition
[Figure: When X_{t+dt} is a Gaussian random variable, f(X_{t+dt}) is also an approximately Gaussian random variable, but its mean E[f(X_{t+dt})] differs from f(E[X_{t+dt}]) by a factor proportional to f''(E[X_{t+dt}]) and the variance of X_{t+dt}.]

Suppose we know that X_t, X_{t+dt} are two jointly Gaussian distributed random variables, and f is nonlinear but has a continuous second derivative. Then in general, neither of f(X_t), f(X_{t+dt}) is Gaussian, and their joint distribution is also not Gaussian. However, since X_{t+dt} \mid X_t is Gaussian, we might still find that f(X_{t+dt}) \mid f(X_t) is Gaussian. This is not true when dt is finite, but it becomes true in the limit as dt becomes infinitesimal. The key idea is that X_{t+dt} = X_t + \mu_t \, dt + \sigma_t \, dB_t has a deterministic part and a noisy part. When f is nonlinear, the noisy part makes a deterministic contribution. If f is convex, then the deterministic contribution is positive (by Jensen's inequality). To find out how large the contribution is, we write X_{t + dt} = X_t + \mu_t \, dt + \sigma_t \sqrt{dt} \, z, where z is a standard Gaussian, then perform a Taylor expansion. \begin{aligned} f(X_{t+dt}) ={}& f(X_t) + f'(X_t) \mu_t \, dt + f'(X_t) \sigma_t \sqrt{dt} \, z \\[1ex] & + \frac{1}{2} f''(X_t) \left(\sigma_t^2 z^2 \, dt + 2 \mu_t \sigma_t z \, dt^{3/2} + \mu_t^2 dt^2\right) + o(dt) \\[2ex] ={}& \left[f(X_t) + f'(X_t) \mu_t \, dt + \frac{1}{2} f''(X_t) \sigma_t^2 \, dt + o(dt)\right] \\[1ex] & + \left[f'(X_t)\sigma_t \sqrt{dt} \, z + \frac{1}{2} f''(X_t) \sigma_t^2 \left(z^2 - 1\right) \, dt + o(dt)\right] \end{aligned}We have split the expansion into two parts: a deterministic part, and a random part with mean zero. The random part is non-Gaussian, but the non-Gaussian parts decay faster than the Gaussian part, and in the dt \to 0 limit, only the Gaussian part remains. 
The deterministic part has the expected f(X_t) + f'(X_t) \mu_t \, dt, but also a part contributed by the convexity: \frac{1}{2} f''(X_t) \sigma_t^2 \, dt. To understand why there should be a contribution due to convexity, consider the simplest case of geometric Brownian walk (of the stock market): S_{t+dt} = S_t ( 1 + dB_t). In other words, d(\ln S_t) = dB_t. Let X_t = \ln S_t, then S_t = e^{X_t}, and X_t is a Brownian walk. However, although the expectation of X_t remains constant, the expectation of S_t grows. Intuitively, this is because the downside is limited at zero, but the upside is unlimited. That is, while X_t is normally distributed, S_t is log-normally distributed.
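This convexity effect is easy to see numerically. In the sketch below (purely illustrative), X_t = B_t is a driftless Brownian motion and S_t = e^{X_t}; the sample mean of X_t stays near 0, while the sample mean of S_t is near e^{t/2}, the growth predicted by the \frac{1}{2} f'' \sigma^2 correction:

```python
import math
import random

rng = random.Random(2)

# X_t = B_t (driftless), S_t = exp(X_t). E[X_t] = 0 for all t, but by
# Itô's lemma dS = S dB + (1/2) S dt, so E[S_t] = exp(t/2).
t, n_paths = 1.0, 50_000
xs = [rng.gauss(0.0, math.sqrt(t)) for _ in range(n_paths)]

mean_x = sum(xs) / n_paths
mean_s = sum(math.exp(x) for x in xs) / n_paths

# mean_x should be near 0; mean_s should be near exp(0.5)
print(mean_x, mean_s)
```

Even though each sample of X_t is as likely to be below zero as above it, exponentiation stretches the upside more than the limited downside, so the mean of S_t exceeds e^{E[X_t]} = 1.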
Mathematical formulation of Itô's lemma
In the following subsections we discuss versions of Itô's lemma for different types of stochastic processes.

Itô drift-diffusion processes (due to Kunita–Watanabe)

In its simplest form, Itô's lemma states the following: for an Itô drift-diffusion process dX_t= \mu_t \, dt + \sigma_t \, dB_t and any twice differentiable scalar function f(t,x) of two real variables t and x, one has df(t,X_t) =\left(\frac{\partial f}{\partial t} + \mu_t \frac{\partial f}{\partial x} + \frac{\sigma_t^2}{2}\frac{\partial^2f}{\partial x^2}\right)dt+ \sigma_t \frac{\partial f}{\partial x}\,dB_t. This immediately implies that f(t,X_t) is itself an Itô drift-diffusion process. In higher dimensions, if \mathbf{X}_t = (X^1_t, X^2_t, \ldots, X^n_t)^T is a vector of Itô processes such that d\mathbf{X}_t = \boldsymbol{\mu}_t\, dt + \mathbf{G}_t\, d\mathbf{B}_t for a vector \boldsymbol{\mu}_t and matrix \mathbf{G}_t, Itô's lemma then states that \begin{align} df(t,\mathbf{X}_t) &= \frac{\partial f}{\partial t}\, dt + \left (\nabla_\mathbf{X} f \right )^T\, d\mathbf{X}_t + \frac{1}{2} \left(d\mathbf{X}_t \right )^T \left( H_\mathbf{X} f \right) \, d\mathbf{X}_t, \\[4pt] &= \left\{ \frac{\partial f}{\partial t} + \left (\nabla_\mathbf{X} f \right)^T \boldsymbol{\mu}_t + \frac{1}{2} \operatorname{Tr} \left[ \mathbf{G}_t^T \left( H_\mathbf{X} f \right) \mathbf{G}_t \right] \right\} \, dt + \left (\nabla_\mathbf{X} f \right)^T \mathbf{G}_t\, d\mathbf{B}_t \end{align} where \nabla_\mathbf{X} f is the gradient of f w.r.t. \mathbf{X}, H_\mathbf{X} f is the Hessian matrix of f w.r.t. \mathbf{X}, and \operatorname{Tr} is the trace operator.

Poisson jump processes

We may also define functions on discontinuous stochastic processes. Let h be the jump intensity. The Poisson process model for jumps is that the probability of one jump in the interval [t, t + \Delta t] is h \, \Delta t plus higher order terms. h could be a constant, a deterministic function of time, or a stochastic process. The survival probability p_s(t) is the probability that no jump has occurred in the interval [0, t]. 
The change in the survival probability is d p_s(t) = -p_s(t) h(t) \, dt, so p_s(t) = \exp \left(-\int_0^t h(u) \, du \right). Let S(t) be a discontinuous stochastic process. Write S(t^-) for the value of S as we approach t from the left. Write d_j S(t) for the non-infinitesimal change in S(t) as a result of a jump. Then d_j S(t) = \lim_{\Delta t \to 0} \left[S(t + \Delta t) - S(t^-)\right]. Let z be the magnitude of the jump and let \eta(S(t^-),z) be its distribution. The expected magnitude of the jump is \operatorname{E}[d_j S(t)]=h(S(t^-)) \, dt \int_z z \eta(S(t^-),z) \, dz. Now, define the compensated process J_S(t) associated with S(t), which simply means that we subtract off the mean change in S(t) so that J_S(t) is a martingale. Hence the increment to J_S(t) is: \begin{align} dJ_S(t) &= d_j S(t) - \operatorname{E}[d_j S(t)] \\[1ex] &= S(t)-S(t^-) - \left ( h(S(t^-))\int_z z \eta \left (S(t^-),z \right) \, dz \right ) \, dt. \end{align} Then \begin{align} d_j S(t) &= E[d_j S(t)] + d J_S(t) \\[1ex] &= h(S(t^-)) \left(\int_z z \eta(S(t^-),z) \, dz \right) dt + d J_S(t). \end{align} Consider a function g(S(t),t) of the jump process S(t). If S(t) jumps by \Delta s, then g(t) jumps by \Delta g. \Delta g is drawn from the distribution \eta_g(\cdot), which may depend on g(t^-), dg and S(t^-). The jump part of g is g(t)-g(t^-) =h(t) \, dt \int_{\Delta g} \, \Delta g \eta_g(\cdot) \, d\Delta g + d J_g(t). If S contains drift, diffusion and jump parts, then Itô's lemma for g(S(t),t) is \begin{align} dg(t) ={}& \left( \frac{\partial g}{\partial t}+\mu \frac{\partial g}{\partial S}+\frac{\sigma^2}{2} \frac{\partial^2 g}{\partial S^2} + h(t) \int_{\Delta g} \left (\Delta g \eta_g(\cdot) \, d{\Delta}g \right ) \, \right) dt \\ & + \frac{\partial g}{\partial S} \sigma \, dW(t) + dJ_g(t). \end{align} Itô's lemma for a process which is the sum of a drift-diffusion process and a jump process is just the sum of the Itô's lemma for the individual parts. 
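The compensator construction can be illustrated with the simplest case: a Poisson process N(t) with constant intensity h. The sketch below (illustrative; constant intensity, arbitrary parameters) checks that the compensated process N(T) - hT has sample mean near zero, and that the empirical survival probability matches e^{-hT}:

```python
import math
import random

rng = random.Random(5)

# Poisson process with constant intensity h: the compensated process
# J(t) = N(t) - h t is a martingale, so E[J(T)] = 0, and the survival
# probability (no jump in [0, T]) is exp(-h T).
h, T, n_paths = 1.5, 2.0, 20_000

counts = []
for _ in range(n_paths):
    t, k = 0.0, 0
    while True:
        t += rng.expovariate(h)   # exponential inter-arrival times
        if t > T:
            break
        k += 1
    counts.append(k)

mean_J = sum(k - h * T for k in counts) / n_paths
survival = sum(1 for k in counts if k == 0) / n_paths

# mean_J should be near 0; survival should be near exp(-h T)
print(mean_J, survival)
```

Subtracting the mean change h\,dt at every instant is exactly what makes the compensated jump process a martingale, which is why the compensator appears in the jump-diffusion form of the lemma.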
Discontinuous semimartingales

Itô's lemma can also be applied to general d-dimensional semimartingales, which need not be continuous. In general, a semimartingale is a càdlàg process, and an additional jump term needs to be added to the formula. For any càdlàg process Y_t, the left limit in t is denoted by Y_{t-}, which is a left-continuous process. The jumps are written as \Delta Y_t = Y_t - Y_{t-}. Then, Itô's lemma states that if X = (X^1, X^2, \ldots, X^d) is a d-dimensional semimartingale and f is a twice continuously differentiable real valued function on \mathbb{R}^d, then f(X) is a semimartingale, and \begin{align} f(X_t) = f(X_0) &+ \sum_{i=1}^d\int_0^t f_{i}(X_{s-}) \, dX^i_s + \frac{1}{2}\sum_{i,j=1}^d \int_0^t f_{i,j}(X_{s-})\,d[X^i,X^j]_s \\ &+ \sum_{s\le t} \left(\Delta f(X_s)-\sum_{i=1}^df_{i}(X_{s-})\,\Delta X^i_s - \frac{1}{2}\sum_{i,j=1}^d f_{i,j}(X_{s-})\,\Delta X^i_s \, \Delta X^j_s\right). \end{align} This differs from the formula for continuous semimartingales by the last term summing over the jumps of X, which ensures that the jump of the right hand side at time t is \Delta f(X_t).
Examples
Geometric Brownian motion

A process S is said to follow a geometric Brownian motion with constant volatility σ and constant drift μ if it satisfies the stochastic differential equation dS_t = \sigma S_t\,dB_t + \mu S_t\,dt, for a Brownian motion B. Applying Itô's lemma with f(S_t) = \log(S_t) gives \begin{align} df & = f'(S_t) \, dS_t + \frac{1}{2} f'' (S_t) \, {\left(dS_t\right)}^2 \\[4pt] & = \frac{1}{S_t}\,dS_t + \frac{1}{2} \left(-S_t^{-2}\right) \left(S_t^2 \sigma^2 \, dt\right) \\[4pt] & = \frac{1}{S_t} \left( \sigma S_t \, dB_t + \mu S_t \, dt\right) - \frac{\sigma^2}{2} \, dt \\[4pt] &= \sigma \, dB_t + \left(\mu - \tfrac{\sigma^2}{2} \right) dt. \end{align} It follows that \log (S_t) = \log (S_0) + \sigma B_t + \left (\mu-\tfrac{\sigma^2}{2} \right )t, and exponentiating gives the expression for S_t, S_t = S_0 \exp\left(\sigma B_t + \left (\mu - \tfrac{\sigma^2}{2} \right )t\right). The correction term of -\tfrac{\sigma^2}{2}\,t corresponds to the difference between the median and mean of the log-normal distribution, or equivalently for this distribution, the geometric mean and arithmetic mean, with the median (geometric mean) being lower. This is due to the AM–GM inequality, and corresponds to the logarithm being concave, so the correction term can accordingly be interpreted as a convexity correction. This is an infinitesimal version of the fact that the annualized return is less than the average return, with the difference proportional to the variance. See geometric moments of the log-normal distribution for further discussion. The same factor of \tfrac{\sigma^2}{2} appears in the d_1 and d_2 auxiliary variables of the Black–Scholes formula, and can be interpreted as a consequence of Itô's lemma.

Doléans-Dade exponential

The Doléans-Dade exponential (or stochastic exponential) of a continuous semimartingale X can be defined as the solution Y to the SDE dY_t = Y_t \, dX_t with initial condition Y_0 = 1. It is sometimes denoted by \mathcal{E}(X). 
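The closed-form solution for geometric Brownian motion obtained above can be verified against a direct Euler–Maruyama discretisation of the SDE driven by the same Brownian increments (an illustrative sketch; the parameter values are arbitrary):

```python
import math
import random

rng = random.Random(3)

# dS = mu S dt + sigma S dB, with closed form from Itô's lemma:
# S_T = S_0 exp(sigma B_T + (mu - sigma²/2) T).
mu, sigma, s0 = 0.05, 0.2, 100.0
T, n = 1.0, 100_000
dt = T / n

s_euler, b = s0, 0.0
for _ in range(n):
    dB = rng.gauss(0.0, math.sqrt(dt))
    s_euler += s_euler * (mu * dt + sigma * dB)   # Euler–Maruyama step
    b += dB                                       # accumulate B_T

s_exact = s0 * math.exp(sigma * b + (mu - 0.5 * sigma ** 2) * T)
rel_err = abs(s_euler - s_exact) / s_exact
print(rel_err)   # small discretisation error
```

Because both quantities are driven by the same simulated path, the only difference is the discretisation error of the Euler scheme, which vanishes as the step size shrinks. Note that omitting the -\sigma^2/2 correction in the exponent would produce a systematic mismatch, not just noise.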
Applying Itô's lemma with f(Y) = \log(Y) gives \begin{align} d\log(Y) &= \frac{1}{Y}\,dY -\frac{1}{2Y^2}\,d[Y] \\[6pt] &= dX - \tfrac{1}{2}\,d[X]. \end{align} Exponentiating gives the solution Y_t = \exp\left(X_t-X_0-\tfrac{1}{2} [X]_t\right).

Black–Scholes formula

Itô's lemma can be used to derive the Black–Scholes equation for an option. Suppose a stock price follows a geometric Brownian motion given by the stochastic differential equation dS_t = \sigma S_t\,dB_t + \mu S_t\,dt. Then, if the value of an option at time t is f(t, S_t), Itô's lemma gives df(t,S_t) = \left(\frac{\partial f}{\partial t} + \frac{1}{2}\left(S_t\sigma\right)^2\frac{\partial^2 f}{\partial S^2}\right)\,dt +\frac{\partial f}{\partial S}\,dS_t. The term \frac{\partial f}{\partial S}\,dS_t represents the change in value in time dt of a trading strategy consisting of holding an amount \frac{\partial f}{\partial S} of the stock. If this trading strategy is followed, and any cash held is assumed to grow at the risk free rate r, then the total value V of this portfolio satisfies the SDE dV_t = r\left(V_t-\frac{\partial f}{\partial S}S_t\right)\,dt + \frac{\partial f}{\partial S}\,dS_t. This strategy replicates the option if V = f(t,S_t). Combining these equations gives the celebrated Black–Scholes equation \frac{\partial f}{\partial t} + \frac{\sigma^2S^2}{2}\frac{\partial^2 f}{\partial S^2} + rS\frac{\partial f}{\partial S}-rf = 0.

Product rule for Itô processes

Let \mathbf X_t be a two-dimensional Itô process with SDE: d\mathbf X_t = d\begin{pmatrix} X_t^1 \\ X_t^2 \end{pmatrix} = \begin{pmatrix} \mu_t^1 \\ \mu_t^2 \end{pmatrix} dt + \begin{pmatrix} \sigma_t^1 \\ \sigma_t^2 \end{pmatrix} \, dB_t Then we can use the multi-dimensional form of Itô's lemma to find an expression for d(X_t^1X_t^2). We have \mu_t = \begin{pmatrix} \mu_t^1 \\ \mu_t^2 \end{pmatrix} and \mathbf G_t = \begin{pmatrix} \sigma_t^1 \\ \sigma_t^2 \end{pmatrix}. 
We set f(t,\mathbf X_t) = X_t^1 X_t^2 and observe that \frac{\partial f}{\partial t} = 0, (\nabla_\mathbf X f)^T = \begin{pmatrix} X_t^2 & X_t^1 \end{pmatrix}, and H_\mathbf X f = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. Substituting these values in the multi-dimensional version of the lemma gives us: \begin{align} d(X_t^1X_t^2) &= df(t, \mathbf X_t) \\ &= 0 \cdot dt + \begin{pmatrix} X_t^2 & X_t^1 \end{pmatrix} \, d\mathbf X_t + \frac{1}{2} \begin{pmatrix} dX_t^1 & dX_t^2 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} dX_t^1 \\ dX_t^2 \end{pmatrix} \\[1ex] &= X_t^2 \, dX_t^1 + X^1_t \, dX_t^2 + dX_t^1 \, dX_t^2 \end{align} This is a generalisation of Leibniz's product rule to Itô processes, which are non-differentiable. Further, using the second form of the multidimensional version above gives us \begin{align} d(X_t^1 X_t^2) &= \left\{ 0 + \begin{pmatrix} X_t^2 & X_t^1 \end{pmatrix} \begin{pmatrix} \mu_t^1 \\ \mu_t^2 \end{pmatrix} + \frac{1}{2} \operatorname{Tr} \left[ \begin{pmatrix} \sigma_t^1 & \sigma_t^2 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \sigma_t^1 \\ \sigma_t^2 \end{pmatrix} \right] \right\} dt \\[1ex] & \qquad + \left(X_t^2 \sigma_t^1 + X^1_t \sigma_t^2\right) dB_t\\[2ex] &= \left(X_t^2 \mu_t^1 + X^1_t \mu_t^2 + \sigma_t^1\sigma_t^2\right) dt + \left(X_t^2 \sigma_t^1 + X^1_t \sigma_t^2\right) dB_t \end{align} so we see that the product X_t^1X_t^2 is itself an Itô drift-diffusion process.
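The product rule can be verified directly, since the discrete identity (X^1+dX^1)(X^2+dX^2) - X^1X^2 = X^2\,dX^1 + X^1\,dX^2 + dX^1\,dX^2 holds exactly at every step; what Itô's lemma adds is that the cross term dX^1\,dX^2 contributes \sigma^1\sigma^2\,dt rather than vanishing in the limit. A sketch with arbitrary illustrative coefficients:

```python
import math
import random

rng = random.Random(4)

# dX¹ = 0.1 dt + 0.3 dB and dX² = -0.2 dt + 0.5 dB, driven by the same B.
T, n = 1.0, 100_000
dt = T / n

x1, x2 = 1.0, 2.0
rhs = 0.0   # accumulated product-rule increments
for _ in range(n):
    dB = rng.gauss(0.0, math.sqrt(dt))
    dx1 = 0.1 * dt + 0.3 * dB
    dx2 = -0.2 * dt + 0.5 * dB
    rhs += x2 * dx1 + x1 * dx2 + dx1 * dx2   # note the dX¹ dX² cross term
    x1 += dx1
    x2 += dx2

# X¹_T X²_T - X¹_0 X²_0 equals the accumulated increments (up to rounding)
print(abs(x1 * x2 - 1.0 * 2.0 - rhs))
```

Dropping the `dx1 * dx2` term, as the classical Leibniz rule would suggest, leaves a systematic error of roughly \sigma^1\sigma^2\,T = 0.15 here, which is exactly the extra drift term in the final formula above.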
Itô's formula for functions with finite quadratic variation
Hans Föllmer provided a non-probabilistic proof of the Itô formula and showed that it holds for all functions with finite quadratic variation. Let f\in C^2 be a real-valued function and x:[0,\infty)\to \mathbb{R} a right-continuous function with left limits and finite quadratic variation [x]. Then \begin{align} f(x_t) = f(x_0) &+ \int_0^t f'(x_{s-}) \, \mathrm{d}x_s + \frac{1}{2}\int_{]0,t]} f''(x_{s-}) \, \mathrm{d}[x]_s \\ & + \sum_{0\leq s\leq t}\left[f(x_s)-f(x_{s-})-f'(x_{s-})\Delta x_s - \frac{1}{2} f''(x_{s-})(\Delta x_s)^2\right], \end{align} where the quadratic variation of x is defined as a limit along a sequence of partitions D_n of [0,t] with step decreasing to zero: [x](t) = \lim_{n\to\infty} \sum_{t^n_k \in D_n} \left(x_{t^n_{k+1}} - x_{t^n_k}\right)^2.
Higher-order Itô formula
Rama Cont and Nicolas Perkowski extended the Itô formula to functions with finite p-th variation, where p\geq 2 is an arbitrarily large integer. Given a continuous function with finite p-th variation [x]^p(t) = \lim_{n\to\infty} \sum_{t^n_k \in D_n} {\left(x_{t^n_{k+1}} - x_{t^n_k}\right)}^p, Cont and Perkowski's change of variable formula states that for any f\in C^p(\mathbb{R}^d,\mathbb{R}): f(x_t) = f(x_0)+\int_0^t \nabla_{p-1}f(x_{s-}) \, \mathrm{d}x_s + \frac{1}{p!}\int_{]0,t]} f^{(p)}(x_{s-}) \, \mathrm{d}[x]^p_s, where the first integral is defined as a limit of compensated left Riemann sums along a sequence of partitions D_n: \int_0^t \nabla_{p-1}f(x_{s-}) \, \mathrm{d}x_s := \lim_{n\to\infty} \sum_{t^n_j\in D_n} \sum_{k=1}^{p-1} \frac{f^{(k)}(x_{t_j^n})}{k!} \left(x_{t^n_{j+1}} - x_{t^n_j}\right)^k. An extension to the case of fractional regularity (non-integer p) was obtained by Cont and Jin.
Infinite-dimensional formulas
There exist some extensions to infinite-dimensional spaces (e.g. Pardoux, Gyöngy-Krylov, Brzezniak-van Neerven-Veraar-Weis).