There are several ways to arrive at the correct expression for four-momentum. One way is to first define the four-velocity u^\mu = dx^\mu/d\tau and simply define p^\mu = mu^\mu, being content that it is a four-vector with the correct units and correct behavior. Another, more satisfactory, approach is to begin with the
principle of least action and use the
Lagrangian framework to derive the four-momentum, including the expression for the energy. One may at once, using the observations detailed below, define four-momentum from the
action S. Given that in general for a closed system with generalized coordinates q_i and canonical momenta p_i,

p_i = \frac{\partial S}{\partial q_i} = \frac{\partial S}{\partial x_i}, \quad E = -\frac{\partial S}{\partial t} = - c \cdot \frac{\partial S}{\partial x^{0}},

it is immediate (recalling x^0 = ct, x^1 = x, x^2 = y, x^3 = z and x_0 = -x^0, x_1 = x^1, x_2 = x^2, x_3 = x^3 in the present metric convention) that

p_\mu =\frac{\partial S}{\partial x^\mu} = \left(-{E \over c}, \mathbf p\right)

is a covariant four-vector with the three-vector part being the canonical momentum.

Consider initially a system of one degree of freedom q. In the derivation of the
equations of motion from the action using
Hamilton's principle, one finds (generally) in an intermediate stage for the
variation of the action,

\delta S = \left. \left[ \frac{\partial L}{\partial \dot q}\delta q\right]\right|_{t_1}^{t_2} + \int_{t_1}^{t_2} \left( \frac{\partial L}{\partial q} - \frac{d}{dt} \frac{\partial L}{\partial \dot q}\right)\delta q \, dt.

The assumption is then that the varied paths satisfy \delta q(t_1) = \delta q(t_2) = 0, from which Lagrange's equations follow at once. When the equations of motion are known (or simply assumed to be satisfied), one may let go of the requirement \delta q(t_2) = 0. In this case the path is assumed to satisfy the equations of motion, and the action is a function of the endpoint q(t_2) at the upper integration limit, but t_2 is still fixed. The above equation becomes, with S = S(q), defining \delta q(t_2) = \delta q, and allowing for more degrees of freedom,

\delta S = \sum_i \frac{\partial L}{\partial \dot{q}_i}\delta q_i = \sum_i p_i \delta q_i.

Observing that

\delta S = \sum_i \frac{\partial S}{\partial {q}_i}\delta q_i,

one concludes

p_i = \frac{\partial S}{\partial q_i}.

In a similar fashion, keep the endpoints fixed, but let t_2 = t vary. This time, the system is allowed to move through configuration space at "arbitrary speed" or with "more or less energy", the equations of motion still being assumed to hold. The variation could be carried out on the integral, but instead observe that

\frac{dS}{dt} = L

by the
fundamental theorem of calculus. Compute, using the above expression for canonical momenta,

\frac{dS}{dt} = \frac{\partial S}{\partial t} + \sum_i \frac{\partial S}{\partial q_i}\dot{q}_i = \frac{\partial S}{\partial t} + \sum_i p_i\dot{q}_i = L.

Now using

H = \sum_i p_i \dot{q}_i - L,

where H is the Hamiltonian, leads to, since E = H in the present case,

E = H = -\frac{\partial S}{\partial t}.

Incidentally, using H = H(q, p, t) with p = \frac{\partial S}{\partial q} in the above equation yields the Hamilton–Jacobi equations. In this context, S is called Hamilton's principal function.

----

The action S is given by

S = -mc\int ds = \int L dt, \quad L = -mc^2\sqrt{1 - \frac{v^2}{c^2}},

where L is the relativistic
Lagrangian for a free particle. From this, the variation of the action is

\delta S = -mc\int \delta ds.

To calculate \delta ds, observe first that \delta ds^2 = 2ds\delta ds and that

\delta ds^2 = \delta \eta_{\mu\nu}dx^\mu dx^\nu = \eta_{\mu\nu} \left(\delta \left(dx^\mu\right) dx^\nu + dx^\mu \delta \left(dx^\nu\right)\right) = 2\eta_{\mu\nu} \delta \left(dx^\mu\right) dx^\nu.

So

\delta ds = \eta_{\mu\nu} \delta dx^\mu \frac{dx^\nu}{ds} = \eta_{\mu\nu} d\delta x^\mu \frac{dx^\nu}{ds},

or

\delta ds = \eta_{\mu\nu} \frac{d\delta x^\mu}{d\tau} \frac{dx^\nu}{cd\tau}d\tau,

and thus

\delta S = -m\int \eta_{\mu\nu} \frac{d\delta x^\mu}{d\tau} \frac{dx^\nu}{d\tau}d\tau = -m\int \eta_{\mu\nu} \frac{d\delta x^\mu}{d\tau} u^\nu d\tau = -m\int \eta_{\mu\nu} \left[\frac{d}{d\tau} \left(\delta x^\mu u^\nu\right) - \delta x^\mu\frac{d}{d\tau}u^\nu\right] d\tau,

which is just

\delta S = \left[ -mu_\mu\delta x^\mu\right]_{t_1}^{t_2} + m\int_{t_1}^{t_2}\delta x^\mu\frac{du_\mu}{ds}ds = -mu_\mu\delta x^\mu = \frac{\partial S}{\partial x^\mu}\delta x^\mu = -p_\mu\delta x^\mu,

where the second step employs the equations of motion du_\mu/ds = 0, (\delta x^\mu)_{t_1} = 0, and (\delta x^\mu)_{t_2} \equiv \delta x^\mu as in the observations above. Now compare the last three expressions to find

p^\mu = -\partial^\mu[S] = -\frac{\partial S}{\partial x_\mu} = mu^\mu = m\left(\frac{c}{\sqrt{1 - \frac{v^2}{c^2}}}, \frac{v_x}{\sqrt{1 - \frac{v^2}{c^2}}}, \frac{v_y}{\sqrt{1 - \frac{v^2}{c^2}}}, \frac{v_z}{\sqrt{1 - \frac{v^2}{c^2}}}\right),

with norm -m^2c^2, and the famed result for the relativistic energy,

{{Equation box 1 |indent =: |title= |equation = E = \frac{mc^2}{\sqrt{1 - \frac{v^2}{c^2}}} = m_{r}c^2, |cellpadding = 6 |border |border colour = #0073CF |bgcolor = #F9FFF7 }}

where m_r is the now unfashionable
relativistic mass, follows. By comparing the expressions for momentum and energy directly, one has {{Equation box 1 |indent =: |title= |equation = \mathbf p = E\frac{\mathbf v}{c^2}, |cellpadding= 6 |border |border colour = #0073CF |bgcolor = #F9FFF7 }} that holds for massless particles as well. Squaring the expressions for energy and three-momentum and relating them gives the
energy–momentum relation, {{Equation box 1 |indent =: |title= |equation = \frac{E^2}{c^2} = \mathbf p \cdot \mathbf p + m^2c^2. |cellpadding= 6 |border |border colour = #0073CF |bgcolor=#F9FFF7 }} Substituting p_\mu \leftrightarrow -\frac{\partial S}{\partial x^\mu} in the equation for the norm gives the relativistic
Hamilton–Jacobi equation, {{Equation box 1 |indent =: |title= |equation = \eta^{\mu\nu}\frac{\partial S}{\partial x^\mu}\frac{\partial S}{\partial x^\nu} = -m^2c^2. |cellpadding= 6 |border |border colour = #0073CF |bgcolor=#F9FFF7 }} It is also possible to derive the results from the Lagrangian directly. By definition, \begin{align} \mathbf p &= \frac{\partial L}{\partial \mathbf v} = \left({\partial L\over \partial \dot x}, {\partial L\over\partial \dot y}, {\partial L\over\partial \dot z}\right) = m(\gamma v_x, \gamma v_y, \gamma v_z) = m\gamma \mathbf v = m \mathbf u , \\[3pt] E &= \mathbf p \cdot \mathbf v - L = \frac{mc^2}{\sqrt{1 - \frac{v^2}{c^2}}}, \end{align} which constitute the standard formulae for canonical momentum and energy of a closed (time-independent Lagrangian) system. With this approach it is less clear that the energy and momentum are parts of a four-vector. The energy and the three-momentum are
separately conserved quantities for isolated systems in the Lagrangian framework. Hence four-momentum is conserved as well. More on this below.

More pedestrian approaches rely on the expected behavior in electrodynamics. In this approach, the starting point is application of the
Lorentz force law and
Newton's second law in the rest frame of the particle. The transformation properties of the electromagnetic field tensor, including invariance of
electric charge, are then used to transform to the lab frame, and the resulting expression (again the Lorentz force law) is interpreted in the spirit of Newton's second law, leading to the correct expression for the relativistic three-momentum. The disadvantage, of course, is that it is not immediately clear that the result applies to all particles, whether charged or not, and that it does not yield the complete four-vector.

It is also possible to avoid electromagnetism and use well-tuned thought experiments involving well-trained physicists throwing billiard balls, utilizing knowledge of the
velocity addition formula and assuming conservation of momentum. This too gives only the three-vector part.

== Conservation of four-momentum ==
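As a concrete illustration of what conservation of four-momentum asserts, the following minimal Python sketch does the bookkeeping for a hypothetical symmetric decay. A parent of mass M at rest splits into two daughters of equal mass m moving in opposite directions along the x-axis. It uses the free-particle expression p^\mu = m\gamma(c, \mathbf v) and the metric signature (-, +, +, +) from the derivation above. The function names `four_momentum`, `minkowski_norm` and `symmetric_decay`, the SI units and the sample masses are assumptions made for this example only. The daughter speed is fixed by requiring that each daughter carries half the parent's energy, so the sketch is a consistency check rather than a derivation.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def four_momentum(m, v):
    """Contravariant p^mu = m*gamma*(c, vx, vy, vz) for mass m (kg), velocity v (m/s)."""
    speed_sq = sum(vi * vi for vi in v)
    gamma = 1.0 / math.sqrt(1.0 - speed_sq / C**2)
    return (m * gamma * C, *(m * gamma * vi for vi in v))


def minkowski_norm(p):
    """eta_{mu nu} p^mu p^nu with signature (-, +, +, +); equals -(m c)^2 on shell."""
    return -p[0] ** 2 + sum(pi * pi for pi in p[1:])


def symmetric_decay(M, m):
    """Four-momenta of two daughters (mass m each) from a parent of mass M at rest.

    Hypothetical setup: the daughters move along +x and -x. Each carries
    energy M c^2 / 2, so E = m*gamma*c^2 gives gamma = M / (2 m).
    """
    if M <= 2.0 * m:
        raise ValueError("decay requires M > 2 m")
    gamma = M / (2.0 * m)
    v = C * math.sqrt(1.0 - 1.0 / gamma**2)
    return four_momentum(m, (v, 0.0, 0.0)), four_momentum(m, (-v, 0.0, 0.0))


if __name__ == "__main__":
    M, m = 1.0, 0.3  # illustrative masses in kg
    parent = four_momentum(M, (0.0, 0.0, 0.0))
    d1, d2 = symmetric_decay(M, m)
    total = tuple(a + b for a, b in zip(d1, d2))
    print("parent four-momentum:", parent)
    print("sum of daughters:    ", total)
    print("daughter norm / (m c)^2:", minkowski_norm(d1) / (m * C) ** 2)
```

The time components add up to Mc and the spatial components cancel, so the total four-momentum after the decay matches that of the parent component by component. Each daughter separately satisfies the norm condition with its own mass m.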