=== Vector-valued functions ===
A vector-valued function \mathbf{y} of a real variable sends real numbers to vectors in some vector space \R^n. A vector-valued function can be split up into its coordinate functions, meaning that {{tmath|1= \mathbf{y} = (y_1(t), y_2(t), \dots, y_n(t)) }}. This includes, for example, parametric curves in \R^2 or \R^3. The coordinate functions are real-valued functions, so the above definition of derivative applies to them. The derivative of \mathbf{y}(t) is defined to be the
vector, called the
tangent vector, whose coordinates are the derivatives of the coordinate functions. That is, \mathbf{y}'(t)=\lim_{h\to 0}\frac{\mathbf{y}(t+h) - \mathbf{y}(t)}{h}, if the limit exists. The subtraction in the numerator is the subtraction of vectors, not scalars. If the derivative of \mathbf{y} exists for every value of t, then \mathbf{y}' is another vector-valued function.
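As a numerical sketch of this definition (the helix curve and the step size below are illustrative choices, not from the text), the tangent vector can be approximated by the componentwise difference quotient and compared with the exact derivative:

```python
# Approximate the tangent vector of the parametric helix
# y(t) = (cos t, sin t, t) by the difference quotient
# (y(t + h) - y(t)) / h; the exact derivative is
# y'(t) = (-sin t, cos t, 1).
import math

def y(t):
    return (math.cos(t), math.sin(t), t)

def y_prime_exact(t):
    return (-math.sin(t), math.cos(t), 1.0)

def y_prime_approx(t, h=1e-6):
    # componentwise difference quotient; the subtraction in the
    # numerator is vector subtraction, applied coordinate by coordinate
    return tuple((a - b) / h for a, b in zip(y(t + h), y(t)))

print(y_prime_approx(1.0))  # close to y_prime_exact(1.0)
```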
=== Partial derivatives ===
Functions can depend upon
more than one variable. A
partial derivative of a function of several variables is its derivative with respect to one of those variables, with the others held constant. Partial derivatives are used in
vector calculus and
differential geometry. As with ordinary derivatives, multiple notations exist: the partial derivative of a function f(x, y, \dots) with respect to the variable x is variously denoted by {{block indent | em = 1.2 | text = {{tmath|f'_x}}, {{tmath|f_x}}, {{tmath|\partial_x f}}, {{tmath| \frac{\partial}{\partial x}f }}, or {{tmath| \frac{\partial f}{\partial x} }},}} among other possibilities. It can be thought of as the rate of change of the function in the x-direction. Here
∂ is a rounded
d called the
partial derivative symbol. To distinguish it from the letter
d, ∂ is sometimes pronounced "der", "del", or "partial" instead of "dee". For example, let {{tmath|1= f(x, y) = x^2 + xy + y^2 }}; then the partial derivatives of f with respect to the variables x and y are, respectively: \frac{\partial f}{\partial x} = 2x + y, \qquad \frac{\partial f}{\partial y} = x + 2y. In general, the partial derivative of a function f(x_1, \dots, x_n) in the direction x_i at the point (a_1, \dots, a_n) is defined to be: \frac{\partial f}{\partial x_i}(a_1,\ldots,a_n) = \lim_{h \to 0}\frac{f(a_1,\ldots,a_i+h,\ldots,a_n) - f(a_1,\ldots,a_i,\ldots,a_n)}{h}. This is fundamental for the study of the
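The limit definition above can be checked numerically. A minimal sketch for f(x, y) = x^2 + xy + y^2 (step size is an illustrative choice): each partial derivative perturbs one variable while holding the other constant, and the results approach 2x + y and x + 2y.

```python
# Partial derivatives of f(x, y) = x^2 + xy + y^2 from the limit
# definition, holding the other variable constant.
def f(x, y):
    return x**2 + x*y + y**2

def df_dx(x, y, h=1e-6):
    return (f(x + h, y) - f(x, y)) / h   # y held constant

def df_dy(x, y, h=1e-6):
    return (f(x, y + h) - f(x, y)) / h   # x held constant

print(df_dx(1.0, 2.0))  # approximately 2*1 + 2 = 4
print(df_dy(1.0, 2.0))  # approximately 1 + 2*2 = 5
```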
functions of several real variables. Let f(x_1, \dots, x_n) be such a real-valued function. If all partial derivatives of f with respect to the x_j are defined at the point (a_1, \dots, a_n), these partial derivatives define the vector \nabla f(a_1, \ldots, a_n) = \left(\frac{\partial f}{\partial x_1}(a_1, \ldots, a_n), \ldots, \frac{\partial f}{\partial x_n}(a_1, \ldots, a_n)\right), which is called the gradient of f at (a_1, \dots, a_n). If f is differentiable at every point in some domain, then the gradient is a vector-valued function \nabla f that maps the point (a_1, \dots, a_n) to the vector \nabla f(a_1, \dots, a_n). Consequently, the gradient determines a
vector field.
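The gradient as a vector of partial derivatives can be sketched numerically. This toy example (the function and step size are illustrative choices) collects one forward-difference quotient per coordinate:

```python
# Approximate the gradient of f(x, y) = x^2 + xy + y^2 by collecting
# a forward-difference partial derivative for each coordinate.
def f(point):
    x, y = point
    return x**2 + x*y + y**2

def gradient(f, point, h=1e-6):
    grad = []
    for i in range(len(point)):
        shifted = list(point)
        shifted[i] += h              # perturb only the i-th coordinate
        grad.append((f(shifted) - f(point)) / h)
    return tuple(grad)

# Exact gradient is (2x + y, x + 2y), so at (1, 2) it is (4, 5).
print(gradient(f, (1.0, 2.0)))
```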
=== Directional derivatives ===
If f is a real-valued function on \R^n, then the partial derivatives of f measure its variation in the direction of the coordinate axes. For example, if f is a function of x and y, then its partial derivatives measure the variation in f in the x and y directions. However, they do not directly measure the variation of f in any other direction, such as along the diagonal line y = x. These are measured using directional derivatives. Given a vector {{tmath|1= \mathbf{v} = (v_1,\ldots,v_n) }}, then the
directional derivative of f in the direction of \mathbf{v} at the point \mathbf{x} is: D_{\mathbf{v}}{f}(\mathbf{x}) = \lim_{h \rightarrow 0}{\frac{f(\mathbf{x} + h\mathbf{v}) - f(\mathbf{x})}{h}}. If {{tmath|1= \mathbf{v} = \lambda\mathbf{u} }} for a unit vector \mathbf{u} and a scalar \lambda > 0, then substituting h = k/\lambda into the difference quotient gives \frac{f(\mathbf{x} + (k/\lambda)(\lambda\mathbf{u})) - f(\mathbf{x})}{k/\lambda} = \lambda\cdot\frac{f(\mathbf{x} + k\mathbf{u}) - f(\mathbf{x})}{k}. This is \lambda times the difference quotient for the directional derivative of f with respect to \mathbf{u}. Furthermore, taking the limit as h tends to zero is the same as taking the limit as k tends to zero because h and k are multiples of each other. Therefore, D_{\mathbf{v}}{f}(\mathbf{x}) = \lambda D_{\mathbf{u}}{f}(\mathbf{x}). Because of this rescaling property, directional derivatives are frequently considered only for unit vectors. If all the partial derivatives of f exist and are continuous at {{tmath|1= \mathbf{x} }}, then they determine the directional derivative of f in the direction \mathbf{v} by the formula: D_{\mathbf{v}}{f}(\mathbf{x}) = \sum_{j=1}^n v_j \frac{\partial f}{\partial x_j}.
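The agreement between the limit definition and the partial-derivative formula can be illustrated numerically. In this sketch (the function, point, and direction are illustrative choices), both computations approximate the same value:

```python
# Directional derivative of f(x, y) = x^2 + xy + y^2 at x = (1, 2)
# in the direction v = (3, 4), computed two ways:
# (1) directly from the limit definition, and
# (2) as the sum v_j * (df/dx_j) over the coordinates.
def f(point):
    x, y = point
    return x**2 + x*y + y**2

def directional_limit(f, x, v, h=1e-6):
    moved = [xi + h * vi for xi, vi in zip(x, v)]
    return (f(moved) - f(x)) / h

def directional_from_partials(f, x, v, h=1e-6):
    total = 0.0
    for j in range(len(x)):
        shifted = list(x)
        shifted[j] += h
        total += v[j] * (f(shifted) - f(x)) / h
    return total

x, v = (1.0, 2.0), (3.0, 4.0)
# Exact value: 3*(2*1 + 2) + 4*(1 + 2*2) = 12 + 20 = 32.
print(directional_limit(f, x, v), directional_from_partials(f, x, v))
```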
=== Total derivative and Jacobian matrix ===
When f is a function from an open subset of \R^n to \R^m, then the directional derivative of f in a chosen direction is the best linear approximation to f at that point and in that direction. However, when n > 1, no single directional derivative can give a complete picture of the behavior of f. The total derivative gives a complete picture by considering all directions at once. That is, for any vector \mathbf{v} starting at {{tmath|1= \mathbf{a} }}, the linear approximation formula holds: f(\mathbf{a} + \mathbf{v}) \approx f(\mathbf{a}) + f'(\mathbf{a})\mathbf{v}. As with the single-variable derivative, f'(\mathbf{a}) is chosen so that the error in this approximation is as small as possible. The total derivative of f at \mathbf{a} is the unique linear transformation f'(\mathbf{a}) \colon \R^n \to \R^m such that \lim_{\mathbf{h}\to 0} \frac{\lVert f(\mathbf{a} + \mathbf{h}) - (f(\mathbf{a}) + f'(\mathbf{a})\mathbf{h})\rVert}{\lVert\mathbf{h}\rVert} = 0. Here \mathbf{h} is a vector in \R^n, so the norm in the denominator is the standard length on \R^n. However, f'(\mathbf{a})\mathbf{h} is a vector in \R^m, and the norm in the numerator is the standard length on \R^m. If \mathbf{v} is a vector starting at \mathbf{a}, then f'(\mathbf{a}) \mathbf{v} is called the
pushforward of \mathbf{v} by f. If the total derivative exists at {{tmath|1= \mathbf{a} }}, then all the partial derivatives and directional derivatives of f exist at {{tmath|1= \mathbf{a} }}, and for all {{tmath|1= \mathbf{v} }}, f'(\mathbf{a})\mathbf{v} is the directional derivative of f in the direction {{tmath|1= \mathbf{v} }}. If f is written using coordinate functions, so that {{tmath|1= f = (f_1, f_2, \dots, f_m) }}, then the total derivative can be expressed using the partial derivatives as a
matrix. This matrix is called the
Jacobian matrix of f at \mathbf{a}: f'(\mathbf{a}) = \operatorname{Jac}_{\mathbf{a}} = \left(\frac{\partial f_i}{\partial x_j}\right)_{ij}. The existence of the total derivative f'(\mathbf{a}) is strictly stronger than the existence of all the partial derivatives, but if the partial derivatives exist and are continuous, then the total derivative exists, is given by the Jacobian, and depends continuously on {{tmath|1= \mathbf{a} }}. The definition of the total derivative subsumes the definition of the derivative in one variable. That is, if f is a real-valued function of a real variable, then the total derivative exists if and only if the usual derivative exists. The Jacobian matrix reduces to a 1 × 1 matrix whose only entry is the derivative f'(a). This matrix satisfies the property that f(a+h) \approx f(a) + f'(a)h. Up to changing variables, this is the statement that the function x \mapsto f(a) + f'(a)(x-a) is the best linear approximation to f at a. The total derivative of a function does not give another function in the same way as the one-variable case. This is because the total derivative of a multivariable function has to record much more information than the derivative of a single-variable function. Instead, the total derivative gives a function from the
tangent bundle of the source to the tangent bundle of the target. The natural analog of second, third, and higher-order total derivatives is not a linear transformation, is not a function on the tangent bundle, and is not built by repeatedly taking the total derivative. The analog of a higher-order derivative, called a
jet, cannot be a linear transformation because higher-order derivatives reflect subtle geometric information, such as concavity, which cannot be described in terms of linear data such as vectors. It cannot be a function on the tangent bundle because the tangent bundle only has room for the base space and the directional derivatives. Because jets capture higher-order information, they take as arguments additional coordinates representing higher-order changes in direction. The space determined by these additional coordinates is called the
jet bundle. The relation between the total derivative and the partial derivatives of a function is paralleled in the relation between the k-th order jet of a function and its partial derivatives of order less than or equal to k. By repeatedly taking the total derivative, one obtains higher versions of the Fréchet derivative, specialized to \R^n. The k-th order total derivative may be interpreted as a map D^k f: \mathbb{R}^n \to L^k(\mathbb{R}^n \times \cdots \times \mathbb{R}^n, \mathbb{R}^m), which takes a point \mathbf{x} in \R^n and assigns to it an element of the space of k-linear maps from \R^n \times \cdots \times \R^n to \R^m, the "best" (in a certain precise sense) multilinear approximation to f at that point. By precomposing it with the diagonal map {{tmath| \Delta }}, {{tmath| \mathbf{x} \to (\mathbf{x}, \mathbf{x}) }}, a generalized Taylor series may be begun as \begin{align} f(\mathbf{x}) & \approx f(\mathbf{a}) + (D f)(\mathbf{x-a}) + \left(D^2 f\right)(\Delta(\mathbf{x-a})) + \cdots\\ & = f(\mathbf{a}) + (D f)(\mathbf{x - a}) + \left(D^2 f\right)(\mathbf{x - a}, \mathbf{x - a})+ \cdots\\ & = f(\mathbf{a}) + \sum_i (D f)_i (x_i-a_i) + \sum_{j, k} \left(D^2 f\right)_{j k} (x_j-a_j) (x_k-a_k) + \cdots \end{align} where f(\mathbf{a}) is identified with a constant function, x_i - a_i are the components of the vector {{tmath| \mathbf{x}- \mathbf{a} }}, and (Df)_i and (D^2 f)_{jk} are the components of Df and D^2 f as linear transformations.

== Generalizations ==