LogSumExp

The LogSumExp (LSE) function has domain \R^n, the real coordinate space, and codomain \R, the real line. It approximates the maximum \max_i x_i, with the following bounds:
:\max{\{x_1, \dots, x_n\}} \leq \mathrm{LSE}(x_1, \dots, x_n) \leq \max{\{x_1, \dots, x_n\}} + \log(n).
The first inequality is strict unless n = 1. The second inequality is strict unless all arguments are equal. (Proof: Let m = \max_i x_i. Then \exp(m) \leq \sum_{i=1}^n \exp(x_i) \leq n \exp(m). Applying the logarithm to the inequality gives the result.)

In addition, scaling the function makes the bounds tighter. For t > 0, consider the function \frac 1 t \mathrm{LSE}(tx_1, \dots, tx_n). Then
:\max{\{x_1, \dots, x_n\}} \leq \frac 1 t \mathrm{LSE}(tx_1, \dots, tx_n) \leq \max{\{x_1, \dots, x_n\}} + \frac{\log(n)}{t}.
(Proof: Replace each x_i with tx_i for some t > 0 in the inequalities above, to give
:\max{\{tx_1, \dots, tx_n\}} \leq \mathrm{LSE}(tx_1, \dots, tx_n) \leq \max{\{tx_1, \dots, tx_n\}} + \log(n)
and, since t > 0,
:t \max{\{x_1, \dots, x_n\}} \leq \mathrm{LSE}(tx_1, \dots, tx_n) \leq t \max{\{x_1, \dots, x_n\}} + \log(n).
Finally, dividing by t gives the result.)

Multiplying by a negative number instead gives a comparison to the \min function. For t > 0,
:\min{\{x_1, \dots, x_n\}} - \frac{\log(n)}{t} \leq \frac 1 {-t} \mathrm{LSE}(-tx_1, \dots, -tx_n) \leq \min{\{x_1, \dots, x_n\}}.

The LogSumExp function is convex, and is strictly increasing in each argument everywhere in its domain. It is not strictly convex, since it is affine (linear plus a constant) along the diagonal and along lines parallel to it:
:\mathrm{LSE}(x_1 + c, \dots, x_n + c) = \mathrm{LSE}(x_1, \dots, x_n) + c.
Apart from this direction, it is strictly convex (the Hessian has rank n - 1), so, for example, restricting to a hyperplane that is transverse to the diagonal results in a strictly convex function. See \mathrm{LSE}_0^+, below.

Writing \mathbf{x} = (x_1, \dots, x_n), the partial derivatives are:
:\frac{\partial}{\partial x_i}{\mathrm{LSE}(\mathbf{x})} = \frac{\exp x_i}{\sum_j \exp {x_j}},
which means the gradient of LogSumExp is the softmax function.

The convex conjugate of LogSumExp is the negative entropy.

==log-sum-exp trick for log-domain calculations==
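
The properties stated above can be checked numerically. The following is a minimal sketch in Python using only the standard library, not a reference implementation. The function names lse, softmax and scaled_lse and the sample vector are illustrative assumptions. The sketch evaluates LSE by first subtracting the maximum, which is justified by the translation identity \mathrm{LSE}(x_1 + c, \dots, x_n + c) = \mathrm{LSE}(x_1, \dots, x_n) + c with c = -\max_i x_i, so that no exponential can overflow.

```python
import math

def lse(xs):
    """LogSumExp of a non-empty sequence of finite floats (illustrative helper)."""
    m = max(xs)
    # Translation identity: LSE(x) = m + LSE(x - m). Every exponent is <= 0,
    # so math.exp cannot overflow.
    return m + math.log(sum(math.exp(x - m) for x in xs))

def softmax(xs):
    """Softmax of xs, i.e. the claimed gradient of lse at xs."""
    m = max(xs)
    weights = [math.exp(x - m) for x in xs]
    total = sum(weights)
    return [w / total for w in weights]

def scaled_lse(xs, t):
    """(1/t) * LSE(t * x) for any nonzero t; negative t approximates the minimum."""
    return lse([t * x for x in xs]) / t

xs = [1.0, 2.0, 3.5]  # arbitrary sample point, chosen only for illustration
n = len(xs)

# Basic bounds: max <= LSE <= max + log(n).
assert max(xs) <= lse(xs) <= max(xs) + math.log(n)

# Scaled bounds: the gap to max (or min) is at most log(n) / t.
for t in (1.0, 10.0, 100.0):
    assert max(xs) <= scaled_lse(xs, t) <= max(xs) + math.log(n) / t
    assert min(xs) - math.log(n) / t <= scaled_lse(xs, -t) <= min(xs)

# Affine behaviour along the diagonal: LSE(x + c) = LSE(x) + c.
c = 0.75
assert math.isclose(lse([x + c for x in xs]), lse(xs) + c)

# Gradient check: a central finite difference should match softmax.
h = 1e-6
for i, p in enumerate(softmax(xs)):
    up = xs[:i] + [xs[i] + h] + xs[i + 1:]
    down = xs[:i] + [xs[i] - h] + xs[i + 1:]
    finite_diff = (lse(up) - lse(down)) / (2 * h)
    assert math.isclose(finite_diff, p, rel_tol=1e-6, abs_tol=1e-7)
```

In floating-point arithmetic the strict inequalities can collapse to equalities for large t, because the smaller exponentials are rounded away, which is why the checks use non-strict comparisons.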