==Derivative==
The derivative of the binary entropy function may be expressed as the negative of the logit function:
: {d \over dp} \operatorname H_\text{b}(p) = - \operatorname{logit}_a(p) = -\log_a\left( \frac{p}{1-p} \right),
and its second derivative is
: {d^2 \over dp^2} \operatorname H_\text{b}(p) = - \frac{1}{p(1-p) \ln a}\, ,
where a denotes the given base of the logarithm.
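Both formulas follow from the definition \operatorname H_\text{b}(p) = -p \log_a p - (1-p) \log_a(1-p); as a brief check,
: {d \over dp} \operatorname H_\text{b}(p) = -\log_a p - \frac{1}{\ln a} + \log_a(1-p) + \frac{1}{\ln a} = \log_a\left( \frac{1-p}{p} \right) = -\log_a\left( \frac{p}{1-p} \right),
and differentiating once more gives
: {d^2 \over dp^2} \operatorname H_\text{b}(p) = -\frac{1}{\ln a}\left( \frac{1}{p} + \frac{1}{1-p} \right) = -\frac{1}{p(1-p)\ln a}.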
==Convex conjugate==
The convex conjugate (specifically, the Legendre transform) of the binary entropy (with base e) is the negative softplus function. This is because (following the definition of the Legendre transform: the derivatives are inverse functions) the derivative of negative binary entropy is the logit, whose inverse function is the logistic function, which is the derivative of softplus. Softplus can be interpreted as logistic loss, so by duality, minimizing logistic loss corresponds to maximizing entropy. This justifies the principle of maximum entropy as loss minimization.
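Concretely, a brief sketch of the computation: writing f(p) = -\operatorname H_\text{b}(p) = p \ln p + (1-p) \ln(1-p) for the negative binary entropy (all logarithms natural), its Legendre transform is
: f^*(x) = \sup_{0 < p < 1} \bigl( p x - f(p) \bigr).
Setting the derivative of the expression in the supremum to zero gives x = \ln\frac{p}{1-p}, i.e. p = \frac{e^x}{1+e^x} (the logistic function), and substituting this value back yields
: f^*(x) = \ln\left( 1 + e^x \right),
which is the softplus function.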
==Taylor series==
The Taylor series of the binary entropy function at 1/2 is
:\operatorname H_\text{b}(p) = 1 - \frac{1}{2\ln 2} \sum^{\infty}_{n=1} \frac{(1-2p)^{2n}}{n(2n-1)},
which converges to the binary entropy function for all values 0\le p\le 1.
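Truncating the series gives, for example, the quadratic approximation near p = 1/2,
:\operatorname H_\text{b}(p) = 1 - \frac{(1-2p)^2}{2\ln 2} - \frac{(1-2p)^4}{12\ln 2} - \cdots \approx 1 - \frac{2}{\ln 2}\left(p - \tfrac{1}{2}\right)^2 ,
obtained by keeping only the n = 1 term.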
==Bounds==
The following bounds hold for 0 < p < 1:
:\ln(2) \cdot \log_2(p) \cdot \log_2(1-p) \leq \operatorname H_\text{b}(p) \leq \log_2(p) \cdot \log_2(1-p)
and
:4p(1-p) \leq \operatorname H_\text{b}(p) \leq (4p(1-p))^{(1/\ln 4)} ,
where \ln denotes the natural logarithm.
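For instance, at p = 1/4, where \operatorname H_\text{b}(1/4) \approx 0.811, both pairs of bounds can be checked numerically:
:\ln(2) \cdot \log_2\tfrac{1}{4} \cdot \log_2\tfrac{3}{4} \approx 0.575 \leq 0.811 \leq \log_2\tfrac{1}{4} \cdot \log_2\tfrac{3}{4} \approx 0.830 ,
:4 \cdot \tfrac{1}{4} \cdot \tfrac{3}{4} = 0.75 \leq 0.811 \leq 0.75^{1/\ln 4} \approx 0.813 .

==See also==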