MarketSterbenz lemma
Company Profile

Sterbenz lemma

In floating-point arithmetic, the Sterbenz lemma or Sterbenz's lemma is a theorem giving conditions under which floating-point differences are computed exactly. It is named after Pat H. Sterbenz, who published a variant of it in 1974.

Proof
Let \beta be the radix of the floating-point system and p the precision. Consider several easy cases first: • If x is zero then x - y = -y, and if y is zero then x - y = x, so the result is trivial because floating-point negation is always exact. • If x = y the result is zero and thus exact. • If x then we must also have y/2 \leq x so y . In this case, x - y = -(-x - -y), so the result follows from the theorem restricted to x, y \geq 0. • If x \leq y, we can write x - y = -(y - x) with x/2 \leq y \leq 2 x, so the result follows from the theorem restricted to x \geq y. For the rest of the proof, assume 0 without loss of generality. Write x, y > 0 in terms of their positive integral significands s_x, s_y \leq \beta^p - 1 and minimal exponents e_x, e_y: \begin{align} x &= s_x \cdot \beta^{e_x - p + 1} \\ y &= s_y \cdot \beta^{e_y - p + 1} \end{align} Note that x and y may be subnormal—we do not assume s_x, s_y \geq \beta^{p - 1}. The subtraction gives: \begin{align} x - y &= s_x \cdot \beta^{e_x - p + 1} - s_y \cdot \beta^{e_y - p + 1} \\ &= s_x \beta^{e_x - e_y} \cdot \beta^{e_y - p + 1} - s_y \cdot \beta^{e_y - p + 1} \\ &= (s_x \beta^{e_x - e_y} - s_y) \cdot \beta^{e_y - p + 1}. \end{align} Let s' = s_x \beta^{e_x - e_y} - s_y. Since 0 we have: • e_y \leq e_x, so e_x - e_y \geq 0, from which we can conclude \beta^{e_x - e_y} is an integer and therefore so is s' = s_x \beta^{e_x - e_y} - s_y; and • x - y > 0, so s' > 0. Further, since x \leq 2 y, we have x - y \leq y, so that s' \cdot \beta^{e_y - p + 1} = x - y \leq y = s_y \cdot \beta^{e_y - p + 1} which implies that 0 Hence x - y = s' \cdot \beta^{e_y - p + 1}, \quad \text{for} \quad 0 so x - y is a floating-point number. Note: Even if x and y are normal, i.e., s_x, s_y \geq \beta^{p - 1}, we cannot prove that s' \geq \beta^{p - 1} and therefore cannot prove that x - y is also normal. For example, the difference of the two smallest positive normal floating-point numbers x = (\beta^{p - 1} + 1) \cdot \beta^{e_{\mathrm{min}} - p + 1} and y = \beta^{p - 1} \cdot \beta^{e_{\mathrm{min}} - p + 1} is x - y = 1 \cdot \beta^{e_{\mathrm{min}} - p + 1} which is necessarily subnormal. In floating-point number systems without subnormal numbers, such as CPUs in nonstandard flush-to-zero mode instead of the standard gradual underflow, the Sterbenz lemma does not apply. == Relation to catastrophic cancellation ==
Relation to catastrophic cancellation
The Sterbenz lemma may be contrasted with the phenomenon of catastrophic cancellation: • The Sterbenz lemma asserts that if x and y are sufficiently close floating-point numbers then their difference x - y is computed exactly by floating-point arithmetic x \ominus y = \operatorname{fl}(x - y), with no rounding needed. • The phenomenon of catastrophic cancellation is that if \tilde x and \tilde y are approximations to true numbers x and y—whether the approximations arise from prior rounding error or from series truncation or from physical uncertainty or anything else—the error of the difference \tilde x - \tilde y from the desired difference x - y is inversely proportional to x - y. Thus, the closer x and y are, the worse \tilde x - \tilde y may be as an approximation to x - y, even if the subtraction itself is computed exactly. In other words, the Sterbenz lemma shows that subtracting nearby floating-point numbers is exact, but if the numbers one has are approximations then even their exact difference may be far off from the difference of numbers one wanted to subtract. == Use in numerical analysis ==
Use in numerical analysis
The Sterbenz lemma is instrumental in proving theorems on error bounds in numerical analysis of floating-point algorithms. For example, Heron's formula A = \sqrt{s (s - a) (s - b) (s - c)} for the area of triangle with side lengths a, b, and c, where s = (a + b + c)/2 is the semi-perimeter, may give poor accuracy for long narrow triangles if evaluated directly in floating-point arithmetic. However, for a \geq b \geq c, the alternative formula A = \frac{1}{4} \sqrt{\bigl(a + (b + c)\bigr) \bigl(c - (a - b)\bigr) \bigl(c + (a - b)\bigr) \bigl(a + (b - c)\bigr)} can be proven, with the help of the Sterbenz lemma, to have low forward error for all inputs. == References ==
tickerdossier.comtickerdossier.substack.com