
Big O notation

Big O notation is a mathematical notation that describes the approximate size of a function on a domain. Big O is a member of a family of notations invented by German mathematicians Paul Bachmann and Edmund Landau and expanded by others, collectively called Bachmann–Landau notation. The letter O was chosen by Bachmann to stand for Ordnung, meaning the order of approximation.

Formal definition
Let f, the function to be estimated, be a real or complex valued function defined on a domain D, and let g, the comparison function, be a non-negative real valued function defined on the same set D. Common choices for the domain are intervals of real numbers (bounded or unbounded), the set of positive integers, the set of complex numbers, and tuples of real or complex numbers. With the domain written explicitly or understood implicitly, one writes
: f(x) = O\bigl(g(x)\bigr),
which is read as "f(x) is big O of g(x)", if there exists a positive real number M such that
: \left| f(x) \right| \le M\, g(x) \qquad \text{ for all } x \in D.
If g(x) > 0 throughout the domain D (i.e. g is never zero), an equivalent definition is that the ratio \frac{f(x)}{g(x)} is bounded, i.e. there is a positive real number M so that
: \Bigl|\frac{f(x)}{g(x)}\Bigr| \le M \qquad \text{ for all } x \in D.
These definitions encompass all the uses of big O in computer science and mathematics, whether the domain is finite or infinite, real or complex, single variate or multivariate. In most applications, one chooses the function g(x) appearing within the argument of O\bigl(\cdot\bigr) to be as simple a form as possible, omitting constant factors and lower order terms. The number M is called the implied constant because it is normally not specified: when using the notation, what matters is that some finite M exists, not its specific value. This simplifies the presentation of many analytic inequalities. For functions defined on the positive real numbers or positive integers, a more restrictive and somewhat conflicting definition is still in common use, especially in computer science. When restricted to functions which are eventually positive, the notation
: f(x) = O\bigl(g(x)\bigr) \qquad \text{ as } x \to \infty
means that for some real number a, f(x) = O\bigl(g(x)\bigr) in the domain \left[a, \infty\right).
Here, the expression x \to \infty does not indicate a limit, but rather the notion that the inequality holds for all large enough x. The expression x \to \infty is often omitted.
The Russian number theorist Ivan Matveyevich Vinogradov introduced the notation \ll, which has been increasingly used in number theory and other branches of mathematics as an alternative to the O notation. We have
: f \ll g \iff f = O\bigl(g\bigr).
Frequently both notations are used in the same work.
Set version of big O
In computer science it is common to interpret big O as also defining a set of functions. With the positive (or non-negative) function g(x) specified, one interprets O\bigl(g(x)\bigr) as the set of all functions \tilde f that satisfy \tilde f(x) = O\bigl(g(x)\bigr). One can then equivalently write f(x) \in O\bigl(g(x)\bigr), read as "the function f(x) is among the set of all functions that are O\bigl(g(x)\bigr)".
Examples with an infinite domain
In typical usage the O notation is applied to an infinite interval of real numbers [a,\infty) and captures the behavior of the function for very large x. In this setting, the contribution of the terms that grow "most quickly" will eventually make the other ones irrelevant. As a result, the following simplification rules can be applied:
• If f(x) is a sum of several terms and one of them has the largest growth rate, that term can be kept and all others omitted.
• If f(x) is a product of several factors, any constants (factors in the product that do not depend on x) can be omitted.
For example, let f(x) = 6x^4 - 2x^3 + 5, and suppose we wish to simplify this function, using O notation, to describe its growth rate for large x. This function is the sum of three terms: 6x^4, -2x^3, and 5. Of these three terms, the one with the highest growth rate is the one with the largest exponent as a function of x, namely 6x^4. Now one may apply the second rule: 6x^4 is a product of 6 and x^4, in which the first factor does not depend on x. Omitting this factor results in the simplified form x^4. Thus, we say that f(x) is a "big O" of x^4. Mathematically, we can write f(x) = O(x^4) for all x \ge 1. One may confirm this calculation using the formal definition: let f(x) = 6x^4 - 2x^3 + 5 and g(x) = x^4. Applying the formal definition from above, the statement that f(x) = O(x^4) is equivalent to its expansion, |f(x)| \le M x^4 for some suitable choice of a positive real number M and for all x \ge 1. To prove this, let M = 13. Then, for all x \ge 1:
\begin{align} |6x^4 - 2x^3 + 5| &\le 6x^4 + 2x^3 + 5 \\ &\le 6x^4 + 2x^4 + 5x^4 \\ &= 13x^4, \end{align}
so |6x^4 - 2x^3 + 5| \le 13x^4. While it is also true, by the same argument, that f(x) = O(x^{10}), this is a less precise approximation of the function f. On the other hand, the statement f(x) = O(x^3) is false, because the term 6x^4 causes f(x)/x^3 to be unbounded.
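The bound derived above can be spot-checked numerically. The following sketch (an illustration, not a proof) samples the inequality |6x^4 - 2x^3 + 5| \le 13x^4 over a range of x \ge 1:

```python
# Numerical sanity check of the bound derived above:
# |6x^4 - 2x^3 + 5| <= 13 x^4 for all x >= 1, i.e. f(x) = O(x^4) with M = 13.

def f(x):
    return 6 * x**4 - 2 * x**3 + 5

M = 13
for i in range(1, 10001):
    x = 1 + i / 100              # sample points x in [1.01, 101]
    assert abs(f(x)) <= M * x**4

# By contrast, f(x) = O(x^3) is false: f(x)/x^3 = 6x - 2 + 5/x^3 is unbounded.
print(f(10) / 10**3)  # already 58.005, and growing without bound in x
```

The sample points and step size are arbitrary choices for the demonstration; the formal proof in the text covers all x \ge 1.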
When a function T(n) describes the number of steps required in an algorithm with input n, an expression such as T(n) = O(n^2), with the implied domain being the set of positive integers, may be interpreted as saying that the algorithm has at most the order of n^2 time complexity.
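As a concrete illustration of such a step count, the following sketch (the function and its instrumentation are invented for this example) counts the comparisons made by a quadratic-time pairwise scan, so its step count T(n) equals n^2 exactly, and trivially T(n) = O(n^2) with implied constant M = 1:

```python
# Counting the steps of a simple quadratic-time algorithm: count_pairs
# compares every ordered pair (a, b), so its step count is exactly n^2.

def count_pairs(items):
    steps = 0
    equal = 0
    for a in items:
        for b in items:
            steps += 1            # one comparison per inner iteration
            if a == b:
                equal += 1
    return equal, steps

_, steps = count_pairs(list(range(50)))
print(steps)  # 2500 = 50^2, consistent with T(n) = O(n^2)
```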
Example with a finite domain
Big O can also be used to describe the error term in an approximation to a mathematical function on a finite interval. The most significant terms are written explicitly, and then the least-significant terms are summarized in a single big O term. Consider, for example, the exponential series and two expressions of it that are valid when x is small:
\begin{align} e^x &= 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \dotsb && \text{ for all finite } x \\[4pt] &= 1 + x + \frac{x^2}{2} + O(|x|^3) && \text{ for all } |x| \le 1 \\[4pt] &= 1 + x + O(x^2) && \text{ for all } |x| \le 1. \end{align}
The middle expression (the line with O(|x|^3)) means that the absolute value of the error e^x - \bigl(1 + x + \frac{x^2}{2}\bigr) is at most some constant times |x|^3 when x is small. This is an example of the use of Taylor's theorem. The behavior of a given function may be very different on finite domains than on infinite domains; for example,
(x+1)^8 = x^8 + O(x^7) \quad \text{ for } x \ge 1,
while
(x+1)^8 = 1 + 8x + O(x^2) \quad \text{ for } |x| \le 1.
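The error bound in the middle expression can be checked numerically. Taylor's theorem gives |e^x - (1 + x + x^2/2)| \le e^{|x|}|x|^3/6 on |x| \le 1, so M = 0.5 is a valid implied constant (e/6 \approx 0.453); the sketch below samples the interval:

```python
import math

# Numerical illustration of e^x = 1 + x + x^2/2 + O(|x|^3) for |x| <= 1,
# using the implied constant M = 0.5 (valid since e/6 ~ 0.453).

M = 0.5
for i in range(-100, 101):
    x = i / 100
    error = abs(math.exp(x) - (1 + x + x**2 / 2))
    assert error <= M * abs(x)**3
```

The worst case on this interval is at x = 1, where the error is e - 2.5 \approx 0.218, comfortably below 0.5.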
Multivariate examples
x \sin y = O(x) \quad \text{ for } x \ge 1,\ y \text{ any real number}
3a^2 + 7ab + 2b^2 + a + 3b + 14 \ll a^2 + b^2 \ll a^2 \quad \text{ for all } a \ge b \ge 1
\frac{xy}{x^2 + y^2} = O(1) \quad \text{ for all real } x, y \text{ that are not both } 0
x^{it} = O(1) \quad \text{ for } x \ne 0,\ t \in \mathbb{R}.
In the last example we have a complex-valued function of two variables. In general, any bounded function is O(1).
(x+y)^{10} = O(x^{10}) \quad \text{ for } x \ge 1,\ -2 \le y \le 2.
The last example illustrates a mixing of finite and infinite domains on the different variables. In all of these examples, the bound is uniform in both variables. Sometimes in a multivariate expression one variable is more important than others, and one may express that the implied constant M depends on one or more of the variables using subscripts on the big O symbol or the \ll symbol. For example, consider the expression
(1 + x)^b = 1 + O_b(x) \quad \text{ for } 0 \le x \le 1,\ b \text{ any real number.}
This means that for each real number b there is a constant M_b, which depends on b, so that for all 0 \le x \le 1,
|(1+x)^b - 1| \le M_b \cdot x.
This particular statement follows from the general binomial theorem. Another example, common in the theory of Taylor series, is
e^x = 1 + x + O_r(x^2) \quad \text{ for all } |x| \le r,\ r \text{ being any real number.}
Here the implied constant depends on the size of the domain. The subscript convention applies to all of the other notations on this page.
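The third example above has an explicit uniform constant: by the AM-GM inequality, |xy| \le (x^2 + y^2)/2, so M = 1/2 works for all (x, y) \ne (0, 0). A random spot-check (an illustration only; the sampling range is arbitrary):

```python
import random

# Spot-check that xy / (x^2 + y^2) = O(1) uniformly in x and y:
# AM-GM gives |xy| <= (x^2 + y^2) / 2, so the ratio never exceeds 1/2.

random.seed(0)
for _ in range(10000):
    x = random.uniform(-1e6, 1e6)
    y = random.uniform(-1e6, 1e6)
    if x == 0 and y == 0:
        continue                      # the origin is excluded from the domain
    assert abs(x * y / (x**2 + y**2)) <= 0.5 + 1e-12
```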
Properties
Product
: f_1 = O(g_1) \text{ and } f_2 = O(g_2) \Rightarrow f_1 f_2 = O(g_1 g_2)
: f \cdot O(g) = O(|f| g)
Sum
If f_1 = O(g_1) and f_2 = O(g_2) then f_1 + f_2 = O(\max(g_1, g_2)). It follows that if f_1 = O(g) and f_2 = O(g) then f_1 + f_2 = O(g).
Multiplication by a constant
Let k be a nonzero constant. Then O(|k| \cdot g) = O(g). In other words, if f = O(g), then k \cdot f = O(g).
Transitive property
If f = O(g) and g = O(h) then f = O(h).
If the function f of a positive integer n can be written as a finite sum of other functions, then the fastest growing one determines the order of f(n). For example,
: f(n) = 9 \log n + 5 (\log n)^4 + 3n^2 + 2n^3 = O(n^3) \qquad \text{for } n \ge 1.
Some general rules about growth toward infinity follow; the second and third properties below can be proved rigorously using L'Hôpital's rule.
Large powers dominate small powers
If b \ge a, then n^a = O(n^b) as n \to \infty.
Powers dominate logarithms
For any positive a, b, (\log n)^a = O_{a,b}(n^b), no matter how large a is and how small b is. Here, the implied constant depends on both a and b.
Exponentials dominate powers
For any positive a, b, n^a = O_{a,b}(e^{bn}), no matter how large a is and how small b is.
A function that grows faster than n^c for every c is called superpolynomial. One that grows more slowly than any exponential function of the form c^n with c > 1 is called subexponential. An algorithm can require time that is both superpolynomial and subexponential; examples of this include the fastest known algorithms for integer factorization and the function n^{\log n}.
We may ignore any powers of n inside the logarithms. For any positive c, the notation O(\log n) means exactly the same thing as O(\log(n^c)), since \log(n^c) = c \log n. Similarly, logarithms with different constant bases are equivalent with respect to big O notation. On the other hand, exponentials with different bases are not of the same order: for example, 2^n and 3^n are not of the same order.
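The domination rules can be illustrated numerically. In the sketch below the exponents 3, 0.5, and 0.1 are arbitrary choices for the demonstration; the exponential comparison is done in log space to avoid floating-point overflow:

```python
import math

# Powers dominate logarithms: (log n)^3 = O(n^{1/2}); already at n = 10^12
# the left side (~2.1e4) is far below the right side (10^6).
n = 10**12
assert math.log(n)**3 < n**0.5

# Exponentials dominate powers: n^10 = O(e^{0.1 n}); compare logarithms,
# since log(n^10) = 10 log n and log(e^{0.1 n}) = 0.1 n.
n = 10**4
assert 10 * math.log(n) < 0.1 * n

# Logarithm bases are interchangeable inside O: log2(n) = (1/ln 2) * ln(n).
assert abs(math.log2(n) - math.log(n) / math.log(2)) < 1e-9
```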
More complicated expressions
In more complicated usage, O(\cdot) can appear in different places in an equation, even several times on each side. For example, the following are true for n a positive integer:
\begin{align} (n+1)^2 &= n^2 + O(n), \\ (n + O(n^{1/2})) \cdot (n + O(\log n))^2 &= n^3 + O(n^{5/2}), \\ n^{O(1)} &= O(e^n). \end{align}
The meaning of such statements is as follows: for any functions which satisfy each O(\cdot) on the left side, there are some functions satisfying each O(\cdot) on the right side, such that substituting all these functions into the equation makes the two sides equal. For example, the third equation above means: "For any function satisfying f(n) = O(1), there is some function g(n) = O(e^n) such that n^{f(n)} = g(n)." The implied constant in the statement "g(n) = O(e^n)" may depend on the implied constant in the expression "f(n) = O(1)". Some further examples:
\begin{align} f = O(g) \; &\Rightarrow \; \int_a^b f = O\biggl( \int_a^b g \biggr) \\ f(x) = g(x) + O(1) \; &\Rightarrow \; e^{f(x)} = O(e^{g(x)}) \\ (1 + O(1/x))^{O(x)} &= O(1) \quad \text{ for } x > 0 \\ \sin x &= O(|x|) \quad \text{ for all real } x. \end{align}
Vinogradov's ≫ and Knuth's big Ω
When f and g are both positive functions, Vinogradov's notation g \gg f means the same as f \ll g, that is, f = O(g). Knuth's notation f = \Omega(g), defined in this situation to mean g = O(f), enjoys widespread use today in computer science and combinatorics.
Hardy's ≍ and Knuth's big Θ
In analytic number theory, the notation f(x) \asymp g(x) means both f(x) = O(g(x)) and g(x) = O(f(x)). This notation is originally due to Hardy. Knuth's notation for the same notion is f(x) = \Theta(g(x)). Roughly speaking, these statements assert that f(x) and g(x) have the same order. These notations mean that there are positive constants M, N so that N g(x) \le f(x) \le M g(x) for all x in the common domain of f and g.
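The first equation above, (n+1)^2 = n^2 + O(n), can be made concrete: the difference (n+1)^2 - n^2 = 2n + 1 is bounded by 3n for every n \ge 1, so the implied constant M = 3 works. A quick check:

```python
# Check of (n+1)^2 = n^2 + O(n): the difference is 2n + 1, which is
# at most 3n for all n >= 1 (with equality at n = 1).

for n in range(1, 10001):
    diff = (n + 1)**2 - n**2
    assert diff == 2 * n + 1
    assert diff <= 3 * n
```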
When the functions are defined on the positive integers or positive real numbers, as with big O, writers often interpret statements f(x) = \Omega(g(x)) and f(x) = \Theta(g(x)) as holding for all sufficiently large x, that is, for all x beyond some point x_0. Sometimes this is indicated by appending x \to \infty to the statement. For example, 2n^2 - 10n = \Theta(n^2) is true for the domain n \ge 6 but false if the domain is all positive integers, since the function is zero at n = 5.
Further examples
n^3 + 20n^2 + n + 12 \asymp n^3 \quad \text{ for all } n \ge 1.
(1+x)^8 = x^8 + \Theta(x^7) \quad \text{ for all } x \ge 1.
The notation f(n) = e^{\Omega(n)} \text{ for all } n \ge 1 means that there is a positive constant M so that f(n) \ge e^{Mn} for all n \ge 1. By contrast, f(n) = e^{-O(n)} \text{ for all } n \ge 1 means that there is a positive constant M so that f(n) \ge e^{-Mn} for all n \ge 1, and f(n) = e^{\Theta(n)} \text{ for all } n \ge 1 means that there are positive constants M, N so that e^{Mn} \le f(n) \le e^{Nn} for all n \ge 1. For any domain D,
f(x) = g(x) + O(1) \Longleftrightarrow e^{f(x)} \asymp e^{g(x)},
each statement holding for all x in D.
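The role of the domain in the example 2n^2 - 10n = \Theta(n^2) can be verified directly. On n \ge 6 the ratio f(n)/n^2 = 2 - 10/n lies in [1/3, 2), so the constants N = 1/3 and M = 2 work; at n = 5 the function is zero, so no positive lower-bound constant exists on all positive integers:

```python
# Check of 2n^2 - 10n = Theta(n^2) on the domain n >= 6, with N = 1/3, M = 2.

def f(n):
    return 2 * n**2 - 10 * n

for n in range(6, 10001):
    assert (1 / 3) * n**2 <= f(n) <= 2 * n**2

assert f(5) == 0  # why the Theta statement fails on all positive integers
```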
Orders of common functions
Here is a list of classes of functions that are commonly encountered when analyzing the running time of an algorithm. In each case, c is a positive constant and n increases without bound. The slower-growing functions are listed first:
• O(1) — constant
• O(\log n) — logarithmic
• O(n) — linear
• O(n \log n) — linearithmic
• O(n^2) — quadratic
• O(n^c) — polynomial
• O(c^n), c > 1 — exponential
• O(n!) — factorial
The statement f(n) = O(n!) is sometimes weakened to f(n) = O\left(n^n\right) to derive simpler formulas for asymptotic complexity. In many of these examples, the running time is actually \Theta(g(n)), which conveys more precision.
Little-o notation
For real or complex valued functions of a real variable x with g(x) > 0 for sufficiently large x, one writes
: f(x) = o(g(x)) \quad \text{ as } x \to \infty,
which is defined as
: \lim_{x \to \infty} \frac{f(x)}{g(x)} = 0.
The Hardy–Littlewood symbol \Omega, introduced in 1914, is defined as the negation of little-o:
: f(x) = \Omega(g(x)) \quad \text{ as } x \to \infty \quad \text{ if } \quad \limsup_{x \to \infty} \left|\frac{f(x)}{g(x)}\right| > 0.
In 1916 the same authors introduced the two new symbols \Omega_R and \Omega_L, defined as:
: f(x) = \Omega_R(g(x)) \quad \text{ as } x \to \infty \quad \text{ if } \quad \limsup_{x \to \infty} \frac{f(x)}{g(x)} > 0;
: f(x) = \Omega_L(g(x)) \quad \text{ as } x \to \infty \quad \text{ if } \quad \liminf_{x \to \infty} \frac{f(x)}{g(x)} < 0.
These symbols were used by E. Landau, with the same meanings, in 1924. Authors that followed Landau, however, use the notations \Omega_+ and \Omega_- for the same definitions. The small omega \omega notation is not used as often in analysis or in number theory.
Quality of approximations using different notations
Informally, especially in computer science, the big O notation is often used to describe an asymptotically tight bound in contexts where the big Theta \Theta notation would be more factually appropriate. For example, when considering a function T(n) = 73n^3 + 22n^2 + 58, all of the following are generally acceptable, but tighter bounds (such as numbers 2, 3 and 4 below) are usually strongly preferred over looser bounds (such as number 1 below).
• T(n) = O(n^{100})
• T(n) = O(n^3)
• T(n) = \Theta(n^3)
• T(n) \sim 73n^3 as n \to \infty.
While all four statements are true, progressively more information is contained in each. In some fields, however, the big O notation (number 2 in the list above) would be used more commonly than the big Theta notation (number 3 in the list above). For example, if T(n) represents the running time of a newly developed algorithm for input size n, the inventors and users of the algorithm might be more inclined to put an upper bound on how long it will take to run without making an explicit statement about the lower bound or asymptotic behavior.
Extensions to the Bachmann–Landau notations
Another notation sometimes used in computer science is \tilde{O} (read soft-O), which hides polylogarithmic factors. There are two definitions in use: some authors use f(n) = \tilde{O}(g(n)) as shorthand for f(n) = O(g(n) \log^k n) for some k, while others use it as shorthand for f(n) = O(g(n) \log^k g(n)). When g(n) is polynomial in n, there is no difference; however, the latter definition allows one to say, e.g., that n 2^n = \tilde{O}(2^n), while the former definition allows for \log^k n = \tilde{O}(1) for any constant k. Some authors write O^* for the same purpose as the latter definition. Essentially, soft-O is a less precise version of the big O notation that ignores logarithmic factors in the growth rate of the function. Since \log^k n = o(n^\varepsilon) for any constant k and any \varepsilon > 0, logarithmic factors are far less significant than powers of n, and even more insignificant compared with exponentials. Also, the L notation, defined as
: L_n[\alpha, c] = e^{(c + o(1))(\ln n)^\alpha (\ln\ln n)^{1-\alpha}},
is convenient for functions that are between polynomial and exponential in terms of \log n.
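Returning to the example T(n) = 73n^3 + 22n^2 + 58 above, the hierarchy of bounds can be seen numerically: the ratio T(n)/n^3 approaches 73, which is the content of the tightest statement T(n) \sim 73n^3 and implies T(n) = \Theta(n^3):

```python
# Illustration for T(n) = 73n^3 + 22n^2 + 58: T(n)/n^3 -> 73 as n grows,
# so T(n) ~ 73 n^3, a stronger statement than T(n) = Theta(n^3) or O(n^3).

def T(n):
    return 73 * n**3 + 22 * n**2 + 58

for n in (10, 100, 1000, 10**6):
    print(n, T(n) / n**3)         # ratio decreasing toward 73

assert abs(T(10**6) / 10**18 - 73) < 1e-4
```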
Generalizations and related usages
The generalization to functions taking values in any normed vector space is straightforward (replacing absolute values by norms), where f and g need not take their values in the same space. A generalization to functions g taking values in any topological group is also possible. The "limiting process" x \to x_0 can also be generalized by introducing an arbitrary filter base, i.e. to directed nets f and g. The o notation can be used to define derivatives and differentiability in quite general spaces, and also (asymptotic) equivalence of functions,
: f \sim g \iff (f - g) \in o(g),
which is an equivalence relation and a more restrictive notion than the relationship "f is \Theta(g)" from above. (It reduces to \lim f/g = 1 if f and g are positive real valued functions.) For example, 2x = \Theta(x), but 2x \not\sim x, since 2x - x = x \ne o(x).
History
In 1870, Paul du Bois-Reymond defined f(x) \succ \phi(x), f(x) \sim \phi(x) and f(x) \prec \phi(x) to mean, respectively,
\lim_{x\to\infty}\frac{f(x)}{\phi(x)} = \infty, \quad \lim_{x\to\infty}\frac{f(x)}{\phi(x)} > 0, \quad \lim_{x\to\infty}\frac{f(x)}{\phi(x)} = 0.
The first and third are converses of each other: f(x) \prec \phi(x) means the same as \phi(x) \succ f(x). These notations were not widely adopted and are not used today, although Landau later adopted \sim with the narrower definition that the limit of f(x)/\phi(x) equals 1.
The symbol O was first introduced by the number theorist Paul Bachmann in 1894, in the second volume of his book Analytische Zahlentheorie ("analytic number theory"). The number theorist Edmund Landau adopted it, and was thus inspired to introduce in 1909 the notation o. The symbol \Omega (in the sense "is not an o of") was introduced in 1914 by Hardy and Littlewood. Hardy introduced the symbol \preccurlyeq and advocated for du Bois-Reymond's \prec in his 1910 tract "Orders of Infinity". Hardy also proposed a barred variant of \asymp, where f \asymp g (with the bar) means that f \sim Kg for some constant K \ne 0; this corresponds to du Bois-Reymond's notation f \sim g. Hardy's symbols \preccurlyeq and the barred \asymp are not used any more. In the 1930s, Vinogradov popularized the notation f(x) \ll g(x) and g(x) \gg f(x), both of which mean f(x) = O(g(x)). This notation became standard in analytic number theory. In the 1970s, big O was popularized in computer science by Donald Knuth, who proposed the notation f(x) = \Theta(g(x)) for Hardy's f(x) \asymp g(x), and proposed a different definition for the Hardy and Littlewood Omega notation.
Matters of notation
Arrows
In mathematics, an expression such as x \to \infty indicates the presence of a limit. In big O notation and the related notations \Omega, \Theta, \gg, \ll, \asymp, there is no implied limit, in contrast with the little-o, \sim and \omega notations. Notation such as f(x) = O(g(x)) \;\; (x \to \infty) can therefore be considered an abuse of notation.
Equals sign
Some consider f(x) = O(g(x)) to also be an abuse of notation, since the use of the equals sign could be misleading: it suggests a symmetry that this statement does not have. As de Bruijn says, O(x) = O(x^2) is true but O(x^2) = O(x) is not. Knuth describes such statements as "one-way equalities", since if the sides could be reversed, "we could deduce ridiculous things like n = n^2 from the identities n = O(n^2) and n^2 = O(n^2)". For these reasons, some advocate for using set notation and write f(x) \in O(g(x)), read as "f(x) is an element of O(g(x))" or "f(x) is in the set O(g(x))", thinking of O(g(x)) as the class of all functions h(x) such that h(x) = O(g(x)).
In TeX, the symbol is produced by simply typing 'O' inside math mode. Unlike the Greek-named Bachmann–Landau notations, it needs no special symbol, though some authors use the calligraphic variant \mathcal{O} instead. The big O originally stood for "order of" ("Ordnung", Bachmann 1894) and is thus a Latin letter; neither Bachmann nor Landau ever called it "omicron". The symbol was much later (1976) viewed by Knuth as a capital omicron, probably in reference to his definition of the symbol Omega. The digit zero should not be used.