) enumerates all
strings s, ordered by length; the vertical axis (
linear scale) measures Kolmogorov complexity in
bits. Most strings are incompressible, i.e. their Kolmogorov complexity exceeds their length by a constant amount. 9 compressible strings are shown in the picture, appearing as almost vertical slopes. Due to Chaitin's incompleteness theorem (1974), the output of any program computing a lower bound of the Kolmogorov complexity cannot exceed some fixed limit, which is independent of the input string
s. By the above theorem (), most strings are complex in the sense that they cannot be described in any significantly "compressed" way. However, it turns out that the fact that a specific string is complex cannot be formally proven, if the complexity of the string is above a certain threshold. The precise formalization is as follows. First, fix a particular
axiomatic system S for the
natural numbers. The axiomatic system has to be powerful enough so that, to certain assertions
A about complexity of strings, one can associate a formula
FA in
S. This association must have the following property: If
FA is provable from the axioms of
S, then the corresponding assertion
A must be true. This "formalization" can be achieved based on a
Gödel numbering.
Theorem: There exists a constant
L (which only depends on
S and on the choice of description language) such that there does not exist a string
s for which the statement can be proven within
S.
Proof Idea: The proof of this result is modeled on a self-referential construction used in
Berry's paradox. We initially obtain a program which enumerates the proofs within
S and we specify a procedure
P which takes as an input an integer
L and prints the strings
x which are within proofs within
S of the statement
K(
x) ≥
L. By then setting
L to greater than the length of this procedure
P, we have that the required length of a program to print
x as stated in
K(
x) ≥
L as being at least
L is then less than the amount
L since the string
x was printed by the procedure
P. This is a contradiction. So it is not possible for the proof system
S to prove
K(
x) ≥
L for
L arbitrarily large, in particular, for
L larger than the length of the procedure
P, (which is finite).
Proof: We can find an effective enumeration of all the formal proofs in
S by some procedure
function NthProof(
int n) which takes as input
n and outputs some proof. This function enumerates all proofs. Some of these are proofs for formulas we do not care about here, since every possible proof in the language of
S is produced for some
n. Some of these are complexity formulas of the form
K(
s) ≥
n where
s and
n are constants in the language of
S. There is a procedure
function NthProofProvesComplexityFormula(
int n) which determines whether the
nth proof actually proves a complexity formula
K(
s) ≥
L. The strings
s, and the integer
L in turn, are computable by procedure:
function StringNthProof(
int n)
function ComplexityLowerBoundNthProof(
int n) Consider the following procedure:
function GenerateProvablyComplexString(
int n)
for i = 1 to infinity:
if NthProofProvesComplexityFormula(i)
and ComplexityLowerBoundNthProof(i) ≥
n return StringNthProof(
i) Given an
n, this procedure tries every proof until it finds a string and a proof in the formal system
S of the formula
K(
s) ≥
L for some
L ≥
n; if no such proof exists, it loops forever. Finally, consider the program consisting of all these procedure definitions, and a main call: GenerateProvablyComplexString(
n0) where the constant
n0 will be determined later on. The overall program length can be expressed as
U+log2(
n0), where
U is some constant and log2(
n0) represents the length of the integer value
n0, under the reasonable assumption that it is encoded in binary digits. We will choose
n0 to be greater than the program length, that is, such that
n0 >
U+log2(
n0). This is clearly true for
n0 sufficiently large, because the left hand side grows linearly in
n0 whilst the right hand side grows logarithmically in
n0 up to the fixed constant
U. Then no proof of the form "
K(
s)≥
L" with
L≥
n0 can be obtained in
S, as can be seen by an
indirect argument: If ComplexityLowerBoundNthProof(i) could return a value ≥
n0, then the loop inside GenerateProvablyComplexString would eventually terminate, and that procedure would return a string
s such that This is a contradiction,
Q.E.D. As a consequence, the above program, with the chosen value of
n0, must loop forever. Similar ideas are used to prove the properties of
Chaitin's constant. ==Minimum message length==