String’s Depth of Burial


A string’s depth might be defined as the execution time of its minimal program.

The difficulty with this definition arises in cases where the minimal program is only a few bits smaller than some much faster program, such as a print program, to compute the same output x. In this case, slight changes in x may induce arbitrarily large changes in the run time of the minimal program, by changing which of the two competing programs is minimal. Analogous instability manifests itself in translating programs from one universal machine to another. This instability emphasizes the essential role of the quantity of buried redundancy, not as a measure of depth, but as a certifier of depth. In terms of the philosophy-of-science metaphor, an object whose minimal program is only a few bits smaller than its print program is like an observation that points to a nontrivial hypothesis, but with only a low level of statistical confidence.

To adequately characterize a finite string’s depth one must therefore consider the amount of buried redundancy as well as the depth of its burial. A string’s depth at significance level s might thus be defined as that amount of time complexity which is attested by s bits worth of buried redundancy. This characterization of depth may be formalized in several ways.

A string’s depth at significance level s might be defined as the time required to compute the string by a program no more than s bits larger than the minimal program.

This definition solves the stability problem, but is unsatisfactory in the way it treats multiple programs of the same length. Intuitively, 2^k distinct (n + k)-bit programs that compute the same output ought to be accorded the same weight as one n-bit program; but, by the present definition, they would be given no more weight than one (n + k)-bit program.
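To make the bookkeeping behind this intuition explicit (a standard observation, not spelled out in the text): under the uniform measure on programs, each (n + k)-bit program carries weight 2^{−(n+k)}, so 2^k of them together carry

2^k · 2^{−(n+k)} = 2^{−n}

which is exactly the weight of a single n-bit program.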

A string’s depth at significance level s might be defined as the time t required for the string’s time-bounded algorithmic probability P_t(x) to rise to within a factor 2^{−s} of its asymptotic time-unbounded value P(x).

This formalizes the notion that for the string to have originated by an effective process of t steps or fewer is less plausible than for the first s tosses of a fair coin all to come up heads.
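Here P_t(x) and P(x) can be read in the standard way (an assumption, since the excerpt does not restate them): P_t(x) = Σ_{p : U(p)=x in ≤ t steps} 2^{−|p|}, the total weight of self-delimiting programs computing x within t steps, and P(x) is the same sum without the time bound, so P_t(x) rises monotonically toward P(x) as t grows.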

It is not known whether there exist strings that are deep in the deterministic sense, that is, strings having no small fast programs, even though they have enough large fast programs to contribute a significant fraction of their algorithmic probability. Such strings might be called deterministically deep but probabilistically shallow, because their chance of being produced quickly in a probabilistic computation (e.g. one where the input bits of U are supplied by coin tossing) is significant compared to their chance of being produced slowly. The question of whether such strings exist is probably hard to answer because it does not relativize uniformly. Deterministic and probabilistic depths are not very different relative to a random coin-toss oracle A, because of the equality of random-oracle-relativized deterministic and probabilistic polynomial-time complexity classes; but they can be very different relative to an oracle B deliberately designed to hide information from deterministic computations (this parallels Hunt’s proof that deterministic and probabilistic polynomial time are unequal relative to such an oracle).

(Depth of Finite Strings): Let x and w be strings and s a significance parameter. A string’s depth at significance level s, denoted D_s(x), will be defined as min{T(p) : (|p| − |p∗| < s) ∧ (U(p) = x)}, the least time required to compute it by an s-incompressible program. At any given significance level, a string will be called t-deep if its depth exceeds t, and t-shallow otherwise.

The difference between this definition and the previous one is rather subtle philosophically and not very great quantitatively. Philosophically, the present definition requires that each individual hypothesis for the rapid origin of x be implausible at the 2^{−s} confidence level, while the previous one requires only that a weighted average of all such hypotheses be implausible.

There exist constants c_1 and c_2 such that for any string x, if programs running in time ≤ t contribute a fraction between 2^{−s} and 2^{−s+1} of the string’s total algorithmic probability, then x has depth at most t at significance level s + c_1 and depth at least t at significance level s − min{H(s), H(t)} − c_2.
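In symbols, with P_t as above and D_s as just defined, the statement reads:

2^{−s} ≤ P_t(x)/P(x) ≤ 2^{−s+1}  implies  D_{s+c_1}(x) ≤ t  and  D_{s−min{H(s),H(t)}−c_2}(x) ≥ t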

Proof: The first part follows easily from the fact that any k-compressible self-delimiting program p is associated with a unique program, k − O(1) bits shorter, of the form “execute the result of executing p∗”. Therefore there exists a constant c_1 such that if all t-fast programs for x were (s + c_1)-compressible, the associated shorter programs would contribute more than the total algorithmic probability of x. The second part follows because, roughly, if fast programs contribute only a small fraction of the algorithmic probability of x, then the property of being a fast program for x is so unusual that no program having that property can be random. More precisely, the t-fast programs for x constitute a finite prefix set, a superset S of which can be computed by a program of size H(x) + min{H(t), H(s)} + O(1) bits. (Given x∗ and either t∗ or s∗, begin enumerating all self-delimiting programs that compute x, in order of increasing running time, and quit when either the running time exceeds t or the accumulated measure of programs so far enumerated exceeds 2^{−(H(x)−s)}.) Therefore there exists a constant c_2 such that every member of S, and thus every t-fast program for x, is compressible by at least s − min{H(s), H(t)} − O(1) bits.
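A sketch of the contradiction driving the first part, with all machine-dependent constants absorbed into O(1) (a restatement of the argument above, not additional material): the t-fast programs for x contribute at least 2^{−s} P(x), so if every such p were (s + c_1)-compressible, the associated programs of length |p| − (s + c_1) + O(1) would also compute x and would contribute

Σ_{fast p} 2^{−(|p| − s − c_1 + O(1))} = 2^{s + c_1 − O(1)} · Σ_{fast p} 2^{−|p|} ≥ 2^{c_1 − O(1)} · P(x) > P(x)

once c_1 exceeds the O(1) constant, which is impossible since no collection of programs for x can contribute more than P(x).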

The ability of universal machines to simulate one another efficiently implies a corresponding degree of machine-independence for depth: for any two efficiently universal machines of the sort considered here, there exists a constant c and a linear polynomial L such that for any t, strings whose (s+c)-significant depth is at least L(t) on one machine will have s-significant depth at least t on the other.

Depth of one string relative to another may be defined analogously, and represents the plausible time required to produce one string, x, from another, w.

(Relative Depth of Finite Strings): For any two strings w and x, the depth of x relative to w at significance level s, denoted D_s(x/w), will be defined as min{T(p, w) : (|p| − |(p/w)∗| < s) ∧ (U(p, w) = x)}, the least time required to compute x from w by a program that is s-incompressible relative to w.

Depth of a string relative to its length is a particularly useful notion, allowing us, as it were, to consider the triviality or nontriviality of the “content” of a string (i.e. its bit sequence), independent of its “form” (length). For example, although the infinite sequence 000… is intuitively trivial, its initial segment 0^n is deep whenever n is deep. However, 0^n is always shallow relative to n, as is, with high probability, a random string of length n.

In order to adequately represent the intuitive notion of stored mathematical work, it is necessary that depth obey a “slow growth” law, i.e. that fast deterministic processes be unable to transform a shallow object into a deep one, and that fast probabilistic processes be able to do so only with low probability.

(Slow Growth Law): Given any data string x and two significance parameters s_2 > s_1, a random program generated by coin tossing has probability less than 2^{−(s_2 − s_1) + O(1)} of transforming x into an excessively deep output, i.e. one whose s_2-significant depth exceeds the s_1-significant depth of x plus the run time of the transforming program plus O(1). More precisely, there exist positive constants c_1, c_2 such that for all strings x, and all pairs of significance parameters s_2 > s_1, the prefix set {q : D_{s_2}(U(q, x)) > D_{s_1}(x) + T(q, x) + c_1} has measure less than 2^{−(s_2 − s_1) + c_2}.

Proof: Let p be an s_1-incompressible program which computes x in time D_{s_1}(x), and let r be the restart prefix mentioned in the definition of the U machine. Let Q be the prefix set {q : D_{s_2}(U(q, x)) > T(q, x) + D_{s_1}(x) + c_1}, where the constant c_1 is sufficient to cover the time overhead of concatenation. For all q ∈ Q, the program rpq by definition computes some deep result U(q, x) in less time than that result’s own s_2-significant depth, and so rpq must be compressible by s_2 bits. The sum of the algorithmic probabilities of strings of the form rpq, where q ∈ Q, is therefore

Σ_{q∈Q} P(rpq) ≥ Σ_{q∈Q} 2^{−|rpq| + s_2} = 2^{−|r| − |p| + s_2} · μ(Q)

On the other hand, since the self-delimiting program p can be recovered from any string of the form rpq (by deleting r and executing the remainder pq until halting occurs, by which time exactly p will have been read), the algorithmic probability of p is at least as great (within a constant factor) as the sum of the algorithmic probabilities of the strings {rpq : q ∈ Q} considered above:

P(p) > μ(Q) · 2^{−|r| − |p| + s_2 − O(1)}

Recalling the fact that minimal program size is equal, within an additive constant, to the −log of algorithmic probability, and the s_1-incompressibility of p, we have P(p) < 2^{−(|p| − s_1) + O(1)}, and therefore finally

μ(Q) < 2^{−(s_2 − s_1) + O(1)}, which was to be demonstrated.

Weil Conjectures. Note Quote.


Solving Diophantine equations, that is giving integer solutions to polynomials, is often unapproachably difficult. Weil describes one strategy in a letter to his sister, the philosopher Simone Weil: Look for solutions in richer fields than the rationals, perhaps fields of rational functions over the complex numbers. But these are quite different from the integers:

We would be badly blocked if there were no bridge between the two. And voilà god carries the day against the devil: this bridge exists; it is the theory of algebraic function fields over a finite field of constants.

A solution modulo 5 to a polynomial P(X, Y, …, Z) is a list of integers X, Y, …, Z making the value P(X, Y, …, Z) divisible by 5, or in other words equal to 0 modulo 5. For example, X^2 + Y^2 − 3 has no integer solutions. That is clear since X and Y would both have to be 0 or ±1, to keep their squares below 3, and no combination of those works. But it has solutions modulo 5 since, among others, 3^2 + 3^2 − 3 = 15 is divisible by 5. Solutions modulo a given prime p are easier to find than integer solutions and they amount to the same thing as solutions in the finite field of integers modulo p.
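This can be checked mechanically: modulo p there are only p residues to try per variable, so p^2 trials settle the example. A minimal sketch in Python (nothing here comes from the text beyond the example polynomial X^2 + Y^2 − 3 and the modulus 5):

```python
# Brute-force search for solutions of X^2 + Y^2 - 3 = 0 modulo p:
# only p residues need checking for each variable.

def solutions_mod_p(p):
    """All pairs (x, y) with x^2 + y^2 - 3 divisible by p."""
    return [(x, y) for x in range(p) for y in range(p)
            if (x * x + y * y - 3) % p == 0]

sols = solutions_mod_p(5)
print(sols)       # includes (3, 3), since 3^2 + 3^2 - 3 = 15 = 3 * 5
print(len(sols))  # the count of solutions over F_5
```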

To see if a list of polynomial equations P_i(X, Y, …, Z) = 0 has a solution modulo p we need only check p different values for each variable, as the sketch above illustrates. Even if p is impractically large, equations are more manageable modulo p. Going farther, we might look at equations modulo p, but allow some irrationals, and ask how the number of solutions grows as we allow irrationals of higher and higher degree: roots of quadratic polynomials, roots of cubic polynomials, and so on. This is looking for solutions in all finite fields, as in Weil’s letter.

The key technical points about finite fields are: For each prime number p, the integers modulo p form a field, written F_p. For each natural number r > 0 there is (up to isomorphism) just one field with p^r elements, written F_{p^r} or F_q with q = p^r. This comes from F_p by adjoining the roots of an irreducible degree-r polynomial. These are all the finite fields. Trivially, then, for any natural number s > 0 there is just one field with q^s elements, namely F_{p^{rs}}, which we may write F_{q^s}. The union over all r of the F_{p^r} is the algebraic closure F̄_p. By Galois theory, the elements of F̄_p lying in F_{p^r} are exactly the fixed points of the r-th iterate of the Frobenius morphism, that is, of the map taking each x ∈ F̄_p to x^{p^r}.
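As a small sanity check on the Frobenius description, one can model F_9 = F_3[x]/(x^2 + 1) directly (x^2 + 1 is irreducible over F_3) and confirm that the fixed points of a → a^3 are exactly the embedded copy of F_3; a sketch under those assumptions:

```python
# Elements of F_9 = F_3[x]/(x^2 + 1) are pairs (a, b) standing for a + b*x,
# with coefficients mod 3 and the reduction rule x^2 = -1.

P = 3

def mul(u, v):
    a, b = u
    c, d = v
    # (a + b*x)(c + d*x) = (ac - bd) + (ad + bc)x, using x^2 = -1
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def frobenius(u):
    """The Frobenius map u -> u^p with p = 3, by repeated multiplication."""
    result = u
    for _ in range(P - 1):
        result = mul(result, u)
    return result

field = [(a, b) for a in range(P) for b in range(P)]   # all 9 elements
print([u for u in field if frobenius(u) == u])
# [(0, 0), (1, 0), (2, 0)] -- exactly F_3 sitting inside F_9
```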

Take any good n-dimensional algebraic space (any smooth projective variety of dimension n) defined by integer polynomials over a finite field F_q. For each s ∈ N, let N_s be the number of points defined over the extension field F_{q^s}. Define the zeta function Z(t) as an exponential using a formal variable t:

Z(t) = exp(Σ_{s=1}^∞ N_s t^s / s)

The first Weil conjecture says Z(t) is a rational function:

Z(t) = P(t)/Q(t)

for integer polynomials P(t) and Q(t). This is a strong constraint on the numbers of solutions N_s. It means there are complex algebraic numbers a_1, …, a_i and b_1, …, b_j such that

N_s = (a_1^s + … + a_i^s) − (b_1^s + … + b_j^s)

And each algebraic conjugate of an a (resp. b) is also an a (resp. b).
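A worked check for the projective line P^1 over F_q (a standard example, not computed in the text): there N_s = q^s + 1, so

log Z(t) = Σ_{s=1}^∞ (q^s + 1) t^s / s = −log(1 − qt) − log(1 − t), i.e. Z(t) = 1 / ((1 − t)(1 − qt))

Here a_1 = 1 and a_2 = q, there are no b’s, and indeed N_s = 1^s + q^s. The absolute values |a_1| = q^0 and |a_2| = q^1 are exactly the ones the Riemann hypothesis below prescribes.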

The second conjecture is a functional equation:

Z(1/(q^n t)) = ± q^{nE/2} t^E Z(t)

This says the operation x → q^n/x permutes the a’s (resp. the b’s). The third is a Riemann Hypothesis:

Z(t) = (P_1(t) P_3(t) ··· P_{2n−1}(t)) / (P_0(t) P_2(t) ··· P_{2n}(t))

where each P_k is an integer polynomial with all roots of absolute value q^{−k/2}. That means each a has absolute value q^k for some 0 ≤ k ≤ n. Each b has absolute value q^{(2k−1)/2} for some 0 < k ≤ n.

Over it all is the conjectured link to topology. Let B_0, B_1, …, B_{2n} be the Betti numbers of the complex manifold defined by the same polynomials. That is, each B_k gives the number of k-dimensional holes or handles on the continuous space of complex-number solutions to the equations. And recall an algebraically n-dimensional complex manifold is topologically 2n-dimensional. Then each P_k has degree B_k. And E is the Euler number of the manifold, the alternating sum

Σ_{k=0}^{2n} (−1)^k B_k
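Continuing the P^1 illustration from above (again an example, not from the text): the complex points form the Riemann sphere, with Betti numbers B_0 = B_2 = 1 and B_1 = 0, so E = 2; correspondingly P_0(t) = 1 − t and P_2(t) = 1 − qt have degrees B_0 and B_2, while P_1(t) = 1 has degree 0. The functional equation then checks out with n = 1 and the plus sign:

Z(1/(qt)) = 1 / ((1 − 1/(qt))(1 − 1/t)) = q t^2 / ((qt − 1)(t − 1)) = q^{nE/2} t^E Z(t)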

On its face the topology of a continuous manifold is worlds apart from arithmetic over finite fields. But the topology of this manifold tells how many a’s and b’s there are with each absolute value. This implies useful numerical approximations to the numbers of solutions N_s. Special cases of these conjectures, with aspects of the topology, were proved before Weil, and he proved more. All dealt with curves (1-dimensional) or hypersurfaces (defined by a single polynomial).

Weil presented the topology as motivating the conjectures for higher dimensional varieties. He especially pointed out how the whole series of conjectures would follow quickly if we could treat the spaces of finite field solutions as topological manifolds. The topological strategy was powerfully seductive but seriously remote from existing tools. Weil’s arithmetic spaces were not even precisely defined. To all appearances they would be finite or (over the algebraic closures of the finite fields) countable and so everywhere discontinuous. Topological manifold methods could hardly apply.