The Complexity of Solomonoff Induction - Nonparametric General Reinforcement Learning

(a) Let $G := \{(x, y, q) \mid g(y) < q\}$ and let $F := \{(x, y, q) \mid q < f(x)\}$, both of which are in $\Delta^0_n$ by assumption. Hence there are $\Sigma^0_n$-formulas $\phi_G$ and $\phi_F$ such that

$$(x, y, q) \in G \iff \phi_G(x, y, q),$$

$$(x, y, q) \in F \iff \phi_F(x, y, q).$$

Now $f(x) > g(y)$ if and only if $\exists q.\ (x, y, q) \in G \cap F$, which is equivalent to the $\Sigma^0_n$-formula

$$\exists q.\ \phi_G(x, y, q) \wedge \phi_F(x, y, q).$$

(b) Follows from (a).

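(d) Division is discontinuous only at $g(x) = 0$. We show this explicitly. By assumption, for any $\varepsilon > 0$ there is a $k_0$ such that for all $k > k_0$

$$|\varphi(x, k) - f(x)| < \varepsilon \quad\text{and}\quad |\psi(x, k) - g(x)| < \varepsilon.$$

We assume without loss of generality that $\varepsilon < |g(x)|$, since $g(x) \neq 0$ by assumption. Then

$$\begin{aligned}
\left| \frac{\varphi(x, k)}{\psi(x, k)} - \frac{f(x)}{g(x)} \right|
&= \left| \frac{\varphi(x, k) g(x) - f(x) \psi(x, k)}{\psi(x, k) g(x)} \right| \\
&\leq \frac{|\varphi(x, k) g(x) - f(x) g(x)| + |f(x) g(x) - f(x) \psi(x, k)|}{|\psi(x, k) g(x)|} \\
&< \frac{\varepsilon |g(x)| + |f(x)| \varepsilon}{|\psi(x, k) g(x)|} \\
&< \varepsilon \cdot \frac{|g(x)| + |f(x)|}{(|g(x)| - \varepsilon) |g(x)|}
\xrightarrow{\varepsilon \to 0} 0,
\end{aligned}$$

where the last inequality uses $|\psi(x, k) g(x)| = |\psi(x, k)| \cdot |g(x)| > (|g(x)| - \varepsilon) |g(x)|$. Therefore $f(x)/g(x) = \lim_{k \to \infty} \varphi(x, k)/\psi(x, k)$.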
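As a numerical sanity check of the bound in (d), the following minimal sketch uses hypothetical approximations φ(k) = f − 2^(−k) and ψ(k) = g − 2^(−k) for the fixed illustrative values f = 1/3 and g = 1/4. These values and all function names are assumptions made for illustration, not part of the lemma. The sketch compares the quotient error with ε(|g| + |f|)/((|g| − ε)|g|) for ε = 2^(−k).

```python
from fractions import Fraction

F = Fraction(1, 3)  # illustrative limit value f(x) (assumption)
G = Fraction(1, 4)  # illustrative nonzero limit value g(x) (assumption)


def phi(k):
    """Hypothetical approximation of f with |phi(k) - f| = 2^-k."""
    return F - Fraction(1, 2 ** k)


def psi(k):
    """Hypothetical approximation of g with |psi(k) - g| = 2^-k."""
    return G - Fraction(1, 2 ** k)


def quotient_error(k):
    """Exact value of |phi(k)/psi(k) - f/g|."""
    return abs(phi(k) / psi(k) - F / G)


def error_bound(k):
    """The bound eps * (|g| + |f|) / ((|g| - eps) * |g|) with eps = 2^-k."""
    eps = Fraction(1, 2 ** k)
    assert eps < abs(G), "the bound requires eps < |g|"
    return eps * (abs(G) + abs(F)) / ((abs(G) - eps) * abs(G))


# k starts at 3 so that eps = 2^-k < |g| = 1/4 and psi(k) is nonzero.
for k in range(3, 12):
    print(k, float(quotient_error(k)), float(error_bound(k)))
```

Exact rational arithmetic is used so that the comparison is not affected by floating-point rounding; both columns shrink towards zero as k grows.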

(e) Follows from the fact that the logarithm is computable.

6.2 The Complexity of Solomonoff Induction

In this section, we derive the computability results for Solomonoff’s prior as stated in Table 6.1.

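Since $M$ is lower semicomputable, $M_{\mathrm{norm}}$ is limit computable by Lemma 6.2 (c) and (d). When using the Solomonoff prior $M$ (or one of its sisters $M_{\mathrm{norm}}$, $\overline{M}$, or $\overline{M}_{\mathrm{norm}}$), we need to compute the conditional probability $M(xy \mid x) = M(xy)/M(x)$ for finite strings $x, y \in \mathcal{X}^*$. Because $M(x) > 0$ for all finite strings $x \in \mathcal{X}^*$, this quotient is well-defined.

$$M(xy \mid x) > q \iff \forall m\, \exists k.\ \frac{\varphi(xy, k)}{\varphi(x, m)} > q \iff \exists k\, \exists m_0\, \forall m \geq m_0.\ \frac{\varphi(xy, k)}{\varphi(x, m)} > q$$

Figure 6.1: A $\Pi^0_2$-formula and an equivalent $\Sigma^0_2$-formula defining conditional $M$. Here $\varphi(x, k)$ denotes a computable function that lower semicomputes $M(x)$.

Theorem 6.3 (Complexity of $M$, $M_{\mathrm{norm}}$, $\overline{M}$, and $\overline{M}_{\mathrm{norm}}$).

(a) $M(x)$ is lower semicomputable

(b) $M(xy \mid x)$ is limit computable

(c) $M_{\mathrm{norm}}(x)$ is limit computable

(d) $M_{\mathrm{norm}}(xy \mid x)$ is limit computable

(e) $\overline{M}(x)$ is $\Pi^0_2$-computable

(f) $\overline{M}(xy \mid x)$ is $\Delta^0_3$-computable

(g) $\overline{M}_{\mathrm{norm}}(x)$ is $\Delta^0_3$-computable

(h) $\overline{M}_{\mathrm{norm}}(xy \mid x)$ is $\Delta^0_3$-computable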

Proof. (a) By Li and Vitányi (2008, Thm. 4.5.2). Intuitively, we can run all programs in parallel and get monotonically increasing lower bounds for $M(x)$ by adding $2^{-|p|}$ every time a program $p$ has completed outputting $x$.
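The intuition in (a) can be illustrated on a toy machine. The sketch below is not a universal machine: it assumes a hypothetical prefix-free program format 1^j 0 w with |w| = j, where the program prints w cyclically, one symbol per step. The function `phi(x, k)` sums 2^(−|p|) over all programs of length at most 2k + 1 whose output after k steps starts with x, which gives lower bounds that are nondecreasing in k.

```python
from fractions import Fraction
from itertools import product


def toy_output(w, steps):
    """Output of the hypothetical toy program 1^j 0 w after `steps` steps.

    The program prints w cyclically, one symbol per step; the empty w
    prints nothing. This machine is an illustrative assumption only.
    """
    if not w:
        return ""
    return "".join(w[i % len(w)] for i in range(steps))


def phi(x, k):
    """Lower bound after simulating all programs with j <= k for k steps."""
    total = Fraction(0)
    for j in range(k + 1):
        weight = Fraction(1, 2 ** (2 * j + 1))  # 2^-|p| with |p| = 2j + 1
        for bits in product("01", repeat=j):
            if toy_output("".join(bits), k).startswith(x):
                total += weight
    return total


for k in range(6):
    print(k, phi("0", k))
```

Because every program's output only grows with the number of steps, a program that has printed x keeps contributing its weight for all larger k, which is why the bounds never decrease.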

(b) From (a) and Lemma 6.2 (d) since $M(x) > 0$ (see also Figure 6.1).

(c) By Lemma 6.2 (c) and (d), and $M(x) > 0$.

(d) By (c) and Lemma 6.2 (d) since $M_{\mathrm{norm}}(x) \geq M(x) > 0$.

(e) Let $\varphi$ be a computable function that lower semicomputes $M$. Since $M$ is a semimeasure, $M(xy) \geq \sum_z M(xyz)$, hence $\sum_{y \in \mathcal{X}^n} M(xy)$ is nonincreasing in $n$ and thus $\overline{M}(x) > q$ iff $\forall n\, \exists k.\ \sum_{y \in \mathcal{X}^n} \varphi(xy, k) > q$.

(f) From (e) and Lemma 6.2 (d) since $\overline{M}(x) > 0$.

(g) From (e) and Lemma 6.2 (d).


We proceed to show that these bounds are in fact the best possible ones. If $M$ were $\Delta^0_1$-computable, then so would be the conditional semimeasure $M(\cdot \mid \cdot)$. Thus the $M$-adversarial sequence $z_{1:\infty}$ defined in Example 3.42 would be computable and hence would correspond to a computable deterministic measure $\mu$. However, we have $M(z_{1:t}) \leq 2^{-t}$ by construction, so dominance $M(x) \geq w(\mu)\mu(x)$ with $w(\mu) > 0$ yields a contradiction as $t \to \infty$:

$$2^{-t} \geq M(z_{1:t}) \geq w(\mu)\mu(z_{1:t}) = w(\mu) > 0.$$

By the same argument, the normalized Solomonoff prior $M_{\mathrm{norm}}$ cannot be $\Delta^0_1$-computable. However, since it is a measure, $\Sigma^0_1$- or $\Pi^0_1$-computability would entail $\Delta^0_1$-computability.
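The adversarial construction can be sketched against any computable predictor. The code below is only an illustration: it assumes that the adversarial sequence picks, at each step, a symbol to which the predictor assigns conditional probability at most 1/2 (the precise definition is the one in Example 3.42, which is not reproduced here), and it uses the Laplace rule as a computable stand-in for the predictor. The helper names are hypothetical.

```python
from fractions import Fraction


def laplace(symbol, history):
    """Laplace rule over {0, 1}; a computable stand-in predictor (assumption)."""
    return Fraction(history.count(symbol) + 1, len(history) + 2)


def adversarial_prefix(predict, t):
    """Build z_{1:t} by always choosing a symbol with conditional probability <= 1/2.

    Returns the prefix and the probability the predictor assigns to it.
    """
    z = ""
    prob = Fraction(1)
    for _ in range(t):
        # If "0" is favoured, "1" has probability below 1/2, and vice versa.
        nxt = "1" if predict("0", z) > Fraction(1, 2) else "0"
        prob *= predict(nxt, z)
        z += nxt
    return z, prob


z, prob = adversarial_prefix(laplace, 10)
print(z, prob, prob <= Fraction(1, 2 ** 10))
```

Each factor in the product is at most 1/2, so the predictor's probability of the prefix is at most 2^(−t), while the sequence itself is computable whenever the predictor is. This is the tension the argument above exploits.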

For $\overline{M}$ and $\overline{M}_{\mathrm{norm}}$ we prove the following two lower bounds for specific universal Turing machines.

Theorem 6.4 ($\overline{M}$ is not Limit Computable). There is a universal Turing machine $U_0$ such that the set $\{(x, q) \mid \overline{M}_{U_0}(x) > q\}$ is not in $\Delta^0_2$.

Proof. Assume the contrary and let $A \in \Pi^0_2 \setminus \Delta^0_2$ and $\eta$ be a quantifier-free first-order formula such that

$$n \in A \iff \forall k\, \exists i.\ \eta(n, k, i). \tag{6.2}$$

For each $n \in \mathbb{N}$, we define the program $p_n$ as follows.

procedure p_n
    output 1^{n+1}0
    k ← 0
    while true do
        i ← 0
        while not η(n, k, i) do
            i ← i + 1
        k ← k + 1
        output 0
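The program p_n can be sketched as a Python generator. Two things here are assumptions for the sake of a runnable demonstration: the predicate `toy_eta` is a hypothetical decidable stand-in for η, and the `budget` parameter cuts off the inner search, whereas the real p_n simply runs forever when no witness i exists.

```python
from itertools import islice


def p_n(n, eta, budget):
    """Yield the output symbols of p_n, stopping after `budget` failed checks.

    The budget is a demonstration device only; the actual program has none.
    """
    for _ in range(n + 1):
        yield "1"
    yield "0"
    k = 0
    failed_checks = 0
    while True:
        i = 0
        while not eta(n, k, i):
            i += 1
            failed_checks += 1
            if failed_checks >= budget:
                return  # give up the search (not part of the real p_n)
        k += 1
        yield "0"


def toy_eta(n, k, i):
    """Hypothetical predicate: for even n every k has a witness i, for odd n none."""
    return n % 2 == 0 and i >= k


print("".join(islice(p_n(2, toy_eta, 10 ** 6), 8)))  # even n: zeros keep coming
print("".join(p_n(1, toy_eta, 100)))                 # odd n: stuck after the prefix
```

With this toy predicate, even n plays the role of n ∈ A: a witness exists for every k, so the output continues with zeros indefinitely. For odd n the search for i never succeeds and nothing is printed beyond the initial 1^{n+1}0.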

Each program $p_n$ always outputs $1^{n+1}0$. Furthermore, the program $p_n$ outputs the infinite string $1^{n+1}0^\infty$ if and only if $n \in A$ by (6.2). We define $U_0$ as follows using our reference machine $U$.

• $U_0(1^{n+1}0)$: Run $p_n$.

• $U_0(00p)$: Run $U(p)$.

• $U_0(01p)$: Run $U(p)$ and bitwise invert its output.

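By construction, $U_0$ is a universal Turing machine. No $p_n$ outputs a string starting with $0^{n+1}1$, therefore

$$\overline{M}_{U_0}(0^{n+1}1) = \tfrac{1}{4}\left(\overline{M}_U(0^{n+1}1) + \overline{M}_U(1^{n+1}0)\right).$$

Hence

$$\overline{M}_{U_0}(1^{n+1}0) = 2^{-n-2}\,\mathbf{1}_A(n) + \tfrac{1}{4}\overline{M}_U(1^{n+1}0) + \tfrac{1}{4}\overline{M}_U(0^{n+1}1) = 2^{-n-2}\,\mathbf{1}_A(n) + \overline{M}_{U_0}(0^{n+1}1).$$

If $n \notin A$, then $\overline{M}_{U_0}(1^{n+1}0) = \overline{M}_{U_0}(0^{n+1}1)$. Otherwise, we have $|\overline{M}_{U_0}(1^{n+1}0) - \overline{M}_{U_0}(0^{n+1}1)| = 2^{-n-2}$.

Now we assume that $\overline{M}_{U_0}$ is limit computable, i.e., there is a computable function $\varphi : \mathcal{X}^* \times \mathbb{N} \to \mathbb{Q}$ such that $\lim_{k \to \infty} \varphi(x, k) = \overline{M}_{U_0}(x)$. We get that

$$n \in A \iff \lim_{k \to \infty} \left(\varphi(1^{n+1}0, k) - \varphi(0^{n+1}1, k)\right) \geq 2^{-n-2},$$

thus $A$ is limit computable, a contradiction.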

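Corollary 6.5 ($\overline{M}_{\mathrm{norm}}$ is not $\Sigma^0_2$- or $\Pi^0_2$-computable). There is a universal Turing machine $U_0$ such that $\{(x, q) \mid \overline{M}_{\mathrm{norm}\,U_0}(x) > q\}$ is not in $\Sigma^0_2$ or $\Pi^0_2$.

Proof. Since $\overline{M}_{\mathrm{norm}} = c \cdot \overline{M}$, there exists a $k \in \mathbb{N}$ such that $2^{-k} < c$ (even if we do not know the value of $k$). We can show that the set $\{(x, q) \mid \overline{M}_{\mathrm{norm}\,U_0}(x) > q\}$ is not in $\Delta^0_2$ analogously to the proof of Theorem 6.4, using

$$n \in A \iff \lim_{k' \to \infty} \left(\varphi(1^{n+1}0, k') - \varphi(0^{n+1}1, k')\right) \geq 2^{-k-n-2},$$

where $\varphi$ now limit computes $\overline{M}_{\mathrm{norm}\,U_0}$. If $\overline{M}_{\mathrm{norm}}$ were $\Sigma^0_2$-computable or $\Pi^0_2$-computable, this would imply that $\overline{M}_{\mathrm{norm}}$ is $\Delta^0_2$-computable since $\overline{M}_{\mathrm{norm}}$ is a measure, a contradiction.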

Since $M(\epsilon) = 1$, we have $M(x \mid \epsilon) = M(x)$, so the conditional probability $M(xy \mid x)$ has at least the same complexity as $M$. Analogously for $M_{\mathrm{norm}}$ and $\overline{M}_{\mathrm{norm}}$ since they are measures. For $\overline{M}$, we have that $\overline{M}(x \mid \epsilon) = \overline{M}_{\mathrm{norm}}(x)$, so Corollary 6.5 applies. All that remains to prove is that conditional $M$ is not lower semicomputable.

Theorem 6.6 (Conditional $M$ is not Lower Semicomputable). The set $\{(x, xy, q) \mid M(xy \mid x) > q\}$ is not recursively enumerable.

We gave a different, more complicated proof in Leike and Hutter (2015b). The following, much simpler and more elegant proof is due to Sterkenburg (2016, Prop. 3).

Proof. Assume to the contrary that $M(xy \mid x)$ is lower semicomputable. Let $a \neq b \in \mathcal{X}$. We construct an infinite string $x$ by defining its initial segments $\epsilon =: x(0) \sqsubset x(1) \sqsubset x(2) \sqsubset \ldots \sqsubset x$. At every step $n$, we enumerate strings $y \in \mathcal{X}^*$ until one is found satisfying $M(a \mid x(n)y) \geq 1/2$; then set $x(n+1) := x(n)yb$. This implies that for infinitely many $t$ there is an $n$ such that $M(b \mid x_{<t}) = M(b \mid x(n)y) \leq 1 - M(a \mid x(n)y) \leq 1/2$. Since we assumed $M(\cdot \mid \cdot)$ to be lower semicomputable, the infinite string $x$ is computable, and hence $M(x_t \mid x_{<t}) \to 1$ by Corollary 3.55. But this contradicts $M(b \mid x_{<t}) \leq 1/2$ infinitely often.
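The construction of x in this proof can be sketched with a computable stand-in for the conditional predictor. The code below is an illustration under stated assumptions: it uses the Laplace rule over the alphabet {a, b} in place of M(· | ·), enumerates candidate strings y in length-lexicographic order (one possible enumeration, not prescribed by the proof), and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import count, product


def laplace(symbol, history):
    """Laplace rule over {a, b}; a computable stand-in for M(. | .) (assumption)."""
    return Fraction(history.count(symbol) + 1, len(history) + 2)


def all_strings(alphabet):
    """Enumerate all finite strings over `alphabet`, shortest first."""
    for length in count(0):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def build_prefix(predict, steps, a="a", b="b"):
    """Return x(steps), where x(n+1) := x(n) y b for the first y found
    with predict(a | x(n) y) >= 1/2."""
    x = ""
    for _ in range(steps):
        for y in all_strings(a + b):
            if predict(a, x + y) >= Fraction(1, 2):
                x = x + y + b
                break
    return x


x = build_prefix(laplace, 3)
print(x)
# Every appended b lands where the predictor gave b probability at most 1/2.
print([laplace("b", x[:t]) for t in range(len(x)) if x[t] == "b"])
```

The resulting string is computable, yet at each position where b was appended the predictor assigned b probability at most 1/2. For the true conditional M, this is what contradicts convergence on computable sequences.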
