Universal Persistent Turing Machine - Universal Machine Viruses

2.4 Universal Machine Viruses

2.4.3 Universal Persistent Turing Machine

In [33], Hao et al. propose to model computer viruses using the concept of a Universal Persistent Turing Machine (UPTM). We introduced Persistent Turing Machines in Section 1.3 and their universal counterpart in Section 1.5.4. The following quote from [33] illustrates the authors’ rationale for choosing UPTMs over (Universal) TMs:

Modern general-purpose computers are implemented in accordance with the idea of Universal TMs. However, as the theoretical basis for computers, Universal TMs are unable to describe a continuous computational process of a computer during which unpredictable interaction events sequentially happen.

Their definition:

Definition 2.13(UPTM viruses). LetUbe a UPTM with an encoding functionη. (The status of functionη is discussed below). Then a setV⊆Σ∗

Uis viral forUif v∈V ⇐⇒ ∃PTMM, ∃w,w0,i,o∈Σ∗_M such that ∧v=η(M) ∧(v, η(w))−η−−−(i)/→o U v, η(w 0 ) ∧w,w0∈reach(M) ∧o∈V

The above condition should be read as: “a setVis viral if each programvinV, if it runs on the universal machineU, eventually produces another programo∈V”. In essence, this captures ‘viruses’ using the same closure property on programs, as we saw in Cohen’s work (Section 2.3.2).

Let us discuss the encoding functionη. According to Hao, Yin and Zhang,ηis a one-one function that maps the transition function and strings of a machineMto the alphabetΣU of the universal machine. The

encoding functionηis defined by: η=ηmachine∪ηtape

ηmachine :M→Σ∗U

ηtape:Σ∗M→Σ

∗

The above specification ofηis particularly ambiguous. Such a functionηcould only encodeM, or perhaps a very specific (finite) set of PTM machines. In other wordsUis not universal.8 Moreover, ifηis one-one then it is injective and no output ofUwhile simulatingMwill be the encoding of a machine (because the output ofMis inΣ∗_M). In particular, this definition tells us thatVmust be empty, i.e. viruses do not exist.

It is not unreasonable to defineηas an injective function, as does Turing (see Definition 1.54). We might even feel that ‘injective’ is part of the definition of the word ‘encoding’. Suppose, however, that we have an encoding that is not injective, in which case it would in principle be possible thatη−1(gp(x)) is (the transition

function of) a PTM. Now if a simulated machineMwants to output a program, it has to output a stringt inΣ∗

Mexactly so thatη

−1_(η(_t_{)) is not uniquely defined. That means that the simulated machine needs to be} aware, in some sense, of the encoding that is employed; when a different (non-injective) encoding function is employed (and possibly an altered machineU), it probably does not output the same, if any, program.

One could argue, that this models the fact that some data can be a program (or virus) for one computer, whilst not being a program (or virus) for another. This, however, is a fallacy - the machineM can be simulated byanymachine (with appropriate encoding) that is universal. If we interpret the output of the simulation ofMin a way that is particular to a specific universal machine implementationU, the interpre- tation ofMchanges with it. This is orthogonal to the concept of universal machines, where the fact that we simulateMmeans that there is a one-one correlation between the ‘behaviour’ ofMwhen simulated and is its behaviour when running machineMitself.

We conclude that, whether the encoding function η is injective or not, we should not interpret the (encoded) output of simulated machines as programs. The proposed definition of viruses in [33] is no more plausible than that of Definition 2.8.

Recursion Machines

In this chapter we will propose a new model of computers: machines which we dub ‘Recursion Machines’. This will be a model based on the theory of recursive functions. We were inspired by the work of Bonfante, Kaczmarek and Marion in [30, 32]. These authors define viruses using a kind of recursion theory of enumerably infinite sets. What makes the work of Bonfante et al. appealing is that the domain of discourse is not the natural numbers as usual, but a rather more liberal domainD. In standard recursion theory we think of data and programs as natural numbers: both are ‘encoded’ as numbers. If the domain would be more liberal, for example the untyped lambda calculus or a full set of stringsΣ∗_{, the connection between a} program and its encoding becomes closer; they may even be identical.

Bonfante et al. provide this connection by defining ‘acceptable programming systems’ [30, 32, 39]. The properties that characterise an ‘acceptable programming system’ seem to mirror the well-known sufficient conditions of an acceptable system of indices: enumeration and parametrization [3, 6]. As proved by Rogers [15] and subsequent authors, most results of recursion theory will hold for any acceptable system of indices; notably the second recursion theorem, fixed point theorem, etc.

We feel that parts of Bonfante cum suis’ definition of ‘acceptable programming systems’ are unclear and that the connection between ‘acceptable programming systems’ and ‘acceptable system of indices’ is not convincing. We propose a definition that is more directly linked to the definition of ‘acceptable system of indices’.

On the one hand, this is a straightforward way of lifting results in the theory of computable functions on natural numbers to a theory of computable functions on arbitrary enumerably infinite sets. On the other hand, we can view the connection between the two as a ‘minimal condition’. We assume that for any reasonable programming system we have a background notion of computability or recursivity. If this background notion is not explicit, the link with acceptable systems of indices provides the minimal conditions that enable us to act as if it were explicit.

The theory of acceptable programming systems that we present can be extended to define ‘Recursion Machines’, we will do so by adding a notion of ‘sequential execution of programs’ in Section 3.2. The connection with recursive function theory also allows us to express Adleman’s theory of viral functions in the setting of Recursion Machines [17]; Adleman’s theory is the most convincing theory of viruses we have encountered. In Section 3.3.3 we show that the notion of ‘Recursion Machine’ is a fruitful addition to the theory of viral functions. The chapter is concluded by discussing the possibilities of further extensions of Recursion Machines.

3.1 Acceptable Programming Systems

We will start by providing the definition of acceptable systems of indices. On top of these we define ‘acceptable programming systems’. Throughout this chapter we will refer to the standard system of indices, using G¨odel numbering of Turing Machines, as the family of mapsφ.

Definition 3.1(Acceptable system of indices [6, II.5.1/2]). We call a system of indices any familyρof mapsρnfromωto the set ofn-ary partial recursive functions (byρnewe will indicate the partial recursive

function corresponding to the indexe). It is acceptable if, for everyn, there are total recursive functions f andgsuch that:

ρn e'φ n f(e) and φ e n'ρ n g(e)

A standard result in recursion theory is the fact that acceptable systems of indices are characterised by two properties: enumeration and parametrization [3, Exercise 2.10].

Proposition 3.2. A system of indicesψis acceptable if and only if it satisfies the following properties: • Enumeration [6, II.5.1]: for every n there is a number a such that

ψn+1

a (e,x1, . . . ,xn)'ψne(x1, . . . ,xn)

• Parametrization [6, II.5.1]: for every m,n there is a total recursive function sm

n such that ψn sm n(e,x1,...,xm)(y1, . . . ,yn)'ψ m+n e (x1, . . . ,xm,y1, . . . ,yn)

P_. A proof can be found in [6, II.5.3].

We define ‘acceptable programming systems’ with respect to a domainDon the basis of ‘acceptable systems of indices’. The setDwill have to have some structure that resembles the natural numbers. This is expressed by a bijection betweenωandDthat is in some sense ‘effectively computable’.

Definition 3.3(Acceptable programming system (1)). Aprogramming systemis a familyψof mapsψn fromDto the set ofn-ary functions onD. We say thatψisacceptableif there is an effectively computable bijectionh:ω↔ Dsuch that

if for all natural numbersn,e: ρn_e'λx1. . .xn.h−1(ψnh(e)(h(x1), . . . ,h(xn)))

thenρis an acceptable system of indices

In essence, the bijectionhgives us a one-one correspondence between all partial recursive functions (overω) and functions overD(asψindexed byD). This correspondence justifies that we lift the concept of recursivity toD.

Definition 3.4(Partial recursive function). Letψbe an acceptable programming system. Then we call a partial (or total) function f :Dn_{→ D} _{‘partial (or total) recursive’ if there is an}_d_{∈ D}_{such that} _f _'_ψn

We can also look at ‘acceptable programming systems’ this way: the background notion of effective computability expressed by a programming systemψcorresponds exactly with the standard notion of effective computability ifψis acceptable.

We shall now prove two properties of acceptable programming systems that are essential to lift results from recursion theory to acceptable programming systems. From the following theorem we can prove for example Kleene’s second recursion theorem and the existence of a composition function forD(by copying the corresponding proofs from [3] and substituting each occurrence of the natural numbers byD).

Theorem 3.5. Letψbe an acceptable programming system with respect to domainD(Definition 3.3). Then ψsatisfies the following properties:

• Enumeration property forD. For every n< ωthere is an element a∈ Dsuch that: ψn+1

a (e,x1, . . . ,xn)'ψne(x1, . . . ,xn)

• Parametrization property forD. For every m,n< ωthere is a total recursive function sm

n such that: ψn sm n(e,x1,...,xm)(y1, . . . ,yn)'ψ m+n e (x1, . . . ,xm,y1, . . . ,yn)

P. We prove the enumeration property. Definition 3.3 guarantees that we have an acceptable system

of indicesρand bijectionh : ω ↔ D. Letgbe the inverse ofh. Proposition 3.2 ensures thatρsatisfies enumeration and parametrization (forω). Essentially, we only have to go back and forth betweenDandω to prove the theorem:

ψn e(x1, . . . ,xn)'h ρn g(e)(g(x1), . . . ,g(xn)) 'hρn_a+1(g(e),g(x1), . . . ,g(xn)) 'ghψn_h+₍_a1₎(h(g(e)),h(g(x1)), . . . ,h(g(xn))) 'ψn_h+₍_a1₎(e,x1, . . . ,xn)

Sinceh(a)∈ D, the enumeration property is satisfied by the above equation.

The proof of the parametrization property is analogous. Notice that each functionsm

n that we find forD

is ‘total recursive’ by virtue of our correspondence of enumerations.

We conclude this section with another positive result.

Theorem 3.6(Existence of acceptable programming systems). LetDbe any enumerably infinite set. Then there exists an acceptable programming systemψfor this domain.

P. Since D is enumerable, there is an effectively computable bijection h : ω ↔ D. Letgbe the

inverse ofh. Letφbe the standard system of indices. For any positive natural numberndefine mapping ψn e'λx1. . .xn.h φn g(e)(g(x1), . . . ,g(xn))

. Thenhis the required bijection of Definition 3.3.

In document MoL 2008 05: Modeling Computer Viruses (Page 54-59)