Reformalism: Linked Lists and Information Processing

It was not clear to Newell and Simon from the start what would be involved in actually programming the heuristic theorem prover that they imagined. They operated in the absence of pre-made languages, compilers, libraries of functions, and associated practices that have streamlined implementation to a large extent today. Starting in 1955, Newell’s notes include a constant return to very basic questions about just how they should proceed and what was needed for the project. Newell and Simon’s early work in 1955 and 1956 indicates that two issues quickly became a primary focus in the development of Logic Theory Machine: the need for alanguage in which problems could be formulated, solved, and the results expressed, and the need formemories (i.e., ways of storing data) that would make all of the necessary information available to the program throughout its work.

For example, in January of 1955, Newell wrote the following preliminary notes that introduce these issues:

The purpose of a problem solving program is to 1. Find out and formulate what the problem is. 2. Determine what methods might solve the problem. 3. Organize the solution. Several problems here: 1. What is available to diagnose the problem? retrace program - reading symbols. Storage of special info. [...] How to express the result when you get it? [...] Need a set of working memories which provide appropriate content for talking about problem. [...] What must it be able to do? Accept a problem from outside. Here a certain language is necessary. A content language would be preferable here so we could “talk” to the machine.127

Even in this early exploration where it is not yet clear what the basic structure of the program is going to be, certain key preliminary ideas are visible. Newell identifies the

127_{CMU-AN, Series I: RAND, Notes, “Logic Theory Machine I”, Early Notes dated January 10, 1955:}

all my emphasis. The details of their language will be discussed in what follows, but I want to note here that from what I understand “content language” for Newell is something akin in spirit to what is now called a “declarative language” (Prolog is an example). That is to say, he wants a formal language with which he can convey a problem to the computer - a way of describing the problem to it - without necessarily involving an algorithm for actually solving the problem at the same time.

program in a very basic way as a “symbol reader”, for which the processing of symbols is a part of the problem-solving process. In fact, as I will discuss shortly, information

as Newell and Simon mean it (e.g., in “Complex Information System”) was constituted bysymbols.128

As I suggested in the section “The Possibilities of Computing” in the introduction to the dissertation, the stipulation that the Logic Theory Machine be able to read symbolic information was itself something of a novelty in early computing research. The mainframe computers of the 1950s were all originally built to do numerical work - to compile ballistics tables, to solve complex diﬀerential equations, to perform numerical simulations of nuclear chain reactions, and so on. Newell and Simon were among the earliest computer practitioners to imagine the computer as a more abstract agent, capable of manipulating any formal symbolic system whatever, whether its constituent symbols referred to a numerical domain or not.

However, the JOHNNIAC was built with numerical calculations in mind and the machine language with which it was usually programmed in the 1950s was designed with numerical calculations in mind. Programming the JOHNNIAC to be a symbol reader in a language whose commands and infrastructure focused on numerical calculations would have been extremely diﬃcult if it was even possible. Newell, Shaw, and Simon wanted to turn the JOHNNIAC into the Logic Theory Machine, and to do so they would have to develop a new set of tools. To transform JOHNNIAC from a numerical machine to a symbolic machine, they elected to design a new programming language for communicating with the computer, fashioning the computer’s behavior, and for

128_{Newell and Simon indicate a particular debt to some early work on the automation of chess playing}

by Oliver Selfridge and G.P. Dineen. Selfridge visited RAND in 1954 to demonstrate their work on chess playing and Newell has described their meeting as something of a revelatory experience (see Pamela McCorduck, interview with Allen Newell (1976), CMU-PM. In fact, Newell’s first attempt to specify a symbolic program was in fact a chess player and not a theorem-prover

formulating problems, results, and practices in nonnumeric domains like logic.129

Newell, Shaw, and Simon developed a new programming language - the Information Processing Language (IPL) - for propositional logic within which the Logic Theory Ma- chine would ultimately be realized. The first instantiation (originally called the “Logic Language”, and later renamed IPL I) was worked out by Newell and Simon without concern for the technical specifications of the JOHNNIAC computer. IPL I was an example of areformalism - in it, the axioms and permitted inferences of Principia were translated into what they called “Information Structures” and “Information Processes.” IPL II was the first language designed by Newell and Shaw to actually program the JOHNNIAC computer to become the Logic Theory Machine. It was a transformation of the IPL I into commands and features that would allow the actual programming of the JOHNNIAC.130

The Logic Language and IPL II are the primary focus of this section, and they are the site in which Newell, Simon, and Shaw’s reformalism of propositional logic, the creation of new ways of reading and writing logical expressions, and the changing materiality of mathematical objects can be most clearly seen. These languages were a condition of possibility and a constitutive technology for the Logic Theory Machine. With them, JOHNNIAC could be transformed into the Logic Theory Machine.

The first translation of Principia into an information processing system was exe- cuted by Newell and Simon, and presented in a relatively complete form for the first time in September of 1956.131 _{Their first order of business was to articulate and for-}

129_{A succinct reflection on these early tensions between the numerical intentions of early computing}

and the transition to symbolic tasks can be found in Herbert Gelernter, J. R. Hansen, C. L. Gerberich, “A FORTRAN-Compiled-List Processing Language” inJournal of the ACM, Vol. 7, No. 2 (1960). In the later 1950s, Gelernter lead a project at IBM research to program theorem proving in elementary plane geometry That project included an expansion and modification of the IPL language.

130_{The IPL language is best known for being the immediate predecessor to the LISP language -}

developed largely by John McCarthy and the emerging AI community in California later in the 1950s. LISP borrowed many features of IPL, including list processing which will be a focus in what follows.

Figure 1.7: Tree Diagram ofPrincipia Mathematica Proposition 1.7: ⇠p!(qvee p).

Newell, Simon, “The Logic Theory Machine,” p. 9.

malize the representation of expressions as “trees” in which logical operators - including “OR”, “NOT”, and “IMPLIES” - function as parent nodes for the variables that they relate as shown in Figure 1.7. However, in the Logic Language, each element in these expression trees would hold more information than their counterparts in Principia. Whereas the alphabet for Whitehead and Russell’s logical expressions was limited to the symbols for variables, constants, and logical operators, Newell and Simon developed a set of eight additional symbols that would specify an element in the expression. Newell and Simon’s symbols include the following:

• G: with value 1 it indicates that a variable is negated (likep is above) and with value 0 it indicates that a variable is not negated.

• V: indicates that an element in an expression-tree is a variable or not.

• F: indicates that an element is “free” for substitution - meaning, free to be replaced by definitionally or inferentially equivalent expressions.

• C: indicates that an element is a connective (W or!).

• N: is set to the name of every variable. Name here means the symbol that is used to represent that variable. In Principia 1.7, for example, q and p are the names of the two variables included.

Publication P-868 (July 12, 1956) [paper was presented at the Symposium on Information Theory,

Sponsored by the Professional Group on Information Theory of the Institute of Radio Engineers, September 10, 1956].

• P: is a symbol that keeps track of the position of a given element in the tree. The parent node for any expression (i.e., the main operator) has no P value but every other element does. The P values are shown in the diagram above for the expression-tree ofPrincipia 1.7. “P = RR” means an expression us the right child of a sub-tree that is the right child of a parent logical connective, for example. The last two symbols for specifying elements in an expression tree are most important for my purposes here because both include information about where the expression is (rather than what it is):

• A: stores the location of the whole expression-tree in memory. In “The Logic Theory Machine”, where Newell and Simon introduce these symbols, they are not yet talking explicitly about the JOHNNIAC memory.

• U: specifies whether or not an element is a “unit”. By “unit”, Newell and Simon mean that an element is held together in memory as a whole. More on this in a moment.132

In addition to these symbols, Newel and Simon also created three symbols for describing a tree as whole -H,J, andK - that specified the total number of variables in a tree, the number of distinct variables in a tree, and the number of “levels” (i.e., logical operators) in the tree.133

Those who work with traditional logical symbol systems might say that the information specified by these symbols is either implicit or visually obvious in notational systems like Whitehead and Russell’s. And this may be true for trained human practitioners. However, the JOHNNIAC was not equipped with anything like a visual faculty and it, like all computing machinery, required direct and explicit access to all necessary information for solving a problem.

JOHNNIAC could notsee that there are three variables, two of which are negated in

Principia 1.7. I propose that this transformation of information in logical expressions

132_{Newell and Simon’s articulation and definition of these symbols can be found in “The Logic Theory}

Machine”, pp. 10 - 11.

from visually perceptible and implicit for human practitioners to explicitly represented by a symbolic alphabet is nontrivial. First, Newell, Shaw, and Simon understood reasoning (for humans and for computers) assymbol manipulation - as such, the addition of new symbols to the formalism of logical work literally changes the stuﬀ of reasoning. It adds pieces to the information system that is understood to be logical thought. It also diverges from the symbolic notation system in Principia, meant to enable a particular kind of reasoning.

More than that, the addition of this new information - these new symbols - begged an organizational question: how to keep these expressions-as-trees with all of the symbols that accompany each element organized such that the computer can keep track of them, manipulate them, and ultimately prove theorems with them? One of the primary diﬃculties of implementation in the 1950s was the extremely limited memory then available. Across its various storage devices, the JOHNNIAC didn’t even have

one tenth of the memory of a floppy disk from the 1980s.

The JOHNNIAC had two diﬀerent kinds of storage, both extremely limited in capacity according to today’s standards: the faster memory was the magnetic core memory. Here, JOHNNIAC would store propositions it was actually working on - generating subproblems from, and so on. The slower and more abundant memory was magnetic drum memory where it would store other relevant information like previously proven theorems.134 _{When the JOHNNIAC first became operational early in 1953, it}

had RCA selectron tubes for memory storage. However, these were replaced with magnetic core storage commissioned from the International Telemeter Corporation later in 1953 that was upgraded in 1955. Also in 1955, around the time that Newell

134_{This dichotomy of storage remains in modern computing technologies (in the form of RAM and}

hard drives respectively). This dichotomy was also reflected even in the high level Logic Language (IPL I). Newell and Simon specified that some memory structures would be “working memories” - in the magnetic core - and “storage memories” - in the magnetic drums. See Newell, Simon “The Logic Theory Machine”, p. 11.

and Simon began work on the Logic Theory Machine, a magnetic drum storage unit was added to the JOHNNIAC for additional storage.135 _{The magnetic core o}_ﬀ_ered

4096 words of memory (each consisting of 40 bits), and the magnetic drums provided 9 216 words.136 _{In total, the JOHNNIAC included 532 280 bits (roughly equivalent to}

65 kilobytes by modern measure) in which all Newell, Shaw, and Simon’s reformalized propositional calculus - all of its protocols, instructions, objects, and processes - would take up residence.137

Given the extremely limited storage of 1950s computers, practitioners were required to address many diﬃcult “memory management” problems in the process of implementation. How to keep track of available memory? How to organize relevant data in memory? How to reclaim memory once it’s no longer in use? Questions like these were (and in fact still are) central to the work of implementation. Newell, Shaw, and Simon needed a way to store and organize all of the relevant information for logic proof by backwards subproblem chaining in JOHNIAC’s limited capacity. Their solution to this question is at the heart of what interests me in this chapter: in order to keep all of these new symbols organized and usable, they devised a new information structure called alinked list.

Each of the eleven symbols that specified an element in a logical expression (a

135_{These details are provided in Willis Ware, “Johnniac Eulogy”, RAND document P-3313 (March}

1966), pp. 7 - 8.

136_{In his “Johnniac Eulogy,” Ware indicated that the magnetic drum could store 12 000 words.}

However, Newell and Shaw reported that 9 216 words were available to Logic Theory Machine on the magnetic drums. See Ware, p. 8 and Newell, Shaw “Programming the Logic Theory Machine”, p. 7.

137_{Magnetic core memory consisted of small rings of magnetic material woven together by electrical}

wires. Electrical current running through the wires served as a mechanism for setting the direction of the magnetic field produced in those rings in one of two possible directions. Those two directions of magnetic field served as the 1s and 0s of the binary logic that constitutes all digital computation. A more in depth exploration of the history and functioning of magnetic core memory is available in Paul Ceruzzi,A History of Modern Computing (Cambridge, MA: MIT Press, 1991 [2003]), esp. pp. 49 - 53. Magnetic drum memory operated according to similar mechanisms, except that drums stored data on the magnetic drum-surface using read-write heads that would direct discrete magnetic fields. A more in depth exploration of the history and functioning of magnetic drum memory is available in Ceruzzi,A History of Modern Computing, esp. pp. 38 - 44.

Figure 1.8: Diagram of a Linked List Information Structure. Newell, Shaw, “Program- ming the Logic Theory Machine”, Rand Document P-954 (February 1957), p. 13a. variable or a logical operator) would be stored in a contiguous set of bits of Johnniac’s memory. The last of those bits would be used to store the numericaladdressof the next element in the list that could be stored anywhere else in Johnniac’s memory. Here, the elements of a logical proposition were not concatenated in sequence together but were rather distributed throughout JOHNNIAC memory, held together by a virtual chain of address pointers. In this form, they could not be registered as a whole by sight or any other mechanism, but only by the process of traversing the pointers. In their article devoted to the actual implementation of the Logic Theory Machine on the JOHNNIAC, Newell and Shaw diagrammatically represent the list structure as seen in Figure 1.8.

Linked lists solved one particular memory management problem. They enabled the JOHNNIAC to make use of whatever memory happened to be available when transforming, manipulating, or storing logical expressions. Rather than finding chunks of contiguous memory in which to write a whole logical expression, or worse, to create that contiguous memory, the JOHNNIAC could store new information anywhere that there was space available and simply point to them with the appropriate address.

Imagine, for example, that you want to keep an updated guest list for a party. Instead of finding a new sheet of paper and writing the whole list out anew with one fewer name each time someone replies in the negative, it would be far less resource-

Figure 1.9: Diagram of Deleting an Element from a Linked List. Newell, Shaw, “Pro- gramming the Logic Theory Machine,” p. 14a.

consuming to simply cross a name oﬀ the list. So it was with linked lists, except that they were not lined up like on a page: if the JOHNNIAC wanted to remove an element from a logical proposition, it simply rerouted the address pointer from the previous element to the following element, as shown in Figure 1.9 rather than rewriting the whole expression without it, after finding available memory to do so. In this transition, the elementB was removed from a logical expression and all that was required was for the elementAto edit its address indicator to point to C instead. After that, the address of

B would be added to a list of available memory that JOHNNIAC kept track of during any run of the Logic Theory Machine program. Only two readdressing operations were required rather than all of the housekeeping computational work of finding or creating enough memory to rewrite the transformed expression as a whole.

The list structure helped both with the reuse of freed memory and with the project of keeping memory structures in tact over transformation because the elements in a

given list need not be spatially together in memory. That is, the actual magnetic bits within which the elements of a list are stored do not need to be neighbors in the magnetic core or drums. This is because every unit in a list keeps track of its successor’s address no matter where it might be.138

This feature of the list structure - that lists need not be physically located together -

In document After Math: (Re)configuring Minds, Proof, and Computing in the Postwar United States (Page 91-104)