FEYNMAN LECTURES ON COMPUTATION
RICHARD P. FEYNMAN
EDITED BY
ANTHONY J.G. HEY • ROBIN W. ALLEN
Department of Electronics and Computer Science, University of Southampton, England
Addison-Wesley Publishing Company, Inc. The Advanced Book Program
Reading, Massachusetts Menlo Park, California New York
Don Mills, Ontario Harlow, England Amsterdam Bonn
Sydney Singapore Tokyo Madrid San Juan
was aware of a trademark claim, the designations have been printed in initial capital letters.
Library of Congress Cataloging-in-Publication Data
Feynman, Richard Phillips.
Feynman lectures on computation / Richard P. Feynman ; edited by Anthony J.G. Hey and Robin W. Allen.
p. cm.
Includes bibliographical references and index.
ISBN 0-201-48991-0
1. Electronic data processing. I. Hey, Anthony J.G. II. Allen, Robin W. III. Title.
QA76.F45 1996
004'.01-dc20 96-25127 CIP
Copyright © 1996 by Carl R. Feynman and Michelle Feynman
Foreword and Afterword copyright © 1996 by Anthony J.G. Hey and Robin W. Allen
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America.
Jacket design by Lynne Reed
Text design and typesetting by Tony Hey
2 3 4 5 6 7 8 9 10-MA-0100999897
Second Printing, January 1997
Foreword
Preface (Richard Feynman)
1 Introduction to Computers
  1.1 The File Clerk Model
  1.2 Instruction Sets
  1.3 Summary
2 Computer Organization
  2.1 Gates and Combinational Logic
  2.2 The Binary Decoder
  2.3 More on Gates: Reversible Gates
  2.4 Complete Sets of Operators
  2.5 Flip-Flops and Computer Memory
  2.6 Timing and Shift Registers
3 The Theory of Computation
  3.1 Effective Procedures and Computability
  3.2 Finite State Machines
  3.3 The Limitations of Finite State Machines
  3.4 Turing Machines
  3.5 More on Turing Machines
  3.6 Universal Turing Machines and the Halting Problem
4 Coding and Information Theory
  4.1 Computing and Communication Theory
  4.2 Error Detecting and Correcting Codes
  4.3 Shannon's Theorem
  4.4 The Geometry of Message Space
  4.5 Data Compression and Information
  4.6 Information Theory
  4.7 Further Coding Techniques
  4.8 Analogue Signal Transmission
5 Reversible Computation and the Thermodynamics of Computing
  5.1 The Physics of Information
  5.2 Reversible Computation and the Thermodynamics of Computing
  5.3 Computation: Energy Cost versus Speed
  5.4 The General Reversible Computer
  5.5 The Billiard Ball Computer
  5.6 Quantum Computation
6 Quantum Mechanical Computers (Reprinted from Optics News, February 1985)
  6.1 Introduction
  6.2 Computation with a Reversible Machine
  6.3 A Quantum Mechanical Computer
  6.4 Imperfections and Irreversible Free Energy Loss
  6.5 Simplifying the Implementation
  6.6 Conclusions
7 Physical Aspects of Computation
  A Caveat from the Editors
  7.1 The Physics of Semiconductor Devices
  7.2 Energy Use and Heat Loss in Computers
  7.3 VLSI Circuit Construction
  7.4 Further Limitations on Machine Design
Afterword: Memories of Richard Feynman
Suggested Reading
Index
Since it is now some eight years since Feynman died I feel it necessary to explain the genesis of these 'Feynman Lectures on Computation'. In November 1987 I received a call from Helen Tuck, Feynman's secretary of many years, saying that Feynman wanted me to write up his lecture notes on computation for publication. Sixteen years earlier, as a post-doc at CalTech I had declined the opportunity to edit his 'Parton' lectures on the grounds that it would be a distraction from my research. I had often regretted this decision so I did not take much persuading to give it a try this time around. At CalTech that first time, I was a particle physicist, but ten years later, on a sabbatical visit to CalTech in 1981, I became interested in computational physics problems playing with variational approaches that (I later found out) were similar to techniques Feynman had used many years before. The stimulus of a CalTech colloquium on 'The Future of VLSI' by Carver Mead then began my move towards parallel computing and computer science.
Feynman had an interest in computing for many years, dating back to the Manhattan project and the modeling of the plutonium implosion bomb. In 'Los Alamos from Below', published in 'Surely You're Joking, Mr. Feynman!', Feynman recounts how he was put in charge of the 'IBM group' to calculate the energy release during implosion. Even in those days before the advent of the digital computer, Feynman and his team worked out ways to do bomb calculations in parallel. The official record at CalTech lists Feynman as joining with John Hopfield and Carver Mead in 1981 to give an interdisciplinary course entitled 'The Physics of Computation'. The course was given for two years and John Hopfield remembers that all three of them never managed to give the course together in the same year: one year Feynman was ill, and the second year Mead was on leave. A handout from the course of 1982/3 reveals the flavor of the course: a basic primer on computation, computability and information theory followed by a section entitled 'Limits on computation arising in the physical world and "fundamental" limits on computation'. The lectures that year were given by Feynman and Hopfield with guest lectures from experts such as Marvin Minsky, John Cocke and Charles Bennett. In the spring of 1983, through his connection with MIT and his son Carl, Feynman worked as a consultant for Danny Hillis at Thinking Machines, an ambitious, new parallel computer company.
In the fall of 1983, Feynman first gave a course on computing by himself, listed in the CalTech record as being called 'Potentialities and Limitations of Computing Machines'. In the years 1984/85 and 1985/86, the lectures were taped and it is from these tapes and Feynman's notebooks that these lecture notes have been reconstructed. In reply to Helen Tuck, I told her I was visiting CalTech in January of 1988 to talk at the 'Hypercube Conference'. This was a parallel computing conference that originated from the pioneering work at CalTech by Geoffrey Fox and Chuck Seitz on their 'Cosmic Cube' parallel computer. I talked with Feynman in January and he was very keen that his lectures on computation should see the light of day. I agreed to take on the project and returned to Southampton with an agreement to keep in touch. Alas, Feynman died not long after this meeting and we had no chance for a more detailed dialogue about the proposed content of his published lectures.
Helen Tuck had forwarded to me both a copy of the tapes and a copy of Feynman's notes for the course. It proved to be a lot of work to put his lectures in a form suitable for publication. Like the earlier course with Hopfield and Mead, there were several guest lecturers giving one or more lectures on topics ranging from the programming language 'Scheme' to physics applications on the 'Cosmic Cube'. I also discovered that several people had attempted the task before me! However, the basic core of Feynman's contribution to the course rapidly became clear - an introductory section on computers, followed by five sections exploring the limitations of computers arising from the structure of logic gates, from mathematical logic, from the unreliability of their components, from the thermodynamics of computing and from the physics of semiconductor technology. In a sixth section, Feynman discussed the limitations of computers due to quantum mechanics. His analysis of quantum mechanical computers was presented at a meeting in Anaheim in June of 1984 and subsequently published in the journal 'Optics News' in February 1985. These sections were followed by lectures by invited speakers on a wide range of 'advanced applications' of computers - robotics, AI, vision, parallel architectures and many other topics which varied from year to year.
As advertised, Feynman's lecture course set out to explore the limitations and potentialities of computers. Although the lectures were given some ten years ago, much of the material is relatively 'timeless' and represents a Feynmanesque overview of some standard topics in computer science. Taken as a whole, however, the course is unusual and genuinely interdisciplinary. Besides giving the 'Feynman treatment' to subjects such as computability, Turing machines (or as Feynman says, 'Mr. Turing's machines'), Shannon's theorem and information theory, Feynman also discusses reversible computation, thermodynamics and quantum computation. Such a wide-ranging discussion of the fundamental basis of computers is undoubtedly unique and a 'sideways', Feynman-type view of the whole of computing. This does not mean to say that all aspects of computing are discussed in these lectures and there are many omissions, programming languages and operating systems, to name but two. Nevertheless, the lectures do represent a summary of our knowledge of the truly fundamental limitations of digital computers. Feynman was not a professional computer scientist and he covers a large amount of material very rapidly, emphasizing the essentials rather than exploring details. Nevertheless, his approach to the subject is resolutely practical and this is underlined in his treatment of computability theory by his decision to approach the subject via a discussion of Turing machines. Feynman takes obvious pleasure in explaining how something apparently so simple as a Turing machine can arrive at such momentous conclusions. His philosophy of learning and discovery also comes through strongly in these lectures. Feynman constantly emphasizes the importance of working things out for yourself, trying things out and playing around before looking in the book to see how the 'experts' have done things. The lectures provide a unique insight into Feynman's way of working.
I have used editorial license here and there in ways I should now explain. In some places there are footnotes labeled 'RPF' which are asides that Feynman gave in the lecture that in a text are best relegated to a footnote. Other footnotes are labeled 'Editors', referring to comments inserted by me and my co-editor Robin Allen. I have also changed Feynman's notation in a few places to conform to current practice, for example, in his representation of MOS transistors.
Feynman did not learn subjects in a conventional way. Typically, a colleague would tell him something that interested him and he would go off and work out the details for himself. Sometimes, by this process of working things out for himself, Feynman was able to shed new light on a subject. His analysis of quantum computation is a case in point but it also illustrates the drawback of this method for others. In the paper on quantum computation there is a footnote after the references that is typically Feynman. It says: 'I would like to thank T. Toffoli for his help with the references'. With his unique insight and clarity of thinking Feynman was often able not only to make some real progress but also to clarify the basis of the whole problem. As a result Feynman's paper on quantum computation is widely quoted to the exclusion of other lesser mortals who had made important contributions along the way. In this case, Charles Bennett is referred to frequently, since Feynman first heard about the problem from Bennett, but other pioneers such as Rolf Landauer and Paul Benioff are omitted. Since I firmly believe that Feynman had no wish to take credit from
and refer the reader, in a footnote, to more complete histories of the subject. The plain truth was that Feynman was not interested in the history of a subject but only the actual problem to be solved!
I have exercised my editorial prerogative in one other way, namely in omitting a few lectures on topics that had become dated or superseded since the mid 1980's. However, in order to give a more accurate impression of the course, there will be a companion volume to these lectures which contains articles on 'advanced topics' written by the self-same 'experts' who participated in these courses at CalTech. This complementary volume will address the advances made over the past ten years and will provide a fitting memorial to Feynman's explorations of computers.
There are many acknowledgements necessary in the successful completion of a project such as this. Not least I should thank Sandy Frey and Eric Mjolness, who both tried to bring some order to these notes before me. I am grateful to Geoffrey Fox, for trying to track down students who had taken the courses, and to Rod van Meter and Takako Matoba for sending copies of their notes. I would also like to thank Gerry Sussman, and to place on record my gratitude to the late Jan van de Sneepscheut, for their initial encouragement to me to undertake this task. Gerry had been at CalTech, on leave from MIT, when Feynman decided to go it alone, and he assisted Feynman in planning the course.
I have tried to ensure that all errors of (my) understanding have been eliminated from the final version of these lectures. In this task I have been helped by many individuals. Rolf Landauer kindly read and improved Chapter 5 on reversible computation and thermodynamics and guided me patiently through the history of the subject. Steve Furber, designer of the ARM RISC processor and now a professor at the University of Manchester, read and commented in detail on Chapter 7 on VLSI - a topic of which I have little first-hand knowledge. Several colleagues of mine at Southampton also helped me greatly with the text: Adrian Pickering and Ed Zaluska on Chapters 1 and 2; Andy Gravell on Chapter 3; Lajos Hanzo on Chapter 4; Chris Anthony on Chapter 5; and Peter Ashburn, John Hamel, Greg Parker and Ed Zaluska on Chapter 7. David Barron, Nick Barron and Mike Quinn, at Southampton, and Tom Knight at MIT, were kind enough to read through the entire manuscript and, thanks to their comments, many errors and obscurities have been removed. Needless to say, I take full responsibility for any remaining errors or confusions! I must also thank Bob Churchhouse of Cardiff University for information on Baconian ciphers, Bob Nesbitt of Southampton University for enlightening me about the geologist William Smith, and James Davenport of Bath University for help on references pertaining to the algorithmic solution of integrals. I am also grateful to the Optical Society of America for permission to reproduce, in slightly modified form, Feynman's classic 1985 'Optics News' paper on Quantum Mechanical Computing as Chapter 6 of these lectures.
After Feynman died, I was greatly assisted by his wife Gweneth and a Feynman family friend, Dudley Wright, who supported me in several ways, not least by helping pay for the lecture tapes to be transcribed. I must also pay tribute to my co-editor, Robin Allen, who helped me restart the project after the long legal wrangling about ownership of the Feynman archive had been decided, and without whom this project would never have seen the light of day. Gratitude is also due to Michelle Feynman, and to Carl Feynman and his wife Paula, who have constantly supported this project through the long years of legal stalemate and who have offered me every help. A word of thanks is due to Allan Wylde, then Director of the Advanced Book Program at Addison-Wesley, who showed great faith in the project in its early stages. Latterly, Jeff Robbins and Heather Mimnaugh at Addison-Wesley Advanced Books have shown exemplary patience with the inevitable delays and my irritating persistence with seemingly unimportant details. Lastly, I must record my gratitude to Helen Tuck for her faith in me and her conviction that I would finish the job - a belief I have not always shared! I hope she likes the result.
Tony Hey
Electronics and Computer Science Department
University of Southampton
England
May 1996
When I produced the Lectures on Physics, some thirty years ago now, I saw them as an aid to students who were intending to go into physics. I also lamented the difficulties of cramming several hundred years' worth of science into just three volumes. With these Lectures on Computation, matters are somewhat easier, but only just. Firstly, the lectures are not aimed solely at students in computer science, which liberates me from the shackles of exam syllabuses and allows me to cover areas of the subject for no more reason than that they are interesting. Secondly, computer science is not as old as physics; it lags by a couple of hundred years. However, this does not mean that there is significantly less on the computer scientist's plate than on the physicist's: younger it may be, but it has had a far more intense upbringing! So there is still plenty for us to cover.
Computer science also differs from physics in that it is not actually a science. It does not study natural objects. Neither is it, as you might think, mathematics; although it does use mathematical reasoning pretty extensively. Rather, computer science is like engineering - it is all about getting something to do something, rather than just dealing with abstractions as in pre-Smith geology¹. Today in computer science we also need to "go down into the mines" - later we can generalize. It does no harm to look at details first.
But this is not to say that computer science is all practical, down to earth bridge-building. Far from it. Computer science touches on a variety of deep issues. It has illuminated the nature of language, which we thought we understood: early attempts at machine translation failed because the old-fashioned notions about grammar failed to capture all the essentials of language. It naturally encourages us to ask questions about the limits of computability, about what we can and cannot know about the world around us. Computer science people spend a lot of their time talking about whether or not man is merely a machine, whether his brain is just a powerful computer that might one day be copied; and the field of 'artificial intelligence' - I prefer the term 'advanced applications' - might have a lot to say about the nature of 'real' intelligence, and mind. Of course, we might get useful ideas from studying how the brain works, but we must remember that automobiles do not have legs like cheetahs nor do airplanes flap their wings! We do not need to study the neurologic minutiae of living things to produce useful technologies; but even wrong theories may help in designing machines. Anyway, you can see that computer science has more than just technical interest.

¹ William Smith was the father of modern geology; in his work as a canal and mining engineer he observed the systematic layering of the rocks, and recognized the significance of fossils as a means of determining the age of the strata in which they occur. Thus was he led to formulate the superposition principle, in which rocks are successively laid down upon older layers. Prior to Smith ...
These lectures are about what we can and can't do with machines today, and why. I have attempted to deliver them in a spirit that should be recommended to all students embarking on the writing of their PhD theses: imagine that you are explaining your ideas to your former smart, but ignorant, self, at the beginning of your studies! In very broad outline, after a brief introduction to some of the fundamental ideas, the next five chapters explore the limitations of computers - from logic gates to quantum mechanics! The second part consists of lectures by invited experts on what I've called advanced applications - vision, robots, expert systems, chess machines and so on².
² A companion volume to these lectures is in preparation. As far as is possible, this second volume will contain articles on 'advanced applications' by the same experts who contributed to Feynman's course but updated to reflect the present state of the art. [Editors]
INTRODUCTION TO COMPUTERS
Computers can do lots of things. They can add millions of numbers in the twinkling of an eye. They can outwit chess grandmasters. They can guide weapons to their targets. They can book you onto a plane between a guitar-strumming nun and a non-smoking physics professor. Some can even play the bongoes. That's quite a variety! So if we're going to talk about computers, we'd better decide right now which of them we're going to look at, and how.
In fact, we're not going to spend much of our time looking at individual machines. The reason for this is that once you get down to the guts of computers you find that, like people, they tend to be more or less alike. They can differ in their functions, and in the nature of their inputs and outputs - one can produce music, another a picture, while one can be set running from a keyboard, another by the torque from the wheels of an automobile - but at heart they are very similar. We will hence dwell only on their innards. Furthermore, we will not assume anything about their specific Input/Output (I/O) structure, about how information gets into and out of the machine; all we care is that, however the input gets in, it is in digital form, and whatever happens to the output, the last the innards see of it, it's digital too; by digital, I mean binary numbers: 1's and 0's.
What does the inside of a computer look like? Crudely, it will be built out of a set of simple, basic elements. These elements are nothing special - they could be control valves, for example, or beads on an abacus wire - and there are many possible choices for the basic set. All that matters is that they can be used to build everything we want. How are they arranged? Again, there will be many possible choices; the relevant structure is likely to be determined by considerations such as speed, energy dissipation, aesthetics and what have you. Viewed this way, the variety in computers is a bit like the variety in houses: a Beverly Hills condo might seem entirely different from a garage in Yonkers, but both are built from the same things - bricks, mortar, wood, sweat - only the condo has more of them, and arranged differently according to the needs of the owners. At heart they are very similar.
Let us get a little abstract for the moment and ask: how do you connect up which set of elements to do the most things? It's a deep question. The answer again is that, up to a point, it doesn't matter. Once you have a computer that can do a few things - strictly speaking, one that has a certain "sufficient set" of basic procedures - it can do basically anything any other computer can do. This, loosely, is the basis of the great principle of "Universality". Whoa! You cry. My pocket calculator can't simulate the red spot on Jupiter like a bank of Cray supercomputers! Well, yes it can: it would need rewiring, and we would need to soup up its memory, and it would be damned slow, but if it had long enough it could reproduce anything the Crays do. Generally, suppose we have two computers A and B, and we know all about A - the way it works, its "state transition rules" and what-not. Assume that machine B is capable of merely describing the state of A. We can then use B to simulate the running of A by describing its successive transitions; B will, in other words, be mimicking A. It could take an eternity to do this if B is very crude and A very sophisticated, but B will be able to do whatever A can, eventually. We will prove this later in the course by designing such a B computer, known as a Turing machine.
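The idea of B mimicking A by describing its successive transitions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the machine A chosen here (a parity checker), the table `PARITY_RULES` and the function `simulate` are all hypothetical names invented for the sketch, not anything taken from the lectures.

```python
# Hypothetical machine A: a parity checker, known to us only through its
# "state transition rules": (current state, input symbol) -> next state.
PARITY_RULES = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}


def simulate(rules, start_state, inputs):
    """Play the part of machine B: record A's state and apply A's rules
    one transition at a time, returning the list of states A passes through."""
    state = start_state
    history = [state]
    for symbol in inputs:
        state = rules[(state, symbol)]  # describe A's next transition
        history.append(state)
    return history


if __name__ == "__main__":
    # B needs no parity circuitry of its own; it only looks things up.
    print(simulate(PARITY_RULES, "even", "1101"))
```

Nothing in `simulate` depends on what A actually computes: hand it a different table and it mimics a different machine, which is the point of the argument, at whatever cost in speed.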
Let us look at universality another way. Language provides a useful source of analogy. Let me ask you this: which is the best language for describing something? Say: a four-wheeled gas-driven vehicle. Of course, most languages, at least in the West, have a simple word for this; we have "automobile", the English say "car", the French "voiture", and so on. However, there will be some languages which have not evolved a word for "automobile", and speakers of such tongues would have to invent some, possibly long and complex, description for what they see, in terms of their basic linguistic elements. Yet none of these descriptions is inherently "better" than any of the others: they all do their job, and will only differ in efficiency. We needn't introduce democracy just at the level of words. We can go down to the level of alphabets. What, for example, is the best alphabet for English? That is, why stick with our usual 26 letters? Everything we can do with these, we can do with three symbols - the Morse code, dot, dash and space; or two - a Baconian Cipher, with A through Z represented by five-digit binary numbers. So we see that we can choose our basic set of elements with a lot of freedom, and all this choice really affects is the efficiency of our language, and hence the sizes of our books: there is no "best" language or alphabet - each is logically universal, and each can model any other. Going back to computing, universality in fact states that the set of complex tasks that can be performed using a "sufficient" set of basic procedures is independent of the specific, detailed structure of the basic set.
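The two-symbol alphabet can be made concrete with a short Python sketch. It assumes the simplest possible assignment, A = 00000 up to Z = 11001 (the letter's position written as a five-digit binary number); Bacon's historical cipher used a slightly different table, and the function names `bacon_encode` and `bacon_decode` are made up for this illustration.

```python
import string


def bacon_encode(text):
    """Write each letter A-Z as a five-digit binary number (A=00000 ... Z=11001).
    Anything that is not a letter is dropped in this simplified sketch."""
    groups = []
    for ch in text.upper():
        if ch in string.ascii_uppercase:
            groups.append(format(ord(ch) - ord("A"), "05b"))
    return " ".join(groups)


def bacon_decode(bits):
    """Invert bacon_encode: turn space-separated five-digit groups back into letters."""
    return "".join(chr(int(group, 2) + ord("A")) for group in bits.split())


if __name__ == "__main__":
    coded = bacon_encode("car")
    print(coded)
    print(bacon_decode(coded))
```

The message survives the round trip unchanged; all that differs is its length, five symbols per letter instead of one, which is the "efficiency" cost the text refers to.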
For today's computers to perform a complex task, we need a precise and complete description of how to do that task in terms of a sequence of simple basic procedures - the "software" - and we need a machine to carry out these procedures in a specifiable order - this is the "hardware". This instructing has to be exact and unambiguous. In life, of course, we never tell each other exactly what we want to say; we never need to, as context, body language, familiarity with the speaker, and so on, enable us to "fill in the gaps" and resolve any ambiguities in what is said. Computers, however, can't yet "catch on" to what is being said, the way a person does. They need to be told in excruciating detail exactly what to do. Perhaps one day we will have machines that can cope with approximate task descriptions, but in the meantime we have to be very prissy about how we tell computers to do things.
Let us examine how we might build complex instructions from a set of rudimentary elements. Obviously, if an instruction set B (say) is very simple, then a complex process is going to take an awful lot of description, and the resulting "programs" will be very long and complicated. We may, for instance, want our computer to carry out all manner of numerical calculations, but find ourselves with a set B which doesn't include multiplication as a distinct operation. If we tell our machine to multiply 3 by 35, it says "what?" But suppose B does have addition; if you think about it, you'll see that we can get it to multiply by adding lots of times - in this case, add 35 to itself twice. However, it will clearly simplify the writing of B-programs if we augment the set B with a separate "multiply" instruction, defined by the chunk of basic B instructions that go to make up multiplication. Then when we want to multiply two numbers, we say "computer, 3 times 35", and it now recognizes the word "times" - it is just a lot of adding, which it goes off and does. The machine breaks these compound instructions down into their basic components, saving us from getting bogged down in low level concepts all the time. Complex procedures are thus built up stage by stage. A very similar process takes place in everyday life; one replaces with one word a set of ideas and the connections between them. In referring to these ideas and their interconnections we can then use just a single word, and avoid having to go back and work through all the lower level concepts. Computers are such complicated objects that simplifying ideas like this are usually necessary, and good design is essential if you want to avoid getting completely lost in details.
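As a rough sketch of how a compound "multiply" instruction might be defined from addition alone, consider the following Python. The functions `add` and `multiply` are illustrative stand-ins for instructions in the set B, and the sketch assumes the multiplier is a non-negative whole number.

```python
def add(a, b):
    """Stand-in for the one arithmetic instruction the basic set B is assumed to have."""
    return a + b


def multiply(times, value):
    """Compound instruction: 'times' copies of 'value' accumulated by repeated addition.
    Assumes 'times' is a non-negative integer."""
    total = 0
    for _ in range(times):
        total = add(total, value)  # only the basic instruction is ever used
    return total


if __name__ == "__main__":
    # "computer, 3 times 35"
    print(multiply(3, 35))
```

Once `multiply` exists, a program can use it without spelling out the additions each time, and a further level (a power function, say) could be built on `multiply` in exactly the same way.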
We shall begin by constructing a set of primitive procedures, and examine how to perform operations such as adding two numbers or transferring two numbers from one memory store to another. We will then go up a level, to the next order of complexity, and use these instructions to produce operations like multiply and so on. We shall not go very far in this hierarchy. If you want to see how far you can go, the article on Operating Systems by P.J. Denning and R.L. Brown (Scientific American, September 1984, pp. 96-104) identifies thirteen levels! This goes from level 1, that of electronic circuitry - registers, gates, buses - to number 13, the Operating System Shell, which manipulates the user programming environment. By a hierarchical compounding of instructions, basic transfers of 1's and 0's on level one are transformed, by the time we get to thirteen, into commands to land aircraft in a simulation or check whether a forty digit number is prime. We will jump into this hierarchy at a fairly low level, but one from which we can go up or down.
Also, our discussion will be restricted to computers with the so-called "Von Neumann architecture". Don't be put off by the word "architecture"; it's just a big word for how we arrange things, only we're arranging electronic components rather than bricks and columns. Von Neumann was a famous mathematician who, besides making important contributions to the foundations of quantum mechanics, was also the first to set out clearly the basic principles of modern computers¹. We will also have occasion to examine the behavior of several computers working on the same problem, and when we do, we will restrict ourselves to computers that work in sequence, rather than in parallel; that is, ones that take turns to solve parts of a problem rather than work simultaneously. All we would lose by the omission of "parallel processing" is speed, nothing fundamental.
We talked earlier about computer science not being a real science. Now we have to disown the word "computer" too! You see, "computer" makes us think of arithmetic - add, subtract, multiply, and so on - and it's easy to assume that this is all a computer does. In fact, conventional computers typically have one place where they do their basic math, and the rest of the machine is for the computer's main task, which is shuffling bits of paper around - only in this case the paper notes are digital electrical signals. In many ways, a computer is reminiscent of a bureaucracy of file clerks, dashing back and forth to their filing cabinets, taking files out and putting them back, scribbling on bits of paper, passing notes to one another, and so on; and this metaphor, of a clerk shuffling paper around in an office, will be a good place to start to get some of the basic ideas of computer structure across. We will go into this in some detail, and the impatient among you might think too much detail, but it is a perfect model for communicating the essentials of what a computer does, and is hence worth spending some time on.
¹ Actually, there is currently a lot of interest in designing "non-Von Neumann" machines. These
1.1: The File Clerk Model
Let's suppose we have a big company, employing a lot of salesmen. An awful lot of information about these salesmen is stored in a big filing system somewhere, and this is all administered by a clerk. We begin with the idea that the clerk knows how to get the information out of the filing system. The data is stored on cards, and each card has the name of the salesman, his location, the number and type of sales he has made, his salary, and so on and so forth.
Salesman:
Sales:
Salary:
Location:
Now suppose we are after the answer to a specific question: "What are the total sales in California?" Pretty dull and simple, and that's why I chose it: you must start with simple questions in order to understand difficult ones later. So how does our file clerk find the total sales in California? Here's one way he could do it:
Take out a card
If the "location" says California, then
    Add the number under "sales" to a running count called "total"
Put "sales" card back
Take next card and repeat.
Obviously you have to keep this up until you've gone through all the cards. Now let's suppose we've been unfortunate enough to hire particularly stupid clerks, who can read, but for whom the above instructions assume too much: say, they don't know how to keep a running count. We need to help them a little bit more. Let us invent a "total" card for our clerk to use. He will use this to keep a running total in the following way:
Take out next "sales" card
If California, then
    Take out "total" card
    Add sales number to number on card
    Put "total" card back
Put "sales" card back
Take out next "sales" card and repeat.
This is a very mechanical rendering of how a crude computer could solve this adding problem. Obviously, the data would not be stored on cards, and the machine wouldn't have to "take out a card" - it would read the stored information from a register. It could also write from a register to a "card" without physically putting something back.
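The clerk's procedure can be sketched in a few lines. This is only an illustration under assumptions: the card contents, the salesmen's names and the function name `total_sales` are all invented here, and a Python dictionary stands in for a card.

```python
# Hypothetical filing system: one dictionary per "sales" card.
cards = [
    {"salesman": "Adams", "location": "California", "sales": 120},
    {"salesman": "Baker", "location": "Nevada", "sales": 75},
    {"salesman": "Clark", "location": "California", "sales": 60},
]


def total_sales(cards, location):
    """Work through every card once, keeping the running count on a 'total' card."""
    total_card = 0                       # the "total" card starts at zero
    for card in cards:                   # take out next "sales" card
        if card["location"] == location:
            # take out "total" card, add the sales number, put it back
            total_card = total_card + card["sales"]
        # put "sales" card back (nothing to do: reading does not remove it)
    return total_card


print(total_sales(cards, "California"))
```

As the text notes, the machine never has to "put a card back": reading the stored information leaves it where it was.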
Now we're going to stretch our clerk! Let's assume that each salesman receives not only a basic salary from the company, but also gets a little on commission from sales. To find out how much, we multiply his sales by the appropriate percentage. We want our clerk to allow for this. Now he is cheap and fast, but unfortunately too dumb to multiply². If we tell him to multiply 5 by 7 he says "what?" So we have to teach him to multiply. To do this, we will exploit the fact that there is one thing he does well: he can get cards very, very quickly.
We'll work in base two. As you all probably know, the rules for binary arithmetic are easier than those for base ten; the multiplication table is so small it will easily fit on one card. We will assume that even our clerk can remember these; all he needs are "shift" and "carry" operations, as the following example makes clear:
In decimal: 22 x 5 = 110
In binary:
      10110
    x   101
    -------
      10110
    10110      (shift twice)
    -------
    1101110
Reading the binary result back in decimal gives 64 + 32 + 8 + 4 + 2 = 110, as it should.
² As an aside, although our dense file clerk is assumed in these examples to be a man, no sexist implications are intended! [RPF]
So as long as our clerk can shift and carry he can, in effect, multiply. He does it very stupidly, but he also does it very quickly, and that's the point of all this: the inside of a computer is as dumb as hell but it goes like mad! It can perform very many millions of simple operations a second and is just like a very fast dumb file clerk. It is only because it is able to do things so fast that we do not notice that it is doing things very stupidly. (Interestingly enough, neurons in the brain characteristically take milliseconds to perform elementary operations, which leaves us with the puzzle of why the brain is so smart. Computers may be able to leave brains standing when it comes to multiplication, but they have trouble with things even small children find simple, like recognizing people or manipulating objects.)
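The shift-and-add procedure can be sketched as follows. This is one possible rendering, not the clerk's exact routine: the function name is made up, and for brevity it inspects the multiplier by shifting it to the right, which is a convenience of this sketch.

```python
def shift_and_add(multiplicand, multiplier):
    """Multiply two non-negative integers using only shifts and adds."""
    product = 0
    while multiplier:
        if multiplier & 1:            # lowest binary digit of the multiplier is 1
            product += multiplicand   # add the (shifted) multiplicand
        multiplicand <<= 1            # shift one place to the left
        multiplier >>= 1              # move on to the next digit
    return product


result = shift_and_add(0b10110, 0b101)   # 22 x 5
print(result, bin(result))
```

For 22 x 5 the loop adds 10110 once unshifted and once shifted twice, exactly the two rows of the worked example.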
To go further, we need to specify more precisely our basic set of operations. One of the most elementary is the business of transferring information from the cards our clerk reads to some sort of scratch pad on which he can do his arithmetic:
Transfer operations
"Take Card X" = Information on card X written to pad
"Replace Card Y" = Information on pad written on card Y
All we have done is to define the instruction "take card X" to mean copying the information on card X onto the pad, and similarly with "replace card Y". Next, we want to be able to instruct the clerk to check if the location on card X was "California". He has to do this for each card, so the first thing he has to do is be able to remember "California" from one card to the next. One way to help him do this is to have California written on yet another card C so that his instructions are now:
Take card X (from store to pad)
Take card C (from store to pad)
Compare what is on card X with what is on card C.
We then tell him that if the contents match, do so and so, and if they don't, put the cards back and take the next ones. Keeping on taking out and putting back the California card seems to be a bit inefficient, and indeed, you don't have to do that; you can keep it on the pad for a while instead. This would be better, but it all depends on how much room the clerk has on his pad and how many pieces of information he needs to keep. If there isn't much room, then there will have to be a lot of shuffling cards back in and out. We have to worry about such things!
We can keep on breaking the clerk's complex tasks down into simpler, more fundamental ones. How, for example, do we get him to look at the "location" part of a card from the store? One way would be to burden the poor guy with yet another card, on which is written something like this:
0000 0000 0000 0000 0000 1111 0000 0000 0000 0000 ...
Each sequence of digits is associated with a particular piece of information on the card: the first set of zeroes is "lined up" with the salesman's name, the next with his age, say, and so on. The clerk zips through this numeric list until he hits a set of 1's, and then reads the information next to them. In our case, the 1111 is lined up with California. This sort of location procedure is actually used in computers, where you might use a so-called "bitwise AND" operation (we'll discuss this later). This little diversion was just to impress upon you the fact that we need not take any of our clerk's skills for granted - we can get him to do things increasingly stupidly.
1.2: Instruction Sets
Let's take a look at the clerk's scratch pad. We haven't yet taught the clerk how to use this, so we'll do that now. We will assume that we can break down the instructions he can carry out into two groups. Firstly, there is a core "instruction set" of simple procedures that comes with the pad - add, transfer, etc. These are in the hardware: they do not change when we change the problem. If you like, they reflect the clerk's basic abilities. Then we have a set which is specific to the task, say calculating a salesman's commission. The elements of this set are built out of the instructions in the core set in ways we have discussed, and represent the combinations of the clerk's talents that will be required for him to carry out the task at hand.
The first thing we need to get the clerk to do is do things in the right order, that is, to follow a succession of instructions. We do this by designating one of the storage areas on the pad as a "program counter". This will have a number on it which indicates whereabouts in the calculational procedure the clerk is. As far as the clerk is concerned, the number is an address - he knows that buried in the filing system is a special "instruction file" cabinet, and the number in the counter labels a card in that file which he has to go and get; on the card is the instruction as to what he is to do next. So he gets the instruction and stores it on his pad in an area which we call the "instruction register".
[Figure: the Program Counter holds an Address in the instruction File; the card at that address carries the Instruction.]
Before he carries out the instruction, however, he prepares for the next one by incrementing the program counter; he does this simply by adding one to it. Then he does whatever the instruction in the register tells him to do. Using a bracketed notation where ( ) means "contents of" - remember this, as we will be using it a lot - we can write this sequence of actions as follows³:
Fetch instruction from address PC
PC ← (PC) + 1
Do instruction
The second line is a fancy way of saying that the counter PC "gets" the new value (PC) + 1. The clerk will also need some temporary storage areas on the pad, to enable him to do arithmetic, for example. These are called registers, and give him a place to store something while he goes and finds some other number. Even if you are only adding two numbers you need to remember the first until you have fetched the second! Everything must be done in sequence and the registers allow us to organize things. They usually have names; in our case we will have four, which we call registers A, B and X, and the fourth, C, which is special - it can only store one bit of data, and we will refer to it as the "carry" register. We could have more or fewer registers - generally, the more you have, the easier a program is to write - but four will suffice for our purposes.
³ The conventions adopted for such "Register Transfer Language" vary according to the whim of the author. We choose to follow the so-called "right to left" convention utilized in standard programming languages. [Editors]
So our clerk knows how to find out what he has to do, and when. Let's now look at the core instruction set for his pad. The first kind of instruction concerns the transfer of data from one card to another. For example, suppose we have a memory location M on the pad. We want to have an instruction that transfers the contents of register A into M:
Transfer (A) into M, or M ← (A)
Similarly, we might want to go the other way, and write the contents of M into A:
Transfer (M) into A, or A ← (M)
M, incidentally, is not necessarily designed for temporary storage like A. We must also have analogous instructions for register B:
Transfer (B) to M, or M ← (B)
Transfer (M) to B, or B ← (M)
Register X we will use a little differently. We shall allow transfers from B to X and X to B:
X ← (B) and B ← (X).
In addition, we need to be able to keep tabs on, and manipulate, our program counter PC. This is obviously necessary: if the clerk shoots off to execute some multiplication, say, when he comes back he has to know what to do next - he has to remember the number in PC. In fact, we'll keep it in register X. Thus we add the transfer instructions:
PC ← (X) and X ← (PC).
Next, we need arithmetical and logical operations. The most basic of these is a "clear" instruction:
Clear A, or A ← 0.
This means, whatever is in A, forget it, wipe it out. Then we need an Add operation:
Add B to A, or A ← (A) + (B)
This means that register A receives the sum of the contents of B and the previous contents of A. We also have a shift operation, which will enable us to do multiplication without having to introduce a core instruction for it:
Shiftleft A and Shiftright A
The first merely moves all the bits in A one place to the left. If this shift causes the leftmost bit to overflow we store it in the carry register C. We can also shift our number to the right; I have no use for this in mind, but it could come in handy!
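A small sketch of the two shifts, under one loudly stated assumption: the text never says how many bits register A holds, so the 8-bit `WIDTH` below is invented for illustration, as are the function names.

```python
WIDTH = 8                      # assumed register width; the text does not fix one
MASK = (1 << WIDTH) - 1        # keeps a value within WIDTH bits


def shift_left(a):
    """Return (new A, carry C): the bit pushed out on the left lands in C."""
    carry = (a >> (WIDTH - 1)) & 1
    return (a << 1) & MASK, carry


def shift_right(a):
    """Move every bit of A one place to the right, dropping the rightmost bit."""
    return a >> 1


print(shift_left(0b00010110))   # no overflow, so the carry is 0
print(shift_left(0b10000001))   # leftmost 1 overflows into the carry
```

Whether the carry register is also cleared when nothing overflows is a design choice; this sketch sets C to 0 in that case.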
The next instructions are logical ones. We will be looking at these in greater detail in the next chapter, but I will mention them here for completeness. There are three that will interest us: AND, OR and XOR. Each is a function of two digital "inputs" x and y. If both inputs are 1, then AND gives you 1; otherwise it gives you zero. As we will see, the AND operation turns up in binary addition, and hence multiplication; if we view x and y as two digits we are adding, then (x AND y) is the carry bit: it's only one if both digits are one. In terms of our registers, x and y are (A) and (B), and AND operates on these:
AND: A ← (A) ∧ (B),
where we have used the logical symbol ∧ for the AND operation. The result of acting on a pair of variables with an operator such as AND is often summarized in a "truth table" (Table 1.1):
A  B  X
0  0  0
0  1  0
1  0  0
1  1  1
X = A ∧ B
Table 1.1 The Truth Table for the AND Operator
Our other two operators can be described in similar terms. The OR also operates on (A) and (B); it gives a one unless both (A) and (B) are zero - (x OR y) is one if x or y is one. XOR, or the "exclusive or", is similar to OR, except it gives zero if both (A) and (B) are one; in the binary addition of x and y, it corresponds to what you get if you add x to y and ignore any carry bits. A binary add of 1 and 1 is 10, which is zero if you forget the carry. We can introduce the relevant logical symbols:
OR: A ← (A) ∨ (B)
XOR: A ← (A) ⊕ (B)
The actions of OR and XOR can also be summarized by truth tables:
OR: X = A ∨ B        XOR: X = A ⊕ B
A  B  X              A  B  X
0  0  0              0  0  0
0  1  1              0  1  1
1  0  1              1  0  1
1  1  1              1  1  0
Table 1.2 The Truth Tables for the OR and XOR Operators
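The three truth tables, and the remark that AND gives the carry bit while XOR gives the sum digit, can be checked with a short sketch. The helper names `truth_table` and `half_add` are hypothetical; Python's `&`, `|` and `^` operators stand in for ∧, ∨ and ⊕.

```python
def AND(a, b):
    return a & b


def OR(a, b):
    return a | b


def XOR(a, b):
    return a ^ b


def truth_table(op):
    """Return the rows (A, B, X) of the truth table for a two-input operator."""
    return [(a, b, op(a, b)) for a in (0, 1) for b in (0, 1)]


def half_add(x, y):
    """Add two binary digits: XOR is the sum digit, AND is the carry bit."""
    return XOR(x, y), AND(x, y)


for name, op in (("AND", AND), ("OR", OR), ("XOR", XOR)):
    print(name, truth_table(op))
print(half_add(1, 1))   # sum digit 0, carry 1: binary 10
```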
Two more operations that it turns out are convenient to have are the instructions to increment or decrement the contents of A by one:
Increment A, or A ← (A) + 1
Decrement A, or A ← (A) - 1
Obviously, one can go on adding instructions that may or may not turn out to be very convenient. Here, we already have more than the minimum number necessary to be able to do some useful calculations. However, we want to be able to do as much as possible, so we can bring in other instructions. One other that will be useful is one that allows us to put a data item directly into a register. For example, rather than writing California on a card and then transferring from card to pad, it would be convenient to be able to write
Direct Load: B ← N,
where N is any constant.
There is one class of instructions that it is vital we add: that of branches, or jumps. A "jump to Z" is basically an instruction for the clerk to look in (instruction) location Z; that is, it involves a change in the value of the program counter by more than the usual increment of one. This enables our clerk to leap from one part of a program to another. There are two kinds of jumps, "unconditional" and "conditional". The unconditional jump we have touched on above:
Jump to (Z), or PC ← (Z)
The really new thing is the conditional jump:
Jump to (Z) if C = 1
With this instruction, the jump to location (Z) is only made if the carry register C contains a carry bit. The freedom given by this conditional instruction will be vital to the whole design of any interesting machines.
There are many other kinds of jump we can add. Sometimes it turns out to be convenient to be able to jump not only to a definite location but to one a specific number of steps further on in the program. We can therefore introduce jump instructions that add this number of steps to the program counter:
Jump to (PC) + (Z), or PC ← (PC) + (Z)
Jump to (PC) + (Z) if C = 1
Finally, there is one more command that we need; namely, an instruction that tells our clerk to quit:
Halt.
With these instructions, we can now do anything we want and I will suggest some problems for you to practice on below. Before we do that, let us summarize where we are and what we're trying to do. The idea has been to
outline the basic computer operations and methods and indicate what is actually in a computer (I haven't been describing an actual design, but I've come close). In a simple computer there are only a few registers; more complex ones have more registers, but the concepts are basically the same, just scaled up a bit.
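To see the pieces working together, here is a toy interpreter for a handful of the instructions above. It is a sketch under several assumptions that the text does not make: registers are 8 bits wide, only the shift changes the carry register C, a jump operand is the target address itself, and the instruction names (`LOAD_A`, `SHL_A`, `JUMP_C` and so on) are invented spellings.

```python
WIDTH = 8                       # assumed register width
MASK = (1 << WIDTH) - 1


def run(program, memory, max_steps=1000):
    """Fetch, increment PC, do; repeat until Halt. Returns (registers, memory)."""
    reg = {"A": 0, "B": 0, "X": 0, "C": 0, "PC": 0}
    for _ in range(max_steps):
        op, arg = program[reg["PC"]]     # fetch instruction from address PC
        reg["PC"] += 1                   # PC <- (PC) + 1
        if op == "HALT":
            return reg, memory
        elif op == "LOAD_A":             # A <- (M)
            reg["A"] = memory[arg]
        elif op == "STORE_A":            # M <- (A)
            memory[arg] = reg["A"]
        elif op == "INC_A":              # A <- (A) + 1
            reg["A"] = (reg["A"] + 1) & MASK
        elif op == "SHL_A":              # Shiftleft A, overflow goes to C
            reg["C"] = (reg["A"] >> (WIDTH - 1)) & 1
            reg["A"] = (reg["A"] << 1) & MASK
        elif op == "JUMP":               # unconditional jump
            reg["PC"] = arg
        elif op == "JUMP_C":             # jump only if C = 1
            if reg["C"] == 1:
                reg["PC"] = arg
        else:
            raise ValueError(f"unknown instruction {op!r}")
    raise RuntimeError("step limit reached without Halt")


# Count how many left shifts it takes to push a 1 out of memory cell 0.
program = [
    ("LOAD_A", 0), ("SHL_A", None), ("STORE_A", 0),   # shift the value in cell 0
    ("LOAD_A", 1), ("INC_A", None), ("STORE_A", 1),   # add one to the count in cell 1
    ("JUMP_C", 8), ("JUMP", 0), ("HALT", None),
]
registers, memory = run(program, {0: 0b00100000, 1: 0})
print(memory[1])
```

The conditional jump is what lets the same nine cards run a different number of times depending on the data, which is the freedom the text calls vital.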
It is worth looking at how we represent the instructions we considered above. In our particular case the instructions contain two pieces: an instruction address and an instruction number, or "opcode":
Instruction address | Instruction opcode/number
For example, one of the instructions was "put the contents of memory M into register A". The computer doesn't speak English, so we have to encode this command into a form it can understand; in other words, into a binary string. This is the opcode, or instruction number, and its length clearly determines how many different instructions we can have. If the opcode is a four-digit binary number, then we can have 2^4 = 16 different instructions, of which loading the contents of a memory address into A is just one. The second part of the instruction is the instruction address, which tells the computer where to go to find what it has to load into A; that is, memory address M. Some instructions, such as "clear A", don't require an address direction.
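A minimal sketch of such an encoding follows. The four-bit opcode comes from the text; the 12-bit address field, the field order and the function names are assumptions made purely for illustration.

```python
OPCODE_BITS = 4      # from the text: 2**4 = 16 possible instructions
ADDRESS_BITS = 12    # assumed width of the address field


def encode(opcode, address=0):
    """Pack an opcode and an address into one binary instruction word."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode does not fit in the opcode field")
    if not 0 <= address < (1 << ADDRESS_BITS):
        raise ValueError("address does not fit in the address field")
    return (opcode << ADDRESS_BITS) | address


def decode(word):
    """Split an instruction word back into (opcode, address)."""
    return word >> ADDRESS_BITS, word & ((1 << ADDRESS_BITS) - 1)


word = encode(0b0011, 100)        # hypothetical opcode 3 with address 100
print(format(word, "016b"), decode(word))
```

An instruction such as "clear A" would simply leave the address field unused, here filled with zeroes.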
Details such as how the instruction opcodes are represented or exactly how things are set out in memory are not needed to use the instructions. This is the first and most elementary step in a series of hierarchies. We want to be able to maintain such ignorance consistently. In other words, we only want to have to think about the lower details once and then design things so that the next guy who comes along and wants to use your structure does not have to worry about the lower-level details.
There is one feature that we have so far ignored completely. Our machine as described so far would not work because we have no way of getting numbers in and out. We must consider input and output. One quick way to go about things would be to assign a particular place in memory, say address 17642, to be the input, and attach it to a keyboard so that someone from outside the machine could change its contents. Similarly, another location, say 17644, might be the output, which would be connected to a TV monitor or some other device, so that the results of a calculation can reach the outside world.
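One way to picture this arrangement is a memory in which two addresses are wired to the outside world. The sketch below is an assumption-laden illustration: the `Memory` class and its `keyboard` and `display` hooks are invented, and only the two addresses come from the text.

```python
INPUT_ADDRESS = 17642    # from the text: the cell attached to the keyboard
OUTPUT_ADDRESS = 17644   # from the text: the cell attached to the monitor


class Memory:
    """Ordinary storage, except that two addresses talk to outside devices."""

    def __init__(self, keyboard, display):
        self.cells = {}
        self.keyboard = keyboard   # hypothetical hook: returns the next number typed
        self.display = display     # hypothetical hook: receives each output value

    def read(self, address):
        if address == INPUT_ADDRESS:
            return self.keyboard()
        return self.cells.get(address, 0)

    def write(self, address, value):
        if address == OUTPUT_ADDRESS:
            self.display(value)
        else:
            self.cells[address] = value


typed = iter([3, 35])            # numbers someone "types" from outside
shown = []                       # stands in for the TV monitor
memory = Memory(lambda: next(typed), shown.append)
a = memory.read(INPUT_ADDRESS)
b = memory.read(INPUT_ADDRESS)
memory.write(OUTPUT_ADDRESS, a + b)
print(shown)
```

The program itself uses nothing but its usual transfer instructions; it does not need to know that these two cells are special.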
Now there are two ways in which you can increase your understanding of these issues. One way is to remember the general ideas and then go home and try to figure out what commands you need and make sure you don't leave one out. Make the set shorter or longer for convenience and try to understand the tradeoffs by trying to do problems with your choice. This is the way I would do it because I have that kind of personality! It's the way I study - to understand something by trying to work it out or, in other words, to understand something by creating it. Not creating it one hundred percent, of course; but taking a hint as to which direction to go but not remembering the details. These you work out for yourself.
The other way, which is also valuable, is to read carefully how someone else did it. I find the first method best for me, once I have understood the basic idea. If I get stuck I look at a book that tells me how someone else did it. I turn the pages and then I say "Oh, I forgot that bit", then close the book and carry on. Finally, after you've figured out how to do it, you read how they did it and find out how dumb your solution is and how much more clever and efficient theirs is! But this way you can understand the cleverness of their ideas and have a framework in which to think about the problem. When I start straight off to read someone else's solution I find it boring and uninteresting, with no way of putting the whole picture together. At least, that's the way it works for me!
Throughout the book, I will suggest some problems for you to play with. You might feel tempted to skip them. If they're too hard, fine. Some of them are pretty difficult! But you might skip them thinking that, well, they've probably already been done by somebody else; so what's the point? Well, of course they've been done! But so what? Do them for the fun of it. That's how to learn the knack of doing things when you have to do them. Let me give you an example. Suppose I wanted to add up a series of numbers,
1+2+3+4+5+6+7 ...
up to, say, 62. No doubt you know how to do it; but when you play with this sort of problem as a kid, and you haven't been shown the answer ... it's fun trying to figure out how to do it. Then, as you go into adulthood, you develop a certain confidence that you can discover things; but if they've already been discovered, that shouldn't bother you at all. What one fool can do, so can another, and the fact that some other fool beat you to it shouldn't disturb you: you should get a kick out of having discovered something. Most of the problems I give you in this book have been worked over many times, and many ingenious solutions have been devised for them. But if you keep proving stuff that others have done, getting confidence, increasing the complexities of your solutions for the fun of it - then one day you'll turn around and discover that nobody actually did that one! And that's the way to become a computer scientist.
I'll give you an example of this from my own experience. Above, I mentioned summing up the integers. Now, many years ago, I got interested in the generalization of such a problem: I wanted to figure out formulae for the sums of squares, cubes, and higher powers, trying to find the sum of m things each up to the nth power. And I cracked it, finding a whole lot of nice relations. When I'd finished, I had a formula for each sum in terms of a number, one for each n, that I couldn't find a formula for. I wrote these numbers down, but I couldn't find a general rule for getting them. What was interesting was that they were integers, until you got to n=13 when it wasn't (it was something just over 691)! Very shocking! And fun.
Anyway, I discovered later that these numbers had actually been discovered back in 1746. So I had made it up to 1746! They were called "Bernoulli Numbers". The formula for them is quite complicated, and unknown in a simple sense. I had a "recursion relation" to get the next one from the one before, but I couldn't find an arbitrary one. So I went through life like this, discovering next something that had first been discovered in 1889, then something from 1921 ... and finally I discovered something that had the same date as when I discovered it. But I get so much fun out of doing it that I figure there must be others out there who do too, so I am giving you these problems to enjoy yourselves with. (Of course, everyone enjoys themselves in different ways.) I would just urge you not to be intimidated by them, nor put off by the fact that they've been done. You're unlikely to discover something new without a lot of practice on old stuff, but further, you should get a heck of a lot of fun out of working out funny relations and interesting things. Also, if you read what the other fool did, you can appreciate how hard it was to do (or not), what he was trying to do, what his problems were, and so forth. It's much easier to understand things after you've fiddled with them before you read the solution. So for all these reasons, I suggest you have a go.
Problem 1.1: (a) Go back to our dumb file clerk and the problem of finding out the total number of sales in California. Would you advise the management to hire two clerks to do the job quicker? If so, how would you use them, and could you speed up the calculation by a factor of two? You have to think about how the clerks get their instructions. Can you generalize your solution to K, or even 2K clerks?
(b) What kinds of problems can K clerks actually speed up? What kinds can they apparently not?
(c) Most present-day computers only have one central processor - to use our analogy, one clerk. This single file clerk sits there all day long working away like a fiend, taking cards in and out of the store like mad. Ultimately, the speed of the whole machine is determined by the speed at which the clerk - that is, the central processor - can do these operations. Let's see how we can maybe improve the machine's performance. Suppose we want to compare two n-bit numbers, where n is a large number like 1024; we want to see if they're the same. The easiest way for a single file clerk to do this would be to work through the numbers, comparing each digit in sequence. Obviously, this will take a total time proportional to n, the number of digits needing checking. But suppose we can hire n file clerks, or 2n or perhaps 3n: it's up to us to decide how many, but the number must be proportional to n. Now, it turns out that by increasing the number of file clerks we can get the comparison time down to be proportional to log₂ n. Can you see how?
(d) If you can do this compare problem, you might like to try a harder one. See if you can figure out a way of adding two n-bit numbers in "log n" time. This is more difficult because you have to worry about the carries!
Problem 1.2: The second problem concerns getting the clerk to multiply (multiplication, remember, is not included in his basic instruction set). The problem comes in two parts. First, find the appropriate set of basic instructions required to perform multiplication. Having these, let's assume we save them some place in the machine so that we don't have to duplicate them every time we want to multiply; put them, say, in locations m to m+k. Show how we can give the clerk instructions to use this set-up to do a multiplication and return to the right place in the program.
1.3: Summary
We have now covered enough stuff for us to go on to understand any particular machine design. But instead of looking at any particular machine in detail we are going to do something rather different. From where we are now we can go up, down or sideways. What do I mean by this? Well, "up" means hiding more details of the workings of the machine from the user - introducing more levels of abstraction. We have already seen some examples of this; for example, building up new operations such as multiplication from operations in our basic set. Every time we want to multiply we just use this multiply "subroutine". Another example worth discussing is the ability to talk about algebraic variables rather than locations in memory. Suppose you want to take the sum of X and Y, and call it Z:
Z = X + Y
X and Y are already known to the computer and stored at specific locations in memory. The first thing we have to do is assign some place in memory to store the value of Z and then ensure that this location holds the sum of the contents of the X and Y memory cells. Now we know all about Z and can use it in other expressions, such as
Z + X.
It is clearly much simpler talking about algebraic variables rather than memory locations all the time, although it is quite a job to set this up. However, up to now we have had to know exactly where a number is located in order to make a transfer. We can now introduce a new number Z, and say to the computer "I want a number Z - find a place to put it and don't bother telling me where it is!" This is what I mean by moving "up".
Of course, we already went "up" a bit when we summarized operations by instructions such as "Clear A", and so on. This sort of shorthand is introduced for our benefit, and programs written in it cannot be understood directly by the machine itself. Such "assembly language" programs have to be translated into a "machine language" that the computer can understand, and this is done by a program called an "assembler". The next level up, where we have multiplication and variables and so on, needs another program to translate these "high-level" programs into assembly language. These translation programs are called "compilers" or "interpreters". The difference between them is in when the translation is done. An interpreter works out what to do step by step, as the program runs, interpreting each successive instruction in terms of the cruder language. A compiler takes the program as a whole and converts it all into assembly or machine language before the program is run. Compilers have the advantage that, in some cases, looking at the whole "code" it is possible for them to find clever ways to simplify the required operations. This is the nub of the important field of "compiler optimization" and is becoming of increasing importance for the new types of "non-Von Neumann" parallel computers.
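The step from "Z = X + Y" down to register transfers can be sketched as a very small translator. Everything named here is hypothetical: the `Allocator` class, the starting address 100 and the `M100`-style spelling of memory locations are assumptions, and the output uses "<-" as a plain-text stand-in for the transfer arrow.

```python
class Allocator:
    """Hands out a memory location for each variable name, on first use."""

    def __init__(self, first_free=100):      # starting address is an arbitrary choice
        self.table = {}
        self.next_free = first_free

    def address_of(self, name):
        if name not in self.table:           # "find a place to put it ..."
            self.table[name] = self.next_free
            self.next_free += 1
        return self.table[name]              # "... and don't bother telling me where"


def compile_sum(target, left, right, alloc):
    """Translate 'target = left + right' into register-transfer instructions."""
    return [
        f"A <- (M{alloc.address_of(left)})",
        f"B <- (M{alloc.address_of(right)})",
        "A <- (A) + (B)",
        f"M{alloc.address_of(target)} <- (A)",
    ]


alloc = Allocator()
for line in compile_sum("Z", "X", "Y", alloc):
    print(line)
for line in compile_sum("W", "Z", "X", alloc):   # Z can now be used like any other variable
    print(line)
```

The person writing "Z = X + Y" never sees the numbers in the table; only the translator does, which is the sense in which we have moved "up".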