The End of History and the Last Programming Language

In April 1993 Guy Steele and I gave our “Evolution of Lisp” paper at the ACM History of Programming Languages II Conference (Steele and Gabriel 1993). The histories of the following languages were presented there: ALGOL 68, Pascal, Ada, Lisp, PROLOG, FORMAC, CLU, Icon, Forth, C, and C++. According to the program committee’s requirements, the only languages considered for the conference were those that had been in existence for a long time and have had a significant impact, measured in an indeterminate manner.

Two days before the conference started, Steele and I sat in his living room in Lexington, Massachusetts, trying to decide what to present. We had not yet prepared our talk, contrary to the strict rules of Jean Sammet, the conference chairwoman. We noted that with only two exceptions, every language being presented was dead or moribund. The exceptions were C and C++. We thought for a while about the history of Lisp and noticed that at that time, though there were new Lisp sprouts pushing up through the earth, Lisp could be characterized as having gone through two periods of expansive growth, the first followed by a minor consolidation and the second followed by a major consolidation which dwindled to its then-current stagnant state. This stagnation, it seemed, was typical of all programming languages, and we set out to figure out why.

Even though we did come up with a theory that seemed to explain why the field of general-purpose programming languages is currently dead and predicted the death of a currently expanding language, we decided against presenting the material because it might be too controversial. Little did we know that during the concluding panel session of the conference—entitled “Programming Languages: Does Our Present Past Have a Future?”—the general remarks would hint that our feeling was shared by at least some in attendance. Our feeling was also bolstered by the average age of the papers’ authors, somewhere in the 50- to 60-year-old range.

In this essay I will present our theory, but first a word about general-purpose versus application-specific languages.

Languages can be broken into two categories: application specific and general purpose. Some languages are fairly obviously application specific, such as simulation packages, spreadsheet programming languages, and even 4GLs. In some ways these languages are the most prevalent, particularly if you consider COBOL to be an application-specific language. A case can be made that COBOL is such a language by arguing that it is applicable only to business-related computations such as payroll, inventory, and accounts receivable. When you lump COBOL into this category, the size of the general-purpose programming language population (in lines of code) is somewhat small. Later in this essay I will present some figures that show this.

If you group COBOL with the general-purpose programming languages, you have to conclude that it renders all the others essentially irrelevant, except one or two. Again, we’ll see the figures later.

There are four parts to the theory:

• Languages are accepted and evolve by a social process, not a technical or technological one.

• Successful languages must have modest or minimal computer resource requirements.

• Successful languages must have a simple performance model.

• Successful languages must not require that users have “mathematical sophistication.”

Let’s look at how languages are accepted and evolve. My theory, which Steele seems to accept, is that a number of factors enable a language to flourish:

• The language must be available on a wide variety of hardware.

• It helps to have local wizards or gurus for the language.

• It must be a minimally acceptable language.

• It must be similar to existing popular languages.

The process is social because it relies on the actions of people in response to a language. Many of us, particularly academics, hope that the nature of the language would be the dominant factor. I want to put people ahead of the language so that we all realize it is not the abstract nature of the language that is critical but the way that human nature plays into the acceptance process.

The acceptance process is not unique to languages; it works as well, and sometimes better, with other things such as application programs and operating systems.

The language must be on every platform the acceptance group needs. (An acceptance group is a potential population of users that is large enough and cohesive enough to make a difference to the popularity of the language.) This can be accomplished if developers port the language to every such computer, but often it is better to make a portable implementation or a particularly simple implementation and let local wizards or gurus do the porting for you.

It’s sometimes hard to get started with a new language: there are basics to master and tricks to learn. It helps to have a local wizard to help with this, and what better way to develop a wizard than porting the language to the local computers? Having a wizard can make a huge difference. Because the information is local, you have some control over when you can get it and how much you get: the source is not an impersonal voice on the other end of a phone or some words with a return e-mail address, nor is there some hidden problem more important than yours that keeps your informant from answering you.

With a local wizard who is at least a porter, both you and the wizard have a personal attachment to the new language through the power to control it. The language is not completely someone else’s master plan; you share in it. This is important for two reasons. First, people generally prefer to use something over which they are granted some degree of control; this is just human nature. It is especially difficult to win acceptance for a language whose designers and proponents hold it out as an untouchable panacea. Second, this tends to reduce the risk by putting the key technology in your hands. Would you risk your job on a new language whose implementation might be wrong and out of reach?

This implies something else: A proprietary language has little chance of success. Of course there are exceptions, the most notable being Visual BASIC, but I think it would be hard to buck this logic.

Despite the required simplicity of the language, it must be acceptable, and there must be a reason to move to it. In general the reason is that it solves a new problem well or makes certain tasks easier to do. Sometimes the only benefit needed is that it gets existing tasks done more reliably.

The effect of the whole process as outlined so far is that a simple language with a minimal implementation takes hold through the vehicle of, for example, solving a new problem, along with the local wizardry to make such a change sufficiently risk-free. Then, when the language is established, it can be improved. Improvements are always easier to accept than a new language is.

This last point is important. People who design and promote an exotic new language always try to defeat a simpler language in the minds of users by explaining why that simple language is no good, vastly inferior to the new one. But in time, the simpler one usually is improved, eroding the argument. Furthermore, if the simpler language is entrusted to local wizards, they will not only port it but improve it. In effect, the simpler language traveling by way of popular demand has an army of people trying to improve it; all the more reason proprietary languages are crippled in the language sweepstakes.

If we think about this model as developed so far, it resembles that of a virus. A simple organism that can live anywhere eventually attaches itself to widespread host organisms and spreads.

The language must be similar to existing languages. I think some people find this point hard to understand because it sounds too conservative or pessimistic. But I compare it to two of the major processes for change we see in the world around us: the free market and evolution.

The free market works like this: It is in business’s best interest to have frequent small releases that cost money. Innovation costs too much and is too risky. If the innovation is technological, it is least risky to apply the innovation conservatively and then improve or expand its application. This way you can spend either a lot of money or most of your money on quality, which builds brand loyalty. Much of the automotive and consumer electronics markets work like this. (More on the general workings of free markets will be explored in the essay, “Money Through Innovation Reconsidered.”)

This works well with languages: With a simple language and implementation, you can spend time to make sure that it just plain works the first time every time. If you need to innovate over and over just to get the thing off the ground, its quality will be low and people’s first taste will be a bad taste.

In the marketplace, adapting to innovation requires a long learning curve, and rarely is a compelling payoff in sight for the learning. People had trouble learning Lisp in the 1980s even when they believed they would achieve extraordinary productivity gains and when the price of admission to the artificial intelligence sideshow was a working knowledge of Lisp. Because of the difficulty of learning Lisp, they were unable to learn it properly and gained almost none of the benefits. Also, it was never clearly shown that the higher productivity of Lisp programmers wasn’t due to the greater natural ability (let’s say better mathematical aptitude) of those who could learn Lisp.

The free market works like this because people don’t like to take risks, especially by adopting a new language or any new way of doing things. People prefer incremental improvement over radical improvement. We are conditioned for this in all sorts of ways. Very little in public education prepares us to take risks. Our mothers didn’t teach us to risk. And who wants to risk a job needlessly by using an exotic new language when there is always a safe alternative in the language domain?

People don’t like to be different; being an early adopter of a radical thing is to really be different. When you’re different, people notice you and sometimes laugh, especially when you are different for no particular reason. Different is risky. Why bother?

We can take an analogy from evolution. (Remember, this is an analogy only and has no compelling argumentative force except to comfort us that this radically conservative view of how things progress is common.) People normally consider that the primary purpose of natural selection is to promote small improvements, and this does happen. But there is no inexorable push from single-celled organisms through rats through monkeys and apes to humans. Rather, progress is made in fits and starts, and the main purpose of natural selection is to reject harmful changes, not to promote large leaps.

Only those things that are relatively worthless change rapidly and dramatically. These worthless things provide variety: Someday an environmental change can make one of those changes or a related group of them important to survival, and this one or this group will then join the ranks of the natural-selection protected, and perhaps some formerly static parts will no longer be essential and thus can begin changing.

In other words, natural selection provides very little incremental improvement: Almost all dramatic change occurs because the environment changes and makes the worthless invaluable. This is part of a larger argument regarding a theory of technological improvement and adoption called Worse Is Better, and in the essay called “Money Through Innovation Reconsidered” I will expand the evolution argument.

When applying the lesson from evolution to programming language design, there are two important features of languages that I think cannot be changed at the moment: a simple performance model and a need for only minimal mathematical sophistication. This leads us to the last two points of our quadripartite theory.

The requirement to be on popular platforms has important technical fallout. Invariably the most prevalent commercial platforms are those that have been in the field for a year or two or more, which implies that they are one to three technology generations behind current hot boxes. The language must run well on such computers, which means it must not be a resource hog in any sense of the word, and in fact, it must be comparatively tiny.

This also implies that the language cannot attempt to break new ground in language design if doing so would require resources. Furthermore, breaking new ground sometimes requires time and testing, which delays the release of the new language while competitors—new or old—become established. This also applies to systems such as operating systems, and the example I use to show this point comes from UNIX.

An operating system provides services to a user program, and such services are provided by system calls. However, in most operating systems the end user can interrupt the execution of a program. Typically this is handled in the operating system by saving the user-level program counter and minimal context information in a save area such as the stack. Notice, though, that the user-level program counter does not adequately capture the execution point when the program is executing a system call. For instance, if the program is opening a file for reading, the operating system may be allocating a buffer. The operating system designer has three choices: Try to back out of the call, try to save the true state, or return a flag from the system call that states whether the call succeeded. The first two are very hard to do technically. One operating system I know from the past required approximately 60 pages of assembly language code to back out of the call in all cases. UNIX does the third. This requires almost no coding by operating system implementers, but it does require each application writer to write code to check the flag and decide what to do. This behavior (considered quite ugly and, well, stupid in modern operating system circles) has not harmed UNIX’s success.

One can only imagine how much worse DOS must be.
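The burden that the third choice places on each application writer can be sketched in a few lines. This is only an illustration under stated assumptions, not how any particular program handles it: `FlakyDevice` and `read_with_retry` are hypothetical names, and the interrupted call is simulated rather than real. In C on UNIX the flag is a return value of -1 with `errno` set to `EINTR`; here Python's built-in `InterruptedError` stands in for it.

```python
import errno


class FlakyDevice:
    """Hypothetical stand-in for a system call that can be interrupted."""

    def __init__(self, data, interruptions):
        self.data = data
        self.interruptions = interruptions  # calls that fail before one succeeds
        self.calls = 0

    def read(self):
        self.calls += 1
        if self.interruptions > 0:
            self.interruptions -= 1
            # The "flag": the call reports that it did not complete.
            raise InterruptedError(errno.EINTR, "interrupted system call")
        return self.data


def read_with_retry(device, max_attempts=5):
    """Application-side code: check the flag and decide what to do (here, retry)."""
    for _ in range(max_attempts):
        try:
            return device.read()
        except InterruptedError:
            continue  # nothing was saved or backed out, so simply call again
    raise RuntimeError("gave up after repeated interruptions")


device = FlakyDevice(b"payload", interruptions=2)
result = read_with_retry(device)
```

The operating system side of this sketch does almost nothing, which is the point of the passage above: all of the recovery logic lives in `read_with_retry`, and every application has to carry its own copy of it.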

The lesson is that you should choose carefully the problems you spend time solving by means of innovation. In this case, using innovation to solve the interrupt problem correctly served the purpose of merely bloating the operating system—the end users just did not care. The designers of the correctly functioning operating system probably thought that because computers do so many things correctly by design, they ought to be perfect. UNIX designers probably thought that it was OK for computers to be like anything else—lousy, and they were right: I’ve seen people grumble when they try to resume a UNIX program they’ve control-C’ed out of, but they just run the program again.

To a large extent the sophistication of a programming language is limited by the sophistication of the hardware, which may be high in terms of cleverness needed to achieve speed but is low in other ways to maintain speed, such as inexpensive large memory systems and secondary storage.

One of the things I learned by watching people try to learn Lisp is that they could write programs that worked pretty well but they could not write fast programs. The reason was that they did not know how things were implemented and assumed that anything in the language manual took a small constant time. In reality, for example, there are many Lisp operations whose running times are linear in the length of one or several of their arguments. When you assume the constant-time model, the programs these novices wrote make perfect sense.
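A small sketch shows how the constant-time assumption goes wrong. It uses Python as a stand-in for Lisp, and every name in it (`Cons`, `length`, `append_item`, `build_by_appending`) is hypothetical. The one assumption borrowed from Lisp is that a non-destructive append must copy every cell of the list it extends, so its cost grows with the length of that list.

```python
class Cons:
    """A minimal cons cell: `car` holds a value, `cdr` the rest of the list."""

    __slots__ = ("car", "cdr")

    def __init__(self, car, cdr=None):
        self.car = car
        self.cdr = cdr


def length(lst):
    """Count the cells by walking the whole list: linear, not constant."""
    n = 0
    while lst is not None:
        n += 1
        lst = lst.cdr
    return n


def append_item(lst, item):
    """Return (new_list, cells_copied); every existing cell is copied."""
    values = []
    while lst is not None:
        values.append(lst.car)
        lst = lst.cdr
    result = Cons(item)
    for value in reversed(values):
        result = Cons(value, result)
    return result, len(values)


def build_by_appending(n):
    """Build the list (0 1 ... n-1) the way a novice might: append in a loop."""
    lst, total_copied = None, 0
    for i in range(n):
        lst, copied = append_item(lst, i)
        total_copied += copied
    return lst, total_copied


lst, total_copied = build_by_appending(100)
```

Under the constant-time model the loop looks like 100 cheap steps. Counting the copies gives 0 + 1 + ... + 99 = 4950 cells for a 100-element list, which is quadratic growth hidden behind an operation that reads like a single step in the manual.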

Another thing I learned from watching software businesses is that speed is king. In some cases a company would switch compilers for as little as a 2% improvement in speed and 1% in space, even though the cost of the change could be work-years of effort. People will sacrifice all sorts of excellent features and significant personal productivity for a fast tool: Notice that vi is the editor of choice for UNIX, not Emacs, which is vastly “better” in some sense.

People can program effectively and happily when they feel they have mastered their craft, which means when they can write fast programs. People are quite capable of understanding the von Neumann architecture, and so a language whose performance model matches that architecture is a shoo-in.

A final point is that the marketplace does not care how much it cost a company to produce a software product; what customers really care about is price and speed. If your program costs too much, forget it; if it’s too slow, forget it. The CEO who saves money on development by using a language with high productivity and (necessarily) poor performance is a fool.

The second mandatory feature is that the language cannot require mathematical sophistication from its users. Programmers are not mathematicians, no matter how much we wish and wish for it. And I don’t mean sophisticated mathematicians, but just people who can think precisely and rigorously in the way that mathematicians can.

For example, to most language theorists the purpose of types is to enable the compiler to reason about the correctness of a program in at least skeletal terms. Such reasoning can produce programs with no run-time type errors. Strict type

Source: Patterns of Software: Tales from the Software Community (pages 122-133)