4 Writing an Interpreter

4.2 Implementing a Simple Interpreter

4.2.4 A Note on Snarfing and Bootstrapping

Two concepts worth knowing about language implementation are snarfing and bootstrapping.

Snarfing is "stealing" features from an underlying language when implementing a new language. Bootstrapping is the process of building a language implementation (or other system) by using the system to extend itself.

4.2.4.1 Snarfing

Our example interpreter implements Scheme in Scheme, but we could have written it in C or assembly language. If we had done that, we would have had to write our own read-eval-print loop, plus a lot of not very interesting code to read keyboard input and create data structures, display data structures on the screen, and so on. Instead, we "cheated" by snarfing those features from the underlying Scheme system: we simply took them and used them in the language we interpret.

Our tiny language requires you to type in Scheme lists, because it uses the Scheme read-eval-print loop to get its input and call the interpreter. If we wanted to, we could provide our own read routine that reads things in a different syntax. For example, we might read input that uses square brackets instead of parentheses for nesting, or that uses infix operators instead of prefix operators. (That is, the middle item in a three-element list would be the operator name.)
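As a rough sketch of the infix idea, suppose we keep snarfing the Scheme reader (so input still arrives as nested lists) and only rearrange each three-element list so that its middle item becomes the operator. The helper name infix->prefix is made up for this illustration; it is not part of the interpreter described in the text.

```scheme
;; Hypothetical helper: rewrite (a op b) as (op a b), recursively.
;; Anything that is not a three-element list is returned unchanged.
(define (infix->prefix expr)
  (if (and (list? expr) (= (length expr) 3))
      (list (cadr expr)                     ; middle item is the operator
            (infix->prefix (car expr))      ; left operand
            (infix->prefix (caddr expr)))   ; right operand
      expr))

(infix->prefix '(1 + (2 * 3)))   ; => (+ 1 (* 2 3))
```

The result is an ordinary prefix expression, so it could be handed to an evaluator that expects prefix input. Reading square brackets would take more work, since that requires a real replacement for the reader rather than a rewrite of already-read lists.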

There are some features we didn't just snarf, though: we wrote our own evaluation procedure, which controls recursive evaluation. For example, we use basic Scheme arithmetic procedures to implement individual arithmetic operations, but we don't simply snarf them: the interpreter recognizes arithmetic operations in its input language and maps them onto procedure calls in the underlying language. We can change our language by changing those mappings: for example, we could use the symbols sum, difference, product, and quotient instead of +, -, *, and /. Or we could use the same names but implement the operations differently. (For example, we might have our own arithmetic routines that allow a representation of infinity and do something reasonable for division by zero.)
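To make the idea of a mapping concrete, here is a minimal sketch of an evaluator that looks operator names up in a table. Everything named here (operator-table, lookup-operator, tiny-eval, safe-quotient-op) is hypothetical and is not the interpreter from the text. Using the symbol infinity as the result of division by zero is just one possible choice, and the sketch ignores its sign and does not teach the other operations how to handle it. The error procedure is assumed to be available; it is in R7RS and most implementations, though not in R5RS.

```scheme
;; Our own division: return the symbol `infinity` instead of failing.
;; (Simplification: the sign of the numerator is ignored.)
(define (safe-quotient-op a b)
  (if (= b 0) 'infinity (/ a b)))

;; Names in our language -> procedures in the underlying Scheme.
(define operator-table
  (list (cons 'sum +)
        (cons 'difference -)
        (cons 'product *)
        (cons 'quotient safe-quotient-op)))

(define (lookup-operator name)
  (let ((entry (assq name operator-table)))
    (if entry
        (cdr entry)
        (error "unknown operator" name))))

;; Numbers evaluate to themselves; lists are operator applications.
(define (tiny-eval expr)
  (cond ((number? expr) expr)
        ((pair? expr)
         (apply (lookup-operator (car expr))
                (map tiny-eval (cdr expr))))
        (else (error "bad expression" expr))))

(tiny-eval '(sum 1 (product 2 3)))   ; => 7
(tiny-eval '(quotient 1 0))          ; => infinity
```

Changing the language's operator names, or the meaning of an operator, only requires editing the table.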

We also use recursion to implement recursion, when we recursively call eval. But since we coded that recursion explicitly, we can easily change it, and do something different. Our arithmetic expressions don't have to have the same recursive structure as Scheme expressions.

We could also implement recursion ourselves. As written, our tiny interpreter uses Scheme's recursion "stack" to implement its own stack: each recursive call to eval implements a recursive call in our input language. We didn't have to do this. We could have implemented our own stack as a data structure, and written our interpreter as a simple non-recursive loop.
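One way this could look, as a sketch only: keep a list of pending work (todo) and a list of computed values (vals) as explicit data structures, and drive them from a single loop. The names stack-eval and host-procedure, and the use of a vector as an "apply" marker, are choices made for this illustration. The loop is written with named let, whose calls are all tail calls, so the underlying Scheme's stack does not grow with the nesting depth of the input. The error procedure is assumed to be available.

```scheme
;; Map an operator name onto a procedure of the underlying Scheme.
(define (host-procedure name)
  (case name
    ((+) +) ((-) -) ((*) *) ((/) /)
    (else (error "unknown operator" name))))

(define (stack-eval expr)
  (let loop ((todo (list expr))   ; our own stack of pending work
             (vals '()))          ; our own stack of results
    (cond ((null? todo) (car vals))
          ((number? (car todo))
           (loop (cdr todo) (cons (car todo) vals)))
          ((vector? (car todo))
           ;; Marker #(op n): pop n values, apply op, push the result.
           (let ((op (vector-ref (car todo) 0))
                 (n  (vector-ref (car todo) 1)))
             (let pop ((n n) (args '()) (vals vals))
               (if (= n 0)
                   (loop (cdr todo)
                         (cons (apply (host-procedure op) args) vals))
                   (pop (- n 1) (cons (car vals) args) (cdr vals))))))
          (else
           ;; (op arg ...): schedule the operands, then a marker.
           (let ((op (caar todo))
                 (args (cdar todo)))
             (loop (append args
                           (list (vector op (length args)))
                           (cdr todo))
                   vals))))))

(stack-eval '(+ 1 (* 2 3)))    ; => 7
(stack-eval '(- 10 (/ 8 2)))   ; => 6
```

Because the pending work is now ordinary data, the interpreter is free to inspect or rearrange it, which is not possible when the state lives on the underlying Scheme's stack.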

What counts as "snarfing"? The term is a good one, but not clearly defined. We clearly just snarfed the Scheme reader, but we've done something a little different with recursion. We've done something very different with the interpretation of operator names.

4.2.4.2 Bootstrapping and Cross-compiling

Implementing a programming language well requires attention to the fine art of bootstrapping: how much of the system do you have to build "by hand" in some lower-level system, and how much can you build within the system itself, once you've got a little bit of it working?

Most Scheme systems are written mostly in Scheme, and in fact it's possible (but not particularly fun) to implement a whole Scheme system in Scheme, even on a machine that doesn't have a Scheme system yet.

How are these things possible?

First, let's take the simple case, where you're willing to write a little code in another language. You can write an interpreter for a small subset of Scheme in, say, C or assembler. Then you can extend that little language by writing the rest of Scheme in Scheme: you just need a simple little subset to get started, and then things you need can be defined in terms of things you already have. Writing an interpreter for a subset of Scheme in C is not hard, just a little tedious. Then you can use lambda to create most of the rest of the procedures in terms of simpler procedures.
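For instance, suppose the hand-written core supplies only define, lambda, if, quote, car, cdr, cons, null?, and +. That particular subset is an assumption made for this sketch, not a claim about any real system. Several familiar list procedures can then be written in Scheme itself. The my- prefixes are only there to avoid clashing with the procedures a full Scheme already provides.

```scheme
;; Each definition uses only the assumed core plus earlier definitions.
(define my-length
  (lambda (lst)
    (if (null? lst)
        0
        (+ 1 (my-length (cdr lst))))))

(define my-map
  (lambda (f lst)
    (if (null? lst)
        '()
        (cons (f (car lst)) (my-map f (cdr lst))))))

(define my-append
  (lambda (a b)
    (if (null? a)
        b
        (cons (car a) (my-append (cdr a) b)))))

(my-length '(a b c))                     ; => 3
(my-map (lambda (x) (+ x 1)) '(1 2 3))   ; => (2 3 4)
(my-append '(1 2) '(3 4))                ; => (1 2 3 4)
```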

Interestingly, you can also implement most of the defining constructs and control constructs of Scheme in Scheme, by writing macros, which we'll discuss later.
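As a small preview, and only as a sketch, here is how a binding construct and a control construct might be defined with the standard syntax-rules macro facility. The names my-let and my-or are made up so that they do not collide with the built-in let and or.

```scheme
;; A binding construct defined in terms of lambda:
;; (my-let ((x 2) (y 3)) body) becomes ((lambda (x y) body) 2 3).
(define-syntax my-let
  (syntax-rules ()
    ((_ ((name value) ...) body ...)
     ((lambda (name ...) body ...) value ...))))

;; A control construct defined in terms of if. The temporary t
;; ensures each expression is evaluated only once.
(define-syntax my-or
  (syntax-rules ()
    ((_) #f)
    ((_ e) e)
    ((_ e rest ...)
     (my-let ((t e))
       (if t t (my-or rest ...))))))

(my-let ((x 2) (y 3)) (* x y))   ; => 6
(my-or #f 5)                     ; => 5
```

With definitions like these, the hand-written core needs to understand only lambda and if; the richer forms are expanded into them.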

You can start out this way even if you want your Scheme system to use a compiler. You can write the compiler in Scheme, and use the interpreter to run it and generate machine code. Now you have a compiler for Scheme code, and can compile procedures so that they run faster than if you interpreted them. You can take most of the Scheme code that you'd been interpreting, and use the compiler to create a faster version of it. You then replace the old (interpreted) versions with the new (compiled) versions, and the system is suddenly faster.

Once the compiler works, you can compile the compiler, so that the compiler itself runs faster. After all, a compiler is just a program that takes source code as input and generates executable code; it's just a program that happens to operate on programs. Now you're set: you have a compiler that can compile Scheme code that you need to run, including itself, and you don't need the interpreter anymore.
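To illustrate the point that a compiler is just a program operating on programs, here is a toy sketch. It translates two-operand arithmetic expressions into an instruction list for a made-up stack machine. The instruction names push and apply-op, and the procedures compile-expr and run-code, are all invented for this example, and run-code merely stands in for the target machine. A real compiler would emit machine code for real hardware. The error procedure is assumed to be available.

```scheme
;; Toy compiler: expression in, instruction list out.
;; Only numbers and (op left right) forms are handled.
(define (compile-expr expr)
  (if (number? expr)
      (list (list 'push expr))
      (append (compile-expr (cadr expr))
              (compile-expr (caddr expr))
              (list (list 'apply-op (car expr))))))

(define (op->procedure op)
  (case op
    ((+) +) ((-) -) ((*) *) ((/) /)
    (else (error "unknown op" op))))

;; Stand-in for the target machine: execute the instruction list.
(define (run-code code)
  (let loop ((code code) (stack '()))
    (cond ((null? code) (car stack))
          ((eq? (caar code) 'push)
           (loop (cdr code) (cons (cadar code) stack)))
          (else
           (let ((b (car stack))      ; right operand is on top
                 (a (cadr stack)))
             (loop (cdr code)
                   (cons ((op->procedure (cadar code)) a b)
                         (cddr stack))))))))

(compile-expr '(+ 1 (* 2 3)))
;; => ((push 1) (push 2) (push 3) (apply-op *) (apply-op +))
(run-code (compile-expr '(+ 1 (* 2 3))))   ; => 7
```

Note that compile-expr never performs any arithmetic; it only produces a description of the work, which some other machine can carry out later.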

To get Scheme to work on a new system, without even needing an interpreter, you can cross-compile. If you have Scheme working on one kind of machine, but want to run it on another, you can write your Scheme compiler in Scheme, and have it run on one machine but generate code for the new machine. Then you can take the executable code it generates, copy it onto the new machine, and run it.

Most Scheme systems are built using tricks like this. For example, the RScheme system never had an interpreter at all. Its compiler was initially run in a different Scheme system (Scheme-48) and used to compile most of RScheme itself. This code was then used to run RScheme with no further assistance from another implementation.

[ The first Scheme system was built by writing a Scheme interpreter in Lisp, or was it a compiler first? ... blah blah ... ]