
ANS Forth provides a set of words for local variable handling. Sadly, the standard does not provide what is required for interfacing with operating systems such as Windows, especially in handling local arrays. Its notation is also clumsy and hard to read, although it did follow the practice of the time. The most popular alternative notation is described below. Not all systems provide it, and several variants exist.

MPEism: The description below is of the MPE implementation, which is broadly similar to many others.

The sequence

: <name> { ni1 ni2 ... | lv1 lv2 ... -- o1 o2 }
  …
;

defines named inputs, local variables, and outputs. The named inputs are automatically copied from the data stack on entry. Named inputs and local variables can be referenced by name within the word during compilation. The output names are dummies to allow a complete stack comment to be generated.

• The items between { and | are named inputs.

• The items between | and -- are local variables.

• The items between -- and } are outputs.

Named inputs and locals return their values when referenced, and must be preceded by TO to perform a store, or by ADDR to return the address. Arrays may be defined in the form:

arr[ n ]

Any name ending in the '[' character is treated as an array; the expression up to the terminating ']' is interpreted as the size. Arrays always return their base addresses; all operators are ignored.

In the example below, a and b are named inputs, a+b and a*b are local variables, and arr[ is a 10 byte array.

: foo { a b | a+b a*b arr[ 10 ] -- }
  arr[ 10 erase
  a b + to a+b
  a b * to a*b
  cr a+b . a*b .
;
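The example above uses TO but not ADDR. The following is a minimal sketch of ADDR, assuming a system that supports the MPE-style notation described here. The words bump and count-twice are hypothetical names introduced only for illustration.

```forth
\ Sketch in the MPE-style notation described above; not portable ANS Forth.
\ bump is a hypothetical helper that increments the cell at addr.
: bump  ( addr -- )  1 swap +! ;

: count-twice  { n | total -- total' }
  n to total          \ TO stores into the local variable
  addr total bump     \ ADDR returns the local's address instead of its value
  addr total bump
  total               \ a plain reference returns the value
;
```

Under the semantics described above, 5 count-twice should leave 7 on the stack, because the local is incremented twice through its address.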

Cautionary notes

In the following discussion the term “Forth locals” refers to both named inputs and local variables.

Writing C in Forth: Although use of Forth locals can be valuable for local arrays and readability, there is a great danger for C programmers learning Forth to overuse them.

Common extensions

There are some in the Forth community who believe that local variables have no place in a course or text until the second or third level. I have seen enough Forth that looks like C to know that there is a real problem. Because of the benefits when interfacing to modern operating systems and in certain classes of problems, I reluctantly decided to include them.

Excessive use of Forth locals inhibits learning how to use the data stack efficiently and reduces the incentive to factor into small definitions. In turn, that leads to "cut and paste" errors and to bigger code, which further leads to difficulties in maintenance and debugging. I also note that programmers who use Forth locals heavily tend not to use defining words and other more advanced Forth techniques.
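As a small illustration of the difference in style, the sketch below writes the same trivial word twice: once with every input named, and once using the stacks directly. Both word names are hypothetical, and the first definition assumes the MPE-style notation described earlier.

```forth
\ C-like style: every input is named (MPE-style notation, not portable).
: scale-l  { n num den -- n' }
  n num * den / ;

\ Stack style: the divisor waits on the return stack until it is needed.
: scale-s  ( n num den -- n' )
  >r * r> / ;
```

For a word this small, the named version adds nothing that the stack comment does not already say. Once a definition is factored down to this size, the need for locals often disappears.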

Performance: The Forth virtual machine is optimised for two stacks, and the code generation of modern Forth compilers reflects this. Especially on CPUs with more than eight registers, good stack code is faster and smaller than code with heavy use of local variables.

I recently overhauled parts of a TCP/IP stack and removed Forth locals where possible. After testing on an ARM (16 registers) embedded system, the size of rewritten words reduced by 20% and performance improved by 50%. In particular cases code size reduced by 50% or more. Code size improves because the compiler makes better use of CPU registers, and performance improves because of smaller code and reduced memory traffic. Although the results are less dramatic, we see similar improvements on most CPUs. For CPUs with 32 or more registers, e.g. SPARC and PowerPC, Forth compiler writers can easily use registers for local variables.

Writers of Forth compilers are unlikely to put in a huge effort to optimise bad code. After an earlier release of this book, the following comments were made on the comp.lang.forth newsgroup comparing C and Forth compilers.

Anton Ertl: “You might be surprised; as long as the bad code occurs in sufficiently important benchmarks, they are very likely to put in a huge effort. However, in the case of Forth, optimizing stack code will benefit all the code out there (including code using locals), whereas optimizing locals will only benefit a minority of code; it should come as no surprise that Forth compiler implementors will concentrate on optimizing stack accesses first.”

Andrew Haley: “Well, it's not just that: we can't tell whether the ‘bad’ code has been written by a programmer or is the output of a previous compilation pass. So, we optimize everything we can, even if it's something no sane programmer would ever write.”

Avoiding locals: The main reason that people feel they need locals is having too many items in use on the data stack. There are three ways to avoid this:

1) Re-factor into small definitions that use fewer items on the stack,

2) Use the return stack to hold the least commonly used items,

3) Where items are pairs or larger, for example x/y coordinate pairs for points or x/y/w/h sets for a rectangle description, consider keeping them in structures and passing pointers to the structures. Although pointers increase memory traffic, they considerably reduce stack traffic at word entry and exit.
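The three techniques can be combined. The sketch below is one possible arrangement in standard Forth, not code from the projects mentioned in this chapter. The rectangle layout, the field words r.x to r.h, and the words area, right, bottom and inside? are all hypothetical names chosen for illustration.

```forth
\ Hypothetical rectangle layout: four consecutive cells x, y, w, h.
: r.x  ( rect -- addr )  ;
: r.y  ( rect -- addr )  cell+ ;
: r.w  ( rect -- addr )  2 cells + ;
: r.h  ( rect -- addr )  3 cells + ;

create box  10 , 20 , 30 , 40 ,        \ x y w h

\ Technique 3: one pointer is passed instead of four stack items.
\ Technique 1: each definition is small and handles few items.
: area    ( rect -- n )  dup r.w @ swap r.h @ * ;
: right   ( rect -- x )  dup r.x @ swap r.w @ + ;
: bottom  ( rect -- y )  dup r.y @ swap r.h @ + ;

\ Technique 2: the pointer is parked on the return stack while
\ the coordinates are tested.
: inside?  ( x y rect -- flag )
  >r
  r@ r.y @  r@ bottom  within          \ y-lo <= y < y-hi ?
  swap
  r@ r.x @  r> right   within          \ x-lo <= x < x-hi ?
  and ;
```

With these definitions, box area gives 1200, and 15 25 box inside? returns true, while 50 25 box inside? returns false because x lies beyond the right edge at 40.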


We rewrote an embedded GUI package after a client requested changes. The original code had been written with extensive use of Forth locals. After reorganisation and overhaul, only one word used local variables. The code is shorter both in terms of lines of source code and in terms of compiled code size. The code is easier to maintain.

When to use locals:

1) To avoid stack repetition of complex calculations. Some calculations have common sub-expressions with reused intermediate results. Storing these on either stack can lead to “stackrobatics”. Defining local variables for these can be very effective: in performance and code size, local variables are cheap compared to named inputs.

2) For temporary small buffers. The alternatives to local buffers are a heap, global or task-based (thread local) structures. Although heap functions are widely available and standardised, they require code, have performance and reliability penalties (heap leaks are not unknown in any language) and require great care in exception handling. Local buffers are automatically discarded on exit from a word, and, in every Forth implementation I have inspected, they are completely compatible with CATCH and THROW.
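Both cases can be sketched briefly, assuming a system that supports the MPE-style notation described earlier. The word names sum-sq and echo32 and the 32 byte buffer size are illustrative assumptions, not part of any library.

```forth
\ Sketch in the MPE-style notation described above; not portable ANS Forth.

\ 1) Reused intermediate results held in local variables.
: sum-sq  { a b | s d -- n }           \ n = (a+b)^2 + (a-b)^2
  a b + to s
  a b - to d
  s s *  d d * + ;

\ 2) A small temporary buffer that is discarded on exit from the word.
: echo32  { c-addr u | buf[ 32 ] -- }
  c-addr buf[ u 32 min move            \ copy at most 32 bytes into the buffer
  buf[ u 32 min type ;                 \ display the copy
```

For example, 3 4 sum-sq gives 50, since s is 7 and d is -1. In echo32 the buffer needs no explicit release, which is the property that makes local buffers convenient alongside CATCH and THROW.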