19.2 Strings, Arrays and Pointers
A string is really an array of characters. It is stored somewhere in memory and is given an end marker which standard library functions recognize as the end of the string. The end marker is called the zero (or null) byte because it is just a byte which contains the value zero: '\0'. Programs rarely get to see this end marker, as most functions which handle strings use it or add it automatically.
Strings can be declared in two main ways: one is as an array of characters, the other is as a pointer to some pre-assigned array. Perhaps the simplest way of seeing how C stores arrays is to give an extreme example which would probably never be used in practice. Think of how a string called string might be used to store the message "Tedious!". The fact that a string is an array of characters might lead you to write something like:
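#include <stdio.h>

#define LENGTH 9      /* 8 characters plus the end marker; no semicolon here */

int main (void)

{ char string[LENGTH];

  string[0] = 'T';
  string[1] = 'e';
  string[2] = 'd';
  string[3] = 'i';
  string[4] = 'o';
  string[5] = 'u';
  string[6] = 's';
  string[7] = '!';
  string[8] = '\0';   /* the end marker */

  printf ("%s", string);
  return 0;
}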
This method of handling strings is perfectly acceptable, if there is time to waste, but it is so laborious that C provides a special initialization service for strings, which bypasses the need to assign every single character with a new assignment. There are six ways of assigning constant strings to arrays. (A constant string is one which is actually typed into the program, not one which is typed in by the user.) They are written into a short compilable program below. The explanation follows.
/**********************************************************/
/* */
/* String Initialization */
/* */
/**********************************************************/
#include <stdio.h>

char *global_string1 = "Declared as a pointer";   /* declaration missing from the source; string text assumed */
char global_string2[] = "Declared as an array";

int main (void)

{ char *auto_string = "initializer...";
  static char *stat_strng = "initializer...";
  static char statarraystr[] = "initializer....";

  /* char arraystr[] = "initializer...."; IS ILLEGAL!    */
  /* (in pre-ANSI C) because the array is an "auto" type */
  /* which cannot be preinitialized, but...              */

  char arraystr[20];

  printf ("%s %s\n", global_string1, global_string2);
  printf ("%s %s %s\n", auto_string, stat_strng, statarraystr);
  return 0;
}

/* end */
The details of what goes on with strings can be difficult to get to grips with. It is a good idea to revise pointers and arrays before reading the explanations below. Notice the diagrams too: they are probably more helpful than words.
The first of these assignments is a global, static variable. More correctly, it is a pointer to a global, static array. Static variables are assigned storage space in the body of a program when the compiler creates the executable code. This means that they are saved on disk along with the program code, so they can be initialized at compile time. That is the reason for the rule, enforced by older (pre-ANSI) compilers, which says that only static arrays can be initialized with a constant expression in a declaration. The first statement allocates space for a pointer to an array. Notice that, because the string which is to be assigned to it is typed into the program, the compiler can allocate space for that in the executable file too. In fact the compiler stores the string, adds a zero byte to the end of it and assigns a pointer to its first character to the variable called global_string1.
The second statement works almost identically, except that this time the compiler sees the declaration of a static array which is to be initialized. Notice that there is no size declaration in the square brackets. This is quite legal: the compiler counts the number of characters in the initialization string and allocates just the right amount of space, filling the string into that space along with its end marker as it goes. Remember also that the name of the array, when used in an expression, stands for a pointer to the first character, so in use the two methods behave almost identically. The difference is that the first creates a separate pointer variable which can later be pointed elsewhere, whereas the second creates only the array itself.
The third expression is the same kind of thing, only this time the declaration is inside the function main() so the type is not static but auto. The difference between this and the other two declarations is that this pointer variable is created every time the function main() is called. It is new each time, and the same thing holds for any other function in which it might have been defined: when the function is called, the pointer is created and when it ends, it is destroyed. The string which initializes it is stored in the executable file of the program (because it is typed into the text). The compiler produces a pointer to the string's first character and uses that as the value with which to initialize the pointer variable. This is a slightly roundabout way of defining the string constant. The normal thing to do would be to declare the string pointer as being static, but this is just a matter of style. In fact this is what is done in the fourth example.
The fifth example is again identical in practice to the other static types, but is written as an 'open' array with an unspecified size.
The sixth example is forbidden! The reason for this might seem rather trivial, but it is made in the interests of efficiency. The array declared is of type auto: this means that the whole array is created when the function is called and destroyed afterwards. On older (pre-ANSI) compilers, auto arrays cannot be initialized with a string because they would have to be re-initialized every time the array was created: that is, each time the function was called. ANSI C and later standards lift this restriction and simply accept the cost of copying the characters in on every call. The final example could be used to overcome the restriction, if the programmer were inclined to do so. Here an auto array of characters is declared (with a size this time, because there is nothing for the compiler to count the size of). There is no single assignment which will fill this array with a string though: the programmer has to do it character by character, or call a library function such as strcpy() which does the same, so that the inefficiency is made as plain as possible!