Let’s look at an example shown in Listing 9-4. The expression in the third line will be computed as follows:
1. The value of i will be converted to float (of course, the variable itself will not change);
2. This value is added to the value of f, the resulting type is float again; and
3. This result is converted to double to be stored in d.
Listing 9-4. int_float_conv.c
int i;
float f;
double d = f + i;
None of these operations is free: each conversion is encoded as assembly instructions. So whenever you operate on numbers of different formats, you probably incur a runtime cost. Try to avoid this, especially in loops.
9.1.5 Pointers
Given a type T, one can always construct a type T*. This new type corresponds to data units which hold the address of another entity of type T.
On our target platform all addresses have the same size, so all pointer types have the same size as well. This size depends on the architecture; in our case it is 8 bytes.
Using the operators & and * one can take the address of a variable or dereference a pointer (look into the memory at the address this pointer stores). Listing 9-5 shows an example.
In section 2.5.4 we discussed a subtle problem: if a pointer is just an address, how do we know the size of the data entity we are trying to read starting from this address? In assembly, it was straightforward: either the size could be deduced from the fact that two mov operands should have the same size, or the size had to be given explicitly, for example, mov qword [rax], 0xABCDE. Here the type system takes care of it: if a pointer is of type int*, we know for sure that dereferencing it produces a value of size sizeof(int).
Listing 9-5. ptr_deref.c
int x = 10;
int* px = &x; /* Took address of `x` and assigned it to `px` */
*px = 42; /* We modified `x` here! */
printf( "*px = %d\n", *px ); /* outputs: '*px = 42' */
printf( "x = %d\n", x ); /* outputs: 'x = 42' */
When you program in C, pointers are your bread and butter. As long as you do not introduce a pointer to non-existing data, pointers will serve you well.
A special pointer value is 0. When used in pointer context (specifically, comparison with 0), 0 signifies “a special value for a pointer to nowhere.” In place of 0 you can also write NULL, and you are advised to do so. It is a common practice to assign NULL to the pointers which are not yet initialized with a valid object address, or return NULL from functions returning an address of something to make the caller aware of an error.
■
Is zero a zero? There are two contexts in which you might use the expression 0 in C. The first context expects just a normal integer number. The second one is a pointer context, when you assign 0 to a pointer or compare a pointer with 0. In the second context 0 does not always mean an integer value with all bits cleared, but it will always be equal to this “invalid pointer” value. On some architectures it can be, for example, a value with all bits set. But this code will work no matter the architecture because of this rule:
int* px = ... ;
if ( px ) /* if `px` is not NULL */
if ( px == 0 ) /* same thing as the following: */
if (!px ) /* if `px` is NULL */
There is a special kind of pointer type: void*. This is a pointer to data of any kind. C allows us to assign a pointer to any type of data to a variable of type void*; however, this variable cannot be dereferenced. Before dereferencing, we need to take its value and convert it to a legitimate pointer type (e.g., int*). A simple cast is used to do it (see section 9.1.2). Listing 9-6 shows an example.
Listing 9-6. void_deref.c
int a = 10;
void* pa = &a;
printf("%d\n", *( (int*) pa) );
You can also pass a pointer of type void* to any function that accepts a pointer to some other type.
Pointers have many purposes, and we are going to list a few of them.
• Changing a variable created outside a function.
• Creating and navigating complex data structures (e.g., linked lists).
• Calling functions through pointers: by changing the pointer we switch between the functions being called. This allows for pretty elegant architectural solutions.
Pointers are closely tied with arrays, which are discussed in the next section.
9.1.6 Arrays
In C, an array is a data structure that holds a fixed number of elements of the same type. So, to work with an array we need to know its start, the size of a single element, and the number of elements that it can store. Refer to Listing 9-7 to see several variations of array declaration.
Listing 9-7. array_decl.c
/* This array's size is computed by compiler */
int arr[] = {1,2,3,4,5};
/* This array is initialized with zeros, its size is 256 bytes */
long array[32] = {0};
As the number of elements should be fixed, it cannot be read from a variable.3 To allocate memory for arrays whose dimensions we do not know in advance, memory allocators are used (which are not always at your disposal, for example, when programming kernels). We will learn to use the standard C memory allocator (malloc / free) and will even write our own.
You can address elements by index. Indices start from 0. The origin of this convention lies in the nature of the address space: the zeroth element is located at the array’s starting address plus 0 times the element size.
Listing 9-8 shows an array declaration, two reads and one write.
Listing 9-8. array_example_rw.c
int myarray[1024];
int y = myarray[64];
int first = myarray[0];
myarray[10] = 42;
If we think for a bit about the C abstract machine, arrays are just contiguous memory regions holding data of the same type. At runtime, no information about the element type or the array length is stored alongside the data. It is fully the programmer’s responsibility to never address an element outside an allocated array.
Whenever you write an array’s name in an expression, you are, in most contexts, actually referring to its address. You can think about it as a constant pointer value. Here is the place where the analogy between assembly labels and variables is the strongest. So, in Listing 9-8, the expression myarray is treated as a value of type int*, because it is converted to a pointer to the first array element (a notable exception is sizeof myarray, which yields the size of the whole array).
It also means that an expression *myarray will be evaluated to its first element, just as myarray[0].
3 Until C99; but even nowadays variable length arrays are discouraged by many because if the array is big enough, the stack will not be able to hold it and the program will be terminated.
9.1.7 Arrays as Function Arguments
Let’s talk about functions accepting arrays as arguments. Listing 9-9 shows a function returning the first array element (or -1 if the array is empty).
Listing 9-9. fun_array1.c
int first (int array[], size_t sz ) {
    if ( sz == 0 ) return -1;
    return array[0];
}
Unsurprisingly, the same function can be rewritten keeping the same behavior, as shown in Listing 9-10.
Listing 9-10. fun_array2.c
int first (int* array, size_t sz ) {
    if ( sz == 0 ) return -1;
    return *array;
}
But that’s not all. You can actually mix these and use the indexing notation with pointers, as shown in Listing 9-11.
Listing 9-11. fun_array3.c
int first (int* array, size_t sz ) {
    if ( sz == 0 ) return -1;
    return array[0];
}
The compiler immediately adjusts a parameter declaration such as int array[] to the pointer declaration int* array, and then works with it as such. Syntactically, however, you can still specify the array length, as shown in Listing 9-12. This number suggests that the given array should have at least that many elements. However, the language treats it as a comment: no runtime or compile-time checks are required.
Listing 9-12. array_param_size.c
int first( int array[10], size_t sz ) { ... }
C99 introduced a special syntax that is essentially a promise you give to the compiler: the corresponding array will have at least that many elements. It allows the compiler to perform some specific optimizations based on this assumption. Listing 9-13 shows an example.
Listing 9-13. array_param_size_static.c
int fun(int array[static 10] ) {...}
9.1.8 Designated Initializers in Arrays
C99 introduces an interesting way to initialize the arrays. It is possible to implicitly initialize an array to default values except for those on several designated positions, for which other values are provided. For example, to initialize an array of eight int elements to all zeros, except for the indices 1 and 5 which will hold values 15 and 29, respectively, the following code might be used:
int a[8] = { [5] = 29, [1] = 15 };
The initialization order is irrelevant. It is often useful to use enum values or character values as indices.
9.1.9 Type Aliases
You can define your own types using existing types via the typedef keyword.
The code shown in Listing 9-15 creates a new type mytype_t. It is equivalent to unsigned short int in everything except its name. These two types are fully interchangeable (unless someone later changes the typedef).
Listing 9-15. typedef_example.c
typedef unsigned short int mytype_t;
You can see the suffix _t in type names quite often. All names ending with _t are reserved by the POSIX standard.4 This way newer standards will be able to introduce new types without the fear of colliding with types in existing projects. So, defining your own type names with this suffix is discouraged. We will speak about practical naming conventions later.
What are these new types for?
1. Sometimes they improve the ease of reading code.
2. They may enhance portability, because to change the format of all variables of your custom type you only need to change the typedef.
3. Types are essentially another way of documenting a program.
4. Type aliases are extremely useful when dealing with function pointer types because of their cumbersome syntax.
4 POSIX is a family of standards specified by the IEEE Computer Society. It includes the description of utilities, application programming interface (API), etc. Its purpose is to ease the portability of software, mostly between different branches of UNIX-derived systems.
A very important example of a type alias is size_t. This is a type defined in the language standard (it requires including one of the standard library headers, for example, #include <stddef.h>). Its purpose is to hold array lengths and array indices. On Intel 64 Linux systems it is usually an alias for unsigned long and thus an unsigned 8-byte integer.
■