Computational Physics
Wandering along, and down, lists and trees
Comments and questions to John Rowe.
Lists, etc.
Allocating memory is the easy bit and in reality is just the first of three tasks:
1. Getting more memory.
2. Keeping track of the memory we have obtained.
3. Releasing it when we have finished with it.
We have already covered 1 (malloc) and 3 (free); this section is about 2.
The problem: keeping track of memory
To remind ourselves: we can call malloc as often as we like to give ourselves more and more structures, but there's a problem. Once we have lost the value of a pointer to memory allocated by malloc there is no way to find it out again. So how do we remember where all our structures are?
The following code illustrates this problem: it allocates memory for n structures but promptly forgets where the first n-1 are stored:

    for (i = 0; i < n; ++i)
        p = malloc(sizeof *p);

The first n-1 structures are thus inaccessible: we have allocated the memory but forgotten where it was.
Last week we saw one way of addressing this problem, a dynamically allocated array of pointers. This week we shall look at another: linked lists.
[Class exercise.]
Linked lists
One answer to this problem is to make a list of pointers. To make a list we need to add a new element to our structure: a pointer to the next one. The pointer is NULL if there is no next one, ie this element is the last one in the list. Consider the following structure to represent a particle:

    typedef struct particle {
        double position[NDIMS];
        double velocity[NDIMS];
        double mass;
        struct particle *next;
    } Particle;

    Particle *firstparticle = NULL;

The members all make sense except for the final one: a pointer to another structure of the same type. This new pointer, next, is for our internal purposes only; it holds no information about the actual particle.
Creating a linked list
Initially firstparticle is NULL. It is vitally important that firstparticle is initialised to NULL. This happens automatically for external ("global") variables, as in this example, but would have to be done explicitly if firstparticle were declared inside a function.

Now consider the effect of the following functions:

    Particle *allocateparticle() {
        Particle *p = xmalloc(sizeof *p);
        int i;

        printf("Please enter the %d particle coordinates\n", NDIMS);
        for (i = 0; i < NDIMS; ++i)
            scanf("%lg", &p->position[i]);
        printf("Please enter the %d velocity components\n", NDIMS);
        for (i = 0; i < NDIMS; ++i)
            scanf("%lg", &p->velocity[i]);
        printf("And finally the mass\n");
        scanf("%lg", &p->mass);
        return p;
    }

    void newparticle() {
        Particle *p = allocateparticle();
        p->next = firstparticle;
        firstparticle = p;
    }

The first function, allocateparticle(), just allocates the memory, reads in the data for the particle and returns the address of the new particle. The interesting bit is in the (very short) function newparticle(). After the first call, firstparticle is no longer NULL: it is the address of the first particle. But firstparticle->next is NULL.
Were we to call newparticle() again, firstparticle would now be the newer particle, firstparticle->next the original particle and firstparticle->next->next NULL. We are building ourselves a list of particles.
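The build-up described above can be sketched with a cut-down structure. This is only an illustration: Node, first and add_mass() are hypothetical names standing in for Particle, firstparticle and newparticle(), and the sketch treats running out of memory as fatal rather than reporting it properly.

```c
#include <assert.h>
#include <stdlib.h>

/* Cut-down stand-in for Particle: one data member plus the link. */
typedef struct node {
    double mass;
    struct node *next;
} Node;

Node *first = NULL;          /* head of the list; empty to start with */

/* Add a node to the front of the global list, as newparticle() does. */
void add_mass(double mass)
{
    Node *n = malloc(sizeof *n);

    assert(n != NULL);       /* sketch only: real code should report the failure */
    n->mass = mass;
    n->next = first;         /* the old head (possibly NULL) follows the new node */
    first = n;               /* the new node becomes the head */
}
```

Calling add_mass() with 1, 2 and then 3 leaves the list in the order 3, 2, 1: each new node goes in front of the ones added before it.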
Non-global linked lists
As firstparticle is a global our newparticle() function is able to change it internally. Often, however, firstparticle may be declared inside a function such as main() in which case newparticle() won't be able to change it and we would just move that code into main().
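A further option, not the one taken in these notes, is to pass the function the address of the list's first pointer so that it can update it on the caller's behalf. The following is a minimal sketch under that assumption; Node and push_front() are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    double mass;
    struct node *next;
} Node;

/* Prepend a node to the list whose head pointer lives at *headp.
 * Returns 1 on success, 0 if malloc fails (the list is left unchanged). */
int push_front(Node **headp, double mass)
{
    Node *n;

    assert(headp != NULL);   /* we need somewhere to store the new head */
    n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->mass = mass;
    n->next = *headp;        /* old head follows the new node */
    *headp = n;              /* caller's head pointer now points at the new node */
    return 1;
}
```

A caller with a local variable "Node *list = NULL;" would then write push_front(&list, 1.5).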
Another possibility is that we may not have a single list of particles, but may have a collection of systems each with its own set of particles:

    typedef struct system {
        double temperature;
        Particle *particles;
        struct system *next;
    } System;

    System *firstsystem = NULL;

Note that in this more complicated situation our primary list is the list of Systems each of which has its own list of Particles. We would not have a global variable called firstparticle as there is now no single global particle list.
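To make the two-level arrangement concrete, here is a rough sketch that adds particles to individual systems and then visits every particle by walking the list of systems. The structures are cut down to the members needed, and add_to_system() and count_all_particles() are hypothetical helpers, not part of the course code.

```c
#include <assert.h>
#include <stddef.h>

typedef struct particle {
    double mass;
    struct particle *next;
} Particle;

typedef struct system {
    double temperature;
    Particle *particles;     /* this system's own list */
    struct system *next;     /* next system in the primary list */
} System;

/* Put an existing particle at the front of one system's own list. */
void add_to_system(System *sys, Particle *p)
{
    assert(sys != NULL && p != NULL);
    p->next = sys->particles;
    sys->particles = p;
}

/* Walk the list of systems and, inside each, its list of particles. */
int count_all_particles(const System *firstsys)
{
    const System *s;
    const Particle *p;
    int n = 0;

    for (s = firstsys; s != NULL; s = s->next)
        for (p = s->particles; p != NULL; p = p->next)
            ++n;
    return n;
}
```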
Printing it out
Given the following function that prints out the details of a single particle:

    void printparticle(Particle *p) {
        int i;

        printf("Mass: %g\n", p->mass);
        printf("Position (%g", p->position[0]);
        for (i = 1; i < NDIMS; ++i)
            printf(", %g", p->position[i]);
        printf(")\n");
        printf("Velocity (%g", p->velocity[0]);
        for (i = 1; i < NDIMS; ++i)
            printf(", %g", p->velocity[i]);
        printf(")\n\n");
    }

we may use the following simple loop to print out every particle:

    Particle *p;

    for (p = firstparticle; p != NULL; p = p->next)
        printparticle(p);

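The same loop shape works for any job that has to visit every member of the list. As a sketch, assuming a cut-down Node structure and the hypothetical functions total_mass() and heaviest():

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    double mass;
    struct node *next;
} Node;

/* Sum the masses of every node; an empty list gives 0. */
double total_mass(const Node *first)
{
    const Node *p;
    double sum = 0.0;

    for (p = first; p != NULL; p = p->next)
        sum += p->mass;
    return sum;
}

/* Return the node with the largest mass; the list must not be empty. */
const Node *heaviest(const Node *first)
{
    const Node *p, *best = first;

    assert(first != NULL);
    for (p = first->next; p != NULL; p = p->next)
        if (p->mass > best->mass)
            best = p;
    return best;
}
```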
Ordering the creation of our list
The previous code creates our list in reverse order: the first particle in the list, and the first printed, is the last input. The following function creates the list in order of mass, from the most massive down to the least massive:

    void newparticle_sorted() {
        Particle *p = allocateparticle();

        /* Where do we put it? */
        if ( firstparticle == NULL || firstparticle->mass < p->mass) {
            p->next = firstparticle;
            firstparticle = p;
        }
        else {
            Particle *heavier = firstparticle;

            /* Stop at the last particle that is heavier than the new one */
            while (heavier->next != NULL && heavier->next->mass > p->mass)
                heavier = heavier->next;
            p->next = heavier->next;
            heavier->next = p;
        }
    }

Note how we have carefully checked that firstparticle was not NULL before looking at firstparticle->mass. After that we just look for the last particle that is heavier than the new one and slot the new particle in after that.
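The search for "the last particle that is heavier than the new one" is easy to get subtly wrong, so it is worth trying on a small case. The following is a minimal sketch, assuming a cut-down Node structure and a hypothetical insert_by_mass() that returns the new head rather than changing a global.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    double mass;
    struct node *next;
} Node;

/* Insert n into a list kept in decreasing order of mass; return the new head. */
Node *insert_by_mass(Node *first, Node *n)
{
    Node *prev;

    assert(n != NULL);
    if (first == NULL || first->mass < n->mass) {
        n->next = first;     /* n is the heaviest so far: it goes first */
        return n;
    }
    /* Advance while the NEXT node is still heavier than n, so that prev
     * ends up as the last node that should come before n. */
    prev = first;
    while (prev->next != NULL && prev->next->mass > n->mass)
        prev = prev->next;
    n->next = prev->next;
    prev->next = n;
    return first;
}
```

Inserting masses 5, 10, 1 and 7 in that order gives the list 10, 7, 5, 1. Note that the loop tests prev->next->mass, not prev->mass: testing the current node instead would step one place too far.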
Removing and freeing a member of a list
This introduces an extra complication: as well as freeing the structure itself we also have to remove it from the list. To remove the first member of a list we simply set the new first member to old_first->next and free old_first. For every member except the first we find its predecessor and change its next pointer to the next of the one we want to free:
Initial list

    A -> B -> C -> D -> NULL

After removing 'B'

    A -> C -> D -> NULL
    free(B)

After removing 'A'

    C -> D -> NULL
    free(A)

This process is fairly close to being the reverse of the process of creating the list in order.

The following code does this and returns the new first particle (which may well be the same as the old first particle):

    /*
     * Free Particle "togo" and return the new first Particle
     * "first" must be the existing first member of the list
     */
    Particle *freeparticle(Particle *togo, Particle *first) {
        Particle *newfirst = first;

        if (togo == NULL)
            return newfirst;
        if (togo == first)
            newfirst = first->next;
        else {
            /* Find its predecessor */
            while (first != NULL && first->next != togo)
                first = first->next;
            if (first == NULL) {
                fprintf(stderr, "%p isn't in the list!\n", (void *) togo);
                return newfirst;
            }
            first->next = togo->next;
        }
        free(togo);
        return newfirst;
    }

This form is suitable either for a global linked list:

    firstparticle = freeparticle(thisone, firstparticle);

Or the "Systems" example:

    sys->particles = freeparticle(thisone, sys->particles);

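Freeing an entire list is a related job with one trap: a node's next pointer must be read before the node is freed, because the node's contents may not be used afterwards. A minimal sketch, assuming a cut-down Node structure; push() and free_list() are hypothetical helpers, and push() treats allocation failure as fatal purely to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    double mass;
    struct node *next;
} Node;

/* Helper for the example: prepend a node and return the new head. */
Node *push(Node *first, double mass)
{
    Node *n = malloc(sizeof *n);

    assert(n != NULL);       /* sketch only: real code should report the failure */
    n->mass = mass;
    n->next = first;
    return n;
}

/* Free every node in the list; returns the number of nodes freed. */
int free_list(Node *first)
{
    int count = 0;

    while (first != NULL) {
        Node *next = first->next;   /* save this before freeing the node */
        free(first);
        first = next;
        ++count;
    }
    return count;
}
```

After the call the caller should set its own first pointer to NULL, since it still holds the address of freed memory.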
Recursion
To iterate is human, to recurse divine. Computer Science saying
God is subtle but he is not malicious. Einstein
Functions that call themselves
Recursion is an extremely simple concept: a function simply calls itself. Of course, a function can't always call itself, that would create an infinite loop, so all recursive functions have some sort of test before they call themselves.

    void myfun(some_args) {
        /* Do some stuff */
        if ( some_test)
            myfun(some_other_args);
    }

A variation on the above is to do the test inside the function:

    void myfun(some_args) {
        if (I_didnt_need_to_be_called)
            return;
        /* Do some stuff */
        myfun(some_other_args);
    }

or you may need a combination of the two.
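As a small concrete instance of the second form, where the function tests on entry whether it needed to be called, here is a sketch that raises a number to a non-negative integer power. The function power() is a hypothetical example chosen for brevity; a loop would do the same job.

```c
#include <assert.h>

/* x raised to the non-negative integer power n, by recursion. */
double power(double x, int n)
{
    assert(n >= 0);          /* a negative n would never reach the stopping test */
    if (n == 0)              /* the test that stops the recursion */
        return 1.0;
    return x * power(x, n - 1);
}
```

Each call reduces n by one, so the test n == 0 is certain to be reached eventually. That guarantee is what every recursive function needs.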
Example: genealogy
Last week we defined the following structure to represent a person and their children:

    typedef long long ID;

    typedef struct person {
        char *surname;
        char **forenames;
        struct person **children;
        int n_children;      /* Number of children */
        int childbuf_size;   /* Size of children buffer */
        struct person *parents[parent_max];
        Gender gender;
    } Person;

We can then define the following function to print out the details of one person:

    void printperson(Person *p) {
        int j;

        printf("%s,", p->surname);
        for (j = 0; p->forenames[j] != NULL; ++j)
            printf(" %s", p->forenames[j]);
        printf(" (%s), %d children\n", gender_name[p->gender], p->n_children);
    }

To print a person and their descendants
If we wished to print out a person and all of their descendants the algorithm would look like this:
1. Print their name and other details.
2. For each child:
   Print this person (the child) and their descendants.

(Notice that in this case the test to call itself is an implicit one: the list of children might be of zero length.) The function looks like this:

    /*
     * Print a person and their descendants
     */
    void printdescendants(Person *p, int depth) {
        int sp, child;

        if (p == NULL)
            return;
        for (sp = 0; sp < depth; ++sp)
            fputs(" ", stdout);
        printperson(p);
        for (child = 0; child < p->n_children; ++child)
            printdescendants(p->children[child], depth + 1);
    }

There are a couple of things to notice before the main part of the function. The first is that we check that the Person argument is not NULL. This is pure paranoia and hence is extremely wise.
Second we print out some spaces before the name to indicate the depth.
Then it's simple: print this person's details and call the whole function for each of this person's children, increasing the depth parameter by one.
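The same descend-into-each-child pattern can return a value instead of printing. As a rough sketch, assuming a cut-down structure TNode that keeps only the children array and its length, the hypothetical function count_descendants() counts everybody below a given person:

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for Person: only what the traversal needs. */
typedef struct tnode {
    int n_children;
    struct tnode **children;
} TNode;

/* Count children, grandchildren and so on below p (p itself is not counted). */
int count_descendants(const TNode *p)
{
    int child, total = 0;

    if (p == NULL)           /* the same paranoid check as printdescendants() */
        return 0;
    assert(p->n_children == 0 || p->children != NULL);
    for (child = 0; child < p->n_children; ++child)
        if (p->children[child] != NULL)
            total += 1 + count_descendants(p->children[child]);
    return total;
}
```

As with printdescendants(), a person with no children stops the recursion implicitly: the loop body simply never runs.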
When recursion goes wrong
The biggest problem is the internal test or tests: either the recursion never stops or some possibilities get missed out. Consider the following rather over-the-top way of calculating a factorial:

    int factorial(int i) {
        if ( 1 > 0 ) // Bug!
            return i * factorial(i - 1);
        return 1;
    }

We can see what it's meant to do but we've written '1' instead of 'i' so it will go on for ever (or, in practice, until the program runs out of stack). The first stage of debugging is just to print out its arguments:

    int factorial(int i) {
        fprintf(stderr, "i: %d\n", i);
        if ( 1 > 0 ) // Bug!
            return i * factorial(i - 1);
        return 1;
    }

We can get even more details by following the example of the "descendants" function above and adding an extra debugging argument to tell us the depth:

    int factorial(int i, int depth) {
        int d;

        for (d = 0; d <= depth; ++d)
            fputs(" ", stderr);
        fprintf(stderr, "i: %d (depth %d)\n", i, depth);
        if ( depth > 12) {
            fprintf(stderr, "Too deep!\n");
            exit(-1);
        }
        if ( 1 > 0 ) // Bug!
            return i * factorial(i - 1, depth + 1);
        return 1;
    }

Here we have done two things:
1. We have added an extra 'depth' argument which we increment by one every time.
2. We have indented the output according to the depth.

The latter isn't necessary but it is nice.
Be prepared to add some debugging output to your recursive routine.
Using a wrapper
It may be very inconvenient to add an extra argument to your routine because it is called elsewhere, there are header files, etc. In that case give the debugging version a different name, such as factorial_debug(), and create a simple wrapper with the original name:

    int factorial(int i) {
        return factorial_debug(i, 0);
    }

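Putting the pieces together, the following is a sketch of the debugging version with the test corrected ('i', not '1') and the wrapper on top. MAX_DEPTH is a hypothetical name for the depth limit; the value 12 is used because 12! is the largest factorial that fits in a 32-bit int.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_DEPTH 12   /* hypothetical limit, see the note above */

/* Factorial with debugging output; depth counts the nested calls. */
int factorial_debug(int i, int depth)
{
    int d;

    assert(i >= 0);
    for (d = 0; d < depth; ++d)
        fputs("  ", stderr);                    /* indent by depth */
    fprintf(stderr, "i: %d (depth %d)\n", i, depth);
    if (depth > MAX_DEPTH) {
        fprintf(stderr, "Too deep!\n");
        exit(EXIT_FAILURE);
    }
    if (i > 1)                                  /* the corrected test */
        return i * factorial_debug(i - 1, depth + 1);
    return 1;
}

/* Wrapper: existing callers keep the one-argument interface. */
int factorial(int i)
{
    return factorial_debug(i, 0);
}
```

The debugging output goes to stderr, so it can be redirected or discarded without disturbing the program's normal output.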
Hybrid structures
With our particles/systems example we saw two ways of listing all of our particles. In the first example, where there were no systems, we simply had a large list of particles. When we introduced systems we gave each system its own individual list of particles and there was no global particle list any more.
Thus, we don't always need to have a list of everything, often we can construct a list of "top" items and descend their trees of children or "owned" objects.
In our previous example of a genealogy, it's clear that almost everybody will have a known biological mother. We may therefore choose to restrict our "people" list to people who do not have a known biological mother. If we write a variation on the previous function so that it only prints a person's descendants if she is female we have a way of reaching everybody.
The following modification to our newperson() function only adds the new person to the top-level list if their mother is unknown or not female. (We have renamed our npeople variable n_motherless.)

    /*
     * If necessary, add to list of motherless people
     */
    if (mother == NULL || mother->gender != female) {
        if (n_motherless == personbuf_size ) {
            personbuf_size += 128;
            people = realloc(people, personbuf_size * sizeof *people);
            if (people == NULL) {
                fprintf(stderr, "Out of memory!\n");
                exit(99);
            }
        }
        people[n_motherless] = pers;
        ++n_motherless;
    }

The following function prints a person and, if she is female, her children:

    /*
     * Print a person and, if they are female, their descendants
     */
    void printby_mother(Person *p, int depth) {
        int sp, child;

        if (p == NULL)
            return;
        for (sp = 0; sp < depth; ++sp)
            fputs(" ", stdout);
        printperson(p);
        if (p->gender == female)
            for (child = 0; child < p->n_children; ++child)
                printby_mother(p->children[child], depth + 1);
    }

And the following snippet then prints out every person:

    for (i = 0; i < n_motherless; ++i)
        printby_mother(people[i], 0);

It's extremely common to have some aspect of the actual structure of our objects reflected in the way we store and access them.
Bitwise operators
It's unlikely you will have a lot of use for bitwise operators (operators that operate on the individual bits of a variable) but if you do, here they are:

    Operator  Description
    &         Bitwise and (both bits are one)
    |         Bitwise or (either or both bits are one)
    ^         Bitwise exclusive or (just one bit is one)
    <<        Left shift (multiplies by power of 2)
    >>        Right shift (divides by power of 2)
    ~         One's complement

NB the One's complement operator takes a single argument, the others two.
Examples
We take the example of an unsigned char (one byte, ie 8 bits). Curiously, until the C23 standard there was no standard way to write numbers as binary in C; the nearest is hexadecimal, but for clarity we shall show the calculation in binary.

    Expression             Calculation

    00110011 & 11110000    00110011
                           11110000
                           --------
                           00110000

    00110011 | 11110000    00110011
                           11110000
                           --------
                           11110011

    00110011 ^ 11110000    00110011
                           11110000
                           --------
                           11000011

    10110011 << 2          10110011
                           --------
                           11001100

    11110000 >> 3          11110000
                           --------
                           00011110

    ~11110000              11110000
                           --------
                           00001111

Binary flags
The most common use of bitwise operators is to define flags, ie options that are either on or off. Typically there's a header file with various constants each of which is an exact power of 2 (ie has only one bit set):

    #define OPTION1 0x01 /*  1 binary 00000001 */
    #define OPTION2 0x02 /*  2 binary 00000010 */
    #define OPTION3 0x04 /*  4 binary 00000100 */
    #define OPTION4 0x08 /*  8 binary 00001000 */
    #define OPTION5 0x10 /* 16 binary 00010000 */
    #define OPTION6 0x20 /* 32 binary 00100000 */

The code sets, unsets and tests options as follows:

    unsigned int flags;

    int main(void) {
        flags |= OPTION3;   /* Set OPTION3 */
        flags &= ~OPTION4;  /* Unset OPTION4 */
        if ( flags & OPTION3)
            printf("Option 3 is set\n");
    }

The setting and testing of flags is fairly clear; the unsetting of OPTION4 is a little more complicated. OPTION4, like all options, has just one bit set, so ~OPTION4 has every bit except that one set. So flags &= ~OPTION4 has the following effect:
1. The bit that was one in OPTION4 is zero in ~OPTION4, so the bitwise and for that bit must always be zero.
2. Each bit that was zero in OPTION4 is one in ~OPTION4, so the bitwise and for that bit will be one if and only if the corresponding bit in flags was one. That is to say, it is not changed.
Example in binary

    00111011 & ~00001000

    ~00001000 = 11110111

    00111011
    11110111
    --------
    00110011

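The three operations can be wrapped in small functions, which also gives somewhere to check that an option really does have just one bit set. This is only a sketch: set_flag(), clear_flag(), flag_is_set() and is_single_bit() are hypothetical helpers, and the two OPTION constants repeat the values from the header shown earlier with an unsigned suffix added.

```c
#include <assert.h>

#define OPTION3 0x04u   /* binary 00000100 */
#define OPTION4 0x08u   /* binary 00001000 */

/* True if exactly one bit of opt is set, as for the OPTION constants. */
static int is_single_bit(unsigned int opt)
{
    return opt != 0 && (opt & (opt - 1)) == 0;
}

/* Return flags with opt switched on. */
unsigned int set_flag(unsigned int flags, unsigned int opt)
{
    assert(is_single_bit(opt));
    return flags | opt;
}

/* Return flags with opt switched off; every other bit is left alone. */
unsigned int clear_flag(unsigned int flags, unsigned int opt)
{
    assert(is_single_bit(opt));
    return flags & ~opt;
}

/* Return 1 if opt is on in flags, otherwise 0. */
int flag_is_set(unsigned int flags, unsigned int opt)
{
    assert(is_single_bit(opt));
    return (flags & opt) != 0;
}
```

clear_flag(0x3Bu, OPTION4) reproduces the binary example: 00111011 becomes 00110011. One thing to watch when testing flags by hand is that == binds more tightly than &, so "flags & OPTION3 == OPTION3" means "flags & (OPTION3 == OPTION3)"; write "(flags & OPTION3) == OPTION3" if a comparison is wanted.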
Example
In our "people" example we might want to add a flags variable to our structure:

    unsigned long flags;

We might then have:

    #define FLAG_DECEASED 0x01

and a function to write a letter to somebody would start with the statement:

    if ( person->flags & FLAG_DECEASED )
        return; /* Do not write to dead people! */
