Storing the Data: the for Loop

Previously we coded up a while loop for reading in the lines of the weblog file. Whilst there is nothing wrong with this, we now have more information than we did then: the number of lines in the file. In this situation it seems logical to use a for loop instead of the while loop that we coded up in the last lesson.

Let us assume that we have a function called getIP for extracting the IP address from a line of the weblog and a function called getHTTP for extracting the HTTP code. We will now replace the original while loop (and the character buffer we defined) with a for loop and a set of dynamically defined character buffers to store the lines read in from the file.

A for loop is similar to a while loop except that it has three parts inside the parentheses, separated by semicolons, instead of just a single condition to be checked. The first part is a statement which initialises the for loop and runs once, before the first iteration (usually it sets some counter to zero, or some other starting value). The second part is a condition which is checked before each iteration, just like the condition of a while loop (usually this checks to see if the loop counter is still below some predetermined value). The third part is a statement which is executed at the end of each iteration, after the body of the loop has run (this is usually used to increment the loop counter).
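To make the three parts concrete, here is a minimal sketch that writes the same loop twice, once as a for loop and once as the equivalent while loop. The function names sumWithFor and sumWithWhile are made up for this illustration and are not part of the weblog program.

```c
#include <stddef.h>

/* Sum the first n elements with a for loop: the initialiser,
   the condition and the increment all sit inside the parentheses. */
int sumWithFor(const int *values, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += values[i];
    }
    return total;
}

/* The same loop written as a while loop: the three parts are
   still present, but spread around the body. */
int sumWithWhile(const int *values, int n)
{
    int total = 0;
    int i = 0;                 /* part 1: initialisation */
    while (i < n) {            /* part 2: condition      */
        total += values[i];
        i++;                   /* part 3: increment      */
    }
    return total;
}
```

Both functions visit exactly the same elements in the same order; the for loop simply gathers the loop bookkeeping in one place.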

A for loop is often used with arrays, since it can be made to loop a set number of times and so is useful for filling arrays with values, or completing other repetitive tasks where one has access to the number of times the task must execute. The following lines of code implement a for loop which allocates memory for all the logType structures, sets the pointers of our array to point to them in order, and fills the structures with data from the weblog file:

for (int counter = 0; counter < numLines; counter++) {
    /* calloc takes the number of elements first, then the size of each */
    sortArray[counter] = calloc(1, sizeof(logType));
    sortArray[counter]->logline = calloc(500, sizeof(char));
}

Most of this should be self-explanatory. Note that the variable counter is not declared prior to the for loop. Since the C99 standard, one can declare a variable anywhere in a block of a C program (so long as the declaration of the variable occurs before that variable is used), and the initialization part of a for loop is no exception. A variable declared in this way exists only for the duration of the loop.

Notice the arrow in the second statement inside the for loop. This dereferences the pointer (recall our array sortArray is an array of pointers) and allows us to access the field logline of the structure being pointed to.
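As a small illustration, the arrow is shorthand for dereferencing the pointer and then selecting a field with a dot. The sketch below uses a cut-down structure called demoLogType, which is a stand-in invented for this example; only the field name logline comes from our program.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for logType with a single field. */
typedef struct {
    char *logline;
} demoLogType;

/* Returns 1 if entry->logline and (*entry).logline refer to the
   same field, and 0 otherwise (or if memory could not be allocated). */
int arrowMatchesStarDot(void)
{
    demoLogType *entry = calloc(1, sizeof(demoLogType));
    if (entry == NULL) {
        return 0;
    }
    entry->logline = calloc(500, sizeof(char));
    if (entry->logline == NULL) {
        free(entry);
        return 0;
    }
    strcpy(entry->logline, "example line");

    /* The two spellings name the same field of the same structure. */
    int same = (entry->logline == (*entry).logline);

    free(entry->logline);
    free(entry);
    return same;
}
```

The parentheses in (*entry).logline are needed because the dot binds more tightly than the star, which is one reason the arrow form is usually preferred.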

The function calls to getIP and getHTTP pass the address of the ip and http fields, since these values need to be modified by the respective functions. Note that we first make a copy of the logline before sending it to getIP and getHTTP. This is because these functions will make use of the strtok function, which modifies the strings it operates on.
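The following is only a sketch of how the body of the loop might look once the copying and the two function calls are included. Several things here are assumptions made for illustration: the layout of logType (only logline appears in our code so far), the signatures of getIP and getHTTP, the placeholder bodies given to those two functions (the real ones are written in the next lesson), and the name fillArray. To keep the sketch self-contained, the lines come from an array of strings rather than from the weblog file.

```c
#include <stdlib.h>
#include <string.h>

#define LINE_LEN 500

/* Assumed layout: the types of ip and http are guesses. */
typedef struct {
    char *logline;
    char ip[16];
    int http;
} logType;

/* Placeholder: copies the first space-delimited token into ip. */
void getIP(char *line, char *ip)
{
    char *token = strtok(line, " ");
    ip[0] = '\0';
    if (token != NULL) {
        strncpy(ip, token, 15);
        ip[15] = '\0';
    }
}

/* Placeholder: assumes the status code is the first token after
   the quoted request, as in the common log format. */
void getHTTP(char *line, int *http)
{
    char *token = strtok(line, "\"");               /* text before the request */
    if (token != NULL) token = strtok(NULL, "\"");  /* the request itself      */
    if (token != NULL) token = strtok(NULL, " ");   /* the status code         */
    *http = (token != NULL) ? atoi(token) : 0;
}

/* Fills sortArray from lines. Returns numLines on success, or -1 if
   an allocation fails (cleanup is then left to the caller). */
int fillArray(logType **sortArray, const char *lines[], int numLines)
{
    char copy[LINE_LEN];

    for (int counter = 0; counter < numLines; counter++) {
        sortArray[counter] = calloc(1, sizeof(logType));
        if (sortArray[counter] == NULL) return -1;
        sortArray[counter]->logline = calloc(LINE_LEN, sizeof(char));
        if (sortArray[counter]->logline == NULL) return -1;

        /* The buffer is zero-filled, so the last byte stays '\0'. */
        strncpy(sortArray[counter]->logline, lines[counter], LINE_LEN - 1);

        /* strtok overwrites delimiters with '\0', so each function is
           handed a fresh copy and the stored logline stays intact. */
        strcpy(copy, sortArray[counter]->logline);
        getIP(copy, sortArray[counter]->ip);   /* array name gives its address */
        strcpy(copy, sortArray[counter]->logline);
        getHTTP(copy, &sortArray[counter]->http);
    }
    return numLines;
}
```

The copy is refreshed before each call because, in this sketch, both placeholder functions start tokenising from the beginning of the line. Whether the real functions need one copy or two depends on how they are written.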

In the next lesson, we will actually write the getIP and getHTTP functions, and we will construct a function for comparing HTTP codes and IP addresses to be used by our bubble sort algorithm.