
Peter Goodman's blog about PHP, Parsing Theory, C++, Functional Programming, Applications,

PHP4 Iterators Explained

NOTE to PHP5 Users: iterators are built in!
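For readers on PHP5 or later, here is a minimal sketch of what "built in" looks like. It uses ArrayIterator from the SPL, which is not part of this article's code, and the variable names are only for illustration.

```php
<?php
// ArrayIterator ships with PHP 5 and later as part of the SPL.
$people = new ArrayIterator(array(
    array('name' => 'Peter', 'age' => 18),
    array('name' => 'David', 'age' => 30),
));

// foreach accepts any object that implements the built-in Iterator interface.
$names = array();
foreach ($people as $key => $row) {
    $names[$key] = $row['name'];
}

// The same walk, calling the interface methods by hand.
$ages = array();
for ($people->rewind(); $people->valid(); $people->next()) {
    $row = $people->current();
    $ages[$people->key()] = $row['age'];
}

print_r($names);
print_r($ages);
```

The rest of this article builds the PHP4 equivalent by hand, with slightly different method names (hasNext and reset instead of valid and rewind).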

The common PHP application will generally have this layout:

  • Query the database
  • Loop over the database result set, and on each loop, add the result to a large array of data.
  • Loop over that large array of data and format it. (Most applications will skip this step and do the formatting in the previous one.)
  • Set the big formatted array of database information to the template
  • The template engine loops over the array and displays the appropriate thing for each row.
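The layout above can be sketched roughly like this. A hard-coded array stands in for the database result and a plain string stands in for the template; both are assumptions made so the sketch runs on its own.

```php
<?php
// Hypothetical stand-in for a database result set.
$resultSet = array(
    array('name' => 'Peter', 'age' => 18),
    array('name' => 'David', 'age' => 30),
);

// Loop 1: copy every row of the result set into a large array.
$rows = array();
foreach ($resultSet as $row) {
    $rows[] = $row;
}

// Loop 2: format every row.
$formatted = array();
foreach ($rows as $row) {
    $row['name'] = '<strong>' . $row['name'] . '</strong>';
    $formatted[] = $row;
}

// Loop 3: the "template" walks the formatted array and displays each row.
$output = '';
foreach ($formatted as $row) {
    $output .= $row['name'] . ' (' . $row['age'] . ")\n";
}

echo $output;
```

Every row is touched three times, and all rows are copied and formatted before the first one is displayed.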

Woah. That's a lot of looping! Now, before I even tell you how iterators work, I'm going to show you how they can change your application. Here is how all of the above would work with a setup using iterators:

  • Query the database.
  • Use an iterator to deal with the database result set. On each iteration, a row will be taken from the database result.
  • Use another iterator on top of the database result set iterator to format the data. On each iteration, the row will be formatted.
  • Set the data formatting iterator, which has the database result set iterator in it, to the template.
  • The template loops over the iterator. This is the first and only loop: on each iteration the result row is fetched from the database, then formatted, then displayed.

At this point you might be thinking, your list of how the iterator works is longer and more complicated. Not so! You might also not see the benefit of the iterator yet either, so let me explain them in more detail...

An iterator works on the principle of only getting a row (be it a row of an array or database result set, etc) when it is needed. Now that is a pretty ambiguous statement. You might be thinking: that's what my foreach / for loops do! Well, before I get into the code, let's look at how the functions of an iterator should work:

Functions

  • current: returns the current row. It is very important that you realize this: 'current' does not move the iterator; it only hands back the row the iterator is on.
  • next: tells the iterator to go to the next row. If there is another row, it returns TRUE, otherwise it returns FALSE.
  • hasNext: used by 'next'. It returns a boolean saying whether or not another row exists.
  • key: returns the current index of the array / result set that we are on.
  • reset: resets the iterator back to the start of the array / database result set.
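To make the five functions concrete, here is a small sketch of an iterator that has no array or database behind it at all. SquaresIterator is a hypothetical class, not part of this article's code, and it uses the PHP 5+ __construct syntax so that it runs on current interpreters.

```php
<?php
// Hypothetical iterator that yields the squares 1, 4, 9, ... for $limit items.
class SquaresIterator
{
    var $_index = -1; // -1 means "before the first item"
    var $_limit;

    function __construct($limit)
    {
        $this->_limit = $limit;
    }

    // Return the current item; it is computed only when asked for.
    function current()
    {
        return ($this->_index + 1) * ($this->_index + 1);
    }

    // Is there another item after the current one?
    function hasNext()
    {
        return $this->_index + 1 < $this->_limit;
    }

    // Current position, 0 for the first item.
    function key()
    {
        return $this->_index;
    }

    // Advance if possible and report whether we moved.
    function next()
    {
        $ret = $this->hasNext();
        if ($ret) {
            $this->_index++;
        }
        return $ret;
    }

    // Go back to "before the first item".
    function reset()
    {
        $this->_index = -1;
        return true;
    }
}

$it = new SquaresIterator(3);
$seen = array();
while ($it->next()) {
    $seen[$it->key()] = $it->current();
}
print_r($seen); // keys 0, 1, 2 mapped to 1, 4, 9
```

Nothing is stored up front: each value only exists once 'current' is called for it.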

Variables

_index

The index starts off at -1. This is VERY important because the way you use an iterator is that you would do: if($it->next()) ... The 'next' function, as described above, tells the iterator whether or not to go to the next index. It decides by calling the 'hasNext' function and, if there is a next row, incrementing the '_index' variable. So, on the first iteration, '_index' will be incremented to 0 (zero), representing the first row of the array / result set.

_array

This is ONLY for an array iterator. This is a reference to the array that we are iterating through.

_size

Again, this is ONLY for an array iterator. This is the count() or sizeof() (whichever you prefer) of the array. It is used in 'hasNext' to see if we can still iterate through the array.

_it

This is ONLY for a proxy iterator (will be explained). This is a reference to the iterator that we will be iterating through.

Okay, read over all of that once more. That explains the inner workings of an iterator, but it still might not make the whole idea 'click' with you. So here's an example of an ArrayIterator in use:

// Create the array that we will want to iterate. 

$array = array(
    array('name' => 'Peter', 'age' => 18),
    array('name' => 'David', 'age' => 30),
);

$it = &new FAArrayIterator($array);

while($it->next())
{                       
    // get the current row in the array as $temp

    $temp = &$it->current();

    // output
    print_r($temp);
}

The comments are pretty straightforward, but to really understand what's going on, you should scroll back up and read the description of the functions again. Now, as described above, the 'next' function tells the iterator to go to the next row and returns a boolean saying whether that row exists. So, here is a little table that describes what's going on:

status       | what '_index' becomes | what 'hasNext' returned | what 'next' returned | what 'current' returned
instantiated | -1                    |                         |                      |
loop 1       | 0                     | TRUE                    | TRUE                 | array('name' => 'Peter', 'age' => 18)
loop 2       | 1                     | TRUE                    | TRUE                 | array('name' => 'David', 'age' => 30)
loop 3       | 1                     | FALSE                   | FALSE                |

You might be wondering why there's a loop 3. There isn't actually a third pass through the loop body, but 'next' is called three times. The reason is simply how PHP's while() loop works: the condition inside while( ... ) is evaluated before every pass. If it evaluates to TRUE, the body runs and the condition is checked again; if it evaluates to FALSE, the loop stops. So 'next' has to be called one extra time, and it is that third call returning FALSE that ends the loop.
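The extra call can be observed directly. The sketch below uses a hypothetical TracingIterator (not part of this article's code, and written with the PHP 5+ __construct syntax) that records every call to 'next'.

```php
<?php
// Hypothetical array iterator that records every call to next().
class TracingIterator
{
    var $_array;
    var $_index = -1;
    var $log = array(); // one entry per next() call: array(index after the call, return value)

    function __construct($array)
    {
        $this->_array = array_values($array);
    }

    function hasNext()
    {
        return count($this->_array) > $this->_index + 1;
    }

    function next()
    {
        $ret = $this->hasNext();
        if ($ret) {
            $this->_index++;
        }
        $this->log[] = array($this->_index, $ret);
        return $ret;
    }

    function current()
    {
        return $this->_array[$this->_index];
    }
}

$it = new TracingIterator(array('Peter', 'David'));
$bodyRuns = 0;
while ($it->next()) {
    $bodyRuns++;
}

echo count($it->log), ' calls to next(), ', $bodyRuns, " loop bodies\n";
// prints "3 calls to next(), 2 loop bodies"
```

The third log entry has the index still at 1 and a return value of FALSE, which matches the last row of the table.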

On to the code! Here is the basic iterator class. All iterators will in one way or another extend this class. (In PHP5 the built-in Iterator is an interface rather than a class, with the methods current, key, next, rewind and valid.)

class FAIterator {
    function &current() {
        assert(FALSE);
    }
    
    function hasNext() {
        assert(FALSE);
    }
    
    function key() {
        assert(FALSE);
    }
    
    function next() {
        assert(FALSE);
    }
    
    function reset() {
        assert(FALSE);
    }
}

Obviously, this class is never meant to be directly used, hence the assert(FALSE);. First I will show you what the code in the ArrayIterator looks like. After that, I will explain what a ProxyIterator is and does.

class FAArrayIterator extends FAIterator {
    var $_array;
    var $_index = -1;
    var $_size = 0;
    
    function FAArrayIterator(&$array) {
        assert(is_array($array));
        
        foreach ($array as $key => $value) {
            $this->_array[] = &$array[$key];
            $this->_size++;
        }
        
        $this->reset();
    }
    
    function &current() {
        if (!isset($this->_array[$this->_index])) {
            trigger_error("Array out of bounds", E_USER_ERROR);
        }
        
        return $this->_array[$this->key()];
    }
    
    function hasNext() {
        return ($this->_size > $this->_index + 1);
    }
    
    function key() {
        return $this->_index;
    }
    
    function next() {
        if ($ret = $this->hasNext()) {
            $this->_index++;
        }
        
        return $ret;
    }
    
    function reset() {
        $this->_index = -1;
        return TRUE;
    }
}

Now, there is a foreach statement in that array iterator's constructor, and you might be thinking: "Peter, you said that we would avoid extra loops!" Well, I was mainly talking about the use of iterators to get database result sets, which have no extra loops. Don't worry, I will show them too.

The array iterator appears simple enough and so I will continue on to the really cool iterator: the ProxyIterator. Simply put, the ProxyIterator iterates through an existing iterator. That's it! What makes the ProxyIterator so cool is that you can stack iterators that extend the proxy iterator. I will show an example later, but first the code:

class FAProxyIterator extends FAIterator {
    var $_it;

    function FAProxyIterator(&$it) {
        assert(is_a($it, 'FAIterator'));
        $this->_it = &$it;
    }

    function &current() {
        return $this->_it->current();
    }

    function hasNext() {
        return $this->_it->hasNext();
    }

    function key() {
        return $this->_it->key();
    }

    function next() {
        return $this->_it->next();
    }

    function reset() {
        return $this->_it->reset();
    }
}

As you can see from the code, all the ProxyIterator does is call the methods of the iterator passed to it!

Now, assume that we have an array of names and ages (same array as the above example). We want to make the names bold and make the ages italicized. So, let's use iterators! (NOTE: this is simply an example of stackable iterators. As a real world scenario, this is terrible usage of iterators)

class MakeNamesBold extends FAProxyIterator
{
    function MakeNamesBold(&$it)
    {
        parent::FAProxyIterator($it);
    }
    function &current()
    {

        $temp = &parent::current();

        $temp['name'] = '<strong>'. $temp['name'] .'</strong>';

        return $temp;
    }
}

class MakeAgesItalic extends FAProxyIterator
{
    function MakeAgesItalic(&$it)
    {
        // wrap another iterator, exactly like MakeNamesBold does
        parent::FAProxyIterator($it);
    }
    function &current()
    {
        // get the current row from the parent iterator

        $temp = &parent::current();

        $temp['age'] = '<em>'. $temp['age'] .'</em>';

        // return the formatted row

        return $temp;
    }
}

$array = array(
        array('name' => 'Peter', 'age' => 18),
        array('name' => 'David', 'age' => 30),
        ); 

$it = &new FAArrayIterator($array);
$it = &new MakeNamesBold($it);
$it = &new MakeAgesItalic($it); 

while($it->next())
{                       
    $temp = &$it->current();
    print_r($temp);
} 

First, we create the array and then pass it into the ArrayIterator. Then we pass that iterator into the MakeNamesBold iterator. Finally, we pass that iterator into the MakeAgesItalic iterator. Now, what goes on in each loop of the while() is very interesting:

  • ArrayIterator::current is called. It returns the vanilla array with a person's name and age.
  • MakeNamesBold::current is called. It makes the person's name bold.
  • MakeAgesItalic::current is called. It makes the person's age italic.

At first glance it appears that the functions are being called in the wrong order. MakeAgesItalic is the iterator used in the while() loop, so its current function is called first. Its parent::current is the ProxyIterator's current, which forwards the call to the wrapped iterator, so MakeNamesBold::current runs next. MakeNamesBold::current does the same and ends up calling FAArrayIterator::current. The calls go from the outside in, but each formatter only does its work after the inner call has returned, so the row is built up in the order listed above. This is the beauty of iterators: they can be stacked one on top of the other to incrementally and seamlessly modify the items to be looped over.
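The order can be checked with a small trace. The classes below are hypothetical stand-ins for the article's classes, renamed with a Demo prefix and written with the PHP 5+ __construct syntax so that the sketch runs on its own; DemoLog exists only to record the order in which each current() does its work.

```php
<?php
// Records the order in which each current() finishes its own work.
class DemoLog
{
    public static $calls = array();
}

class DemoArrayIterator
{
    var $_array;
    var $_index = -1;

    function __construct($array) { $this->_array = array_values($array); }

    function next()
    {
        if ($this->_index + 1 < count($this->_array)) {
            $this->_index++;
            return true;
        }
        return false;
    }

    function current()
    {
        DemoLog::$calls[] = 'array';
        return $this->_array[$this->_index];
    }
}

// Forwards every call to the iterator it wraps.
class DemoProxyIterator
{
    var $_it;

    function __construct($it) { $this->_it = $it; }
    function next() { return $this->_it->next(); }
    function current() { return $this->_it->current(); }
}

class DemoBoldNames extends DemoProxyIterator
{
    function current()
    {
        $row = parent::current(); // ask the wrapped iterator first
        $row['name'] = '<strong>' . $row['name'] . '</strong>';
        DemoLog::$calls[] = 'bold';
        return $row;
    }
}

class DemoItalicAges extends DemoProxyIterator
{
    function current()
    {
        $row = parent::current(); // ask the wrapped iterator first
        $row['age'] = '<em>' . $row['age'] . '</em>';
        DemoLog::$calls[] = 'italic';
        return $row;
    }
}

$it = new DemoItalicAges(new DemoBoldNames(new DemoArrayIterator(array(
    array('name' => 'Peter', 'age' => 18),
))));

$it->next();
$row = $it->current();

echo implode(' -> ', DemoLog::$calls), "\n"; // prints "array -> bold -> italic"
echo $row['name'], ' ', $row['age'], "\n";
```

Unlike the PHP4 code above, this sketch works on copies of the row rather than references, so the source array is left untouched.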

Throughout this article I have mentioned database result set iterators. Using an iterator to fetch a database result set is a honking great idea for a few reasons:

  • Rows are only fetched from the database as the loop advances (in the iterator below, one row per call to next()), not all at once before using them.
  • By encapsulating a database result set in an iterator, we can stack other iterators on top of it to format the results. As a result, the formatting is only performed when current() is called and thus only when it is needed.
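Both points can be demonstrated without a database. In the sketch below, FakeResult is a hypothetical stand-in for a result set that counts how many rows it has handed out, and UppercaseNames is a hypothetical formatter that counts how many rows it has formatted. None of these classes are part of this article's code, and they use the PHP 5+ __construct syntax.

```php
<?php
// Hypothetical stand-in for a database result: hands out one row per fetch().
class FakeResult
{
    var $rows;
    var $fetched = 0; // how many rows have been handed out so far

    function __construct($rows) { $this->rows = $rows; }
    function numRows() { return count($this->rows); }
    function fetch() { return $this->rows[$this->fetched++]; }
}

// Iterator over a FakeResult; a row is fetched only inside next().
class FakeResultIterator
{
    var $result;
    var $row = -1;
    var $current;

    function __construct($result) { $this->result = $result; }

    function hasNext() { return $this->row + 1 < $this->result->numRows(); }

    function next()
    {
        $ret = $this->hasNext();
        if ($ret) {
            $this->current = $this->result->fetch();
            $this->row++;
        }
        return $ret;
    }

    function current() { return $this->current; }
}

// Formatter stacked on top; it counts how many rows it has formatted.
class UppercaseNames
{
    var $it;
    var $formatted = 0;

    function __construct($it) { $this->it = $it; }
    function next() { return $this->it->next(); }

    function current()
    {
        $row = $this->it->current();
        $row['name'] = strtoupper($row['name']);
        $this->formatted++;
        return $row;
    }
}

$result = new FakeResult(array(
    array('name' => 'Peter'),
    array('name' => 'David'),
));
$it = new UppercaseNames(new FakeResultIterator($result));

// Display only the first row, then stop.
$it->next();
$first = $it->current();

echo $first['name'], ': fetched ', $result->fetched, ', formatted ', $it->formatted, "\n";
// prints "PETER: fetched 1, formatted 1"
```

The second row is never fetched or formatted because nothing asked for it.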

The real power is with the ProxyIterator, but it requires that an iterator be passed to it. So far, the only other iterator you have is an ArrayIterator, so now I give you the MysqlResultIterator!

class MysqlResultIterator extends FAIterator {
    var $id;
    var $mode;
    var $row = -1;
    var $current;
    var $size;

    function MysqlResultIterator($id, $mode) {
        $this->id = $id;
        $this->mode = $mode;
        $this->size = mysql_num_rows($this->id);
    }

    function &current() {
        return $this->current;
    }

    function hasNext() {
        return ($this->row + 1 < $this->size) ? TRUE : FALSE;
    }

    function key() {
        return $this->row;
    }

    function next() {
        $ret = $this->hasNext();

        if ($ret) {
            $this->current = mysql_fetch_array($this->id, $this->mode);
            $this->row++;
        }

        return $ret;
    }

    function free() {
        return mysql_free_result($this->id);
    }

    function numRows() {
        return $this->size;
    }

    function reset() {
        if ($this->row >= 0)
            mysql_data_seek($this->id, 0);

        $this->row = -1;

        return TRUE;
    }
}

To this iterator, you pass the result of a mysql_query and the mode (e.g.: MYSQL_ASSOC), and then you can use it as if it were a normal iterator. You can stack it with other ProxyIterators, etc!

Where are iterators useful?

Iterators are useful in any situation where you map functions onto a set of data, such as formatting each row of a database result as it is displayed.
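As a sketch of that idea, a single proxy-style iterator can take the function to apply as a parameter instead of hard-coding it. SimpleListIterator and MapIterator are hypothetical names, not part of this article's code, and the sketch uses the PHP 5+ __construct syntax.

```php
<?php
// Hypothetical minimal iterator over a list of values.
class SimpleListIterator
{
    var $_items;
    var $_index = -1;

    function __construct($items) { $this->_items = array_values($items); }

    function next()
    {
        if ($this->_index + 1 < count($this->_items)) {
            $this->_index++;
            return true;
        }
        return false;
    }

    function current() { return $this->_items[$this->_index]; }
}

// Hypothetical proxy that applies a callback to every item it hands out.
class MapIterator
{
    var $_it;
    var $_callback;

    function __construct($it, $callback)
    {
        $this->_it = $it;
        $this->_callback = $callback;
    }

    function next() { return $this->_it->next(); }

    function current()
    {
        // The callback runs only when an item is actually requested.
        return call_user_func($this->_callback, $this->_it->current());
    }
}

$it = new SimpleListIterator(array('  peter ', ' david  '));
$it = new MapIterator($it, 'trim');    // applied first
$it = new MapIterator($it, 'ucfirst'); // applied to the trimmed value

$names = array();
while ($it->next()) {
    $names[] = $it->current();
}
print_r($names); // Peter, David
```

Each MapIterator is one mapped function, and stacking them composes the functions in the order they were added.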

