Collections and Sorting

PHP has only a limited ability to support collections in the way that other programming languages such as C# and Java do, as far as the manner of access. This article navigates one possible solution.

Collections and Sorting

A collection is an object whose primary function is to store a number of like objects. An object called CarCollection may contain any number of Car objects. Collections can traditionally be accessed in the same manner as arrays, which means CarCollection[n] represents a particular Car object. This is true in C#, Java, and more – but not PHP, unfortunately. Since PHP has only recently begun to develop a package of built in objects (the SPL, Standard PHP Library), the ability to support collections in the accepted behavioral sense is very limited.

There is no way in PHP to create a custom indexer, which means if you want to use an object that allows direct access to elements via the Object[n] notation, you must inherit from the ArrayObject class. This means you are not able to change the internal storage method, which, in the case of ArrayObject, is a native array. Since there is no data structure in PHP other than the array, though, this implementation is acceptable; there isn’t any other alternative that would not eventually resolve to a native array.

“Wait,” you say, “what about sorting?” Using the native PHP array, there are an abundance of ways to sort. Since the solution we have discussed so far uses an array for internal data storage, this should be OK – but it is not. We are faced with two problems:

  1. There is no way to directly access the ArrayObject’s internal array.
  2. Assuming issue 1 is resolved, any given collection must know how to sort the object it contains – remember, native array sort functions only work with primitive types (strings and numbers).

Additionally there is the consideration of how we will sort. There are a number of available sorting algorithms, several of which are reasonably close in speed, several of which are archaic and nearly useless. We will examine some of the available choices later in this article and determine which is the best choice for our sorting needs.

{mospagebreak title=Weighing the Options}

To summarize, then, we can only use the Object[n] indexer notation by extending the built in ArrayObject class, but we can only implement sorting for non-primitive classes by using a custom class that knows how to work with them. The list of solutions is small and none of them are attractive:

  1. Use a custom collection class with no indexer functionality but with full sort functionality. This means that to access an element we must explicitly make a method call.
  2. Use an ArrayObject-derived class and create sorting methods that return a copy of the class with the elements ordered. This means that the collection class cannot sort itself.

Delightful, is it not? Personally I believe that the first solution is superior because the collection class will be able to do its own sorting. Using a Collection::Get(n) method results in extra typing, but it conveys clear intent and allows us to handle internal storage on our own. Using the second solution means we have to create either standalone functions or a sorting class for each collection class, in which we call Sort($CollectionObject). This approach is much less clear and requires a lot of additional typing, not only to use the classes but also to modify them or create new ones – and of course it requires that we remember more class names, more method names, and how to use them all in conjunction with one another.

Having determined that the first solution is better, this article will focus on implementing that solution in a clean and manageable way. Read on!

{mospagebreak title=Building the Foundation}

In order to effectively implement our solution for creating sortable object collections, we need to determine what common properties the collection classes will have as well as the requirements for the objects that will be contained in a collection. For now, let us define two interfaces – ICollection for the collection classes and ISortable for the objects that need to be sorted. You will see the meaning of the ISortable name shortly.

interface ICollection
{
     public function Get($i);
     public function Append($object);
     public function Sort();
}

For now we are only going to support the three methods defined in the above interface. Get() will return an element by index, in lieu of an indexer. Append() will add a new element to the collection. Sort() will, of course, sort the elements in the collection.

interface ISortable
{
     public function GetSortKey();
}

This interface will be implemented by the objects inside the collection that need to be sorted. The GetSortKey() method will return the “key value” of the object, similarly to the way ToString() works in .NET or Java.

Now we need to define a basic abstract base class for our collection classes.

abstract class Collection implements ICollection
{
     protected $data;  

     public function __construct()
     {
          $this->data = new ArrayObject();
     }    

     public function Get($i)
     {
          return $this->data[$i];
     }    

     public function Append($object)
     {
          $this->data->Append($object);
     }    

     public function Sort()
     {
     }
}

Here we have a basic implementation of the Collection class. It provides a default Get() and Append() method as well as a default constructor. Notice that the default implementation for the protected member $data is an ArrayObject. I chose this simply because I would prefer to be consistent and work with an object than the primitive type since both expose indexers. The only difference between using the ArrayObject and using an actual array is that we cannot directly access the inner array, which won’t be a problem.

{mospagebreak title=Concrete Classes}

Now it is time to create an actual concrete implementation of the abstract Collection class and the ISortable interface. We will use a class called People for our collection. It will contain objects of the type Person. Really, it can contain any object implementing ISortable, but in our example we will only use the class Person. For now, the Person class is going to be just a glorified container for a single string, $name.

class Person implements ISortable
{
     private $name;
     public function GetName() { return $name; }
     public function SetName($name) { $this->name = $name;     }
     public function GetSortKey()
     {
          return $this->name;
     }    

     public function __construct($name)
     {
          $this->name = $name;
     }
}

Next we need to implement the People collection and its Sort() method. As I mentioned earlier in the article, there are quite a lot of ways to sort. For this example I will use the “fundamental” sort algorithm – the Bubble Sort. I should also point out that it is the slowest sort and should never be used on large data sets or where performance is a critical concern.

class People extends Collection
{
     public function Sort()
     {
          for ($i = $this->data->count() – 1; $i >= 0; $i–)
          {
              $flipped = false;
              for ($j = 0; $j < $i; $j++)
              {
                   if (strcmp($this->data[$j]->GetSortKey(), 
                        $this->data[$j + 1]->GetSortKey()) > 0)
                   { 
                        $tmp = $this->data[$j];
                        $this->data->offsetSet($j, 
                             $this->data[$j + 1]);
                        $this->data->offsetSet($j + 1, $tmp);
                        $flipped = true;
                   }
              }
              if (!$flipped)
                   return;
          }
     }
}

The Bubble Sort is by far the most obvious and intuitive sort algorithm, performing in-place swaps in a linear fashion. Notice that it will return early if any pass does not cause a swap – which means in an example like this one, it’s just fine. Even if it did not return early, we will be working with so few elements that it simply does not matter. Consider the following example:

$people = new People();
$people->Append(new Person(‘Jamie’));
$people->Append(new Person(‘David’));
$people->Append(new Person(‘Natalie’));
$people->Append(new Person(‘Johnny’));

print ‘<pre>';

print_r($people);

$people->Sort();

print_r($people);

print ‘</pre>';

You should see this in your browser after running the script:

People Object
(
    [data:protected] => ArrayObject Object
        (
            [0] => Person Object
                (
                    [name:private] => Jamie
                )
            [1] => Person Object
                (
                    [name:private] => David
                )
            [2] => Person Object
                (
                    [name:private] => Natalie
                )
            [3] => Person Object
                (
                    [name:private] => Johnny
                )
        )
)
People Object
(
    [data:protected] => ArrayObject Object
        (
            [0] => Person Object
                (
                    [name:private] => David
                )
            [1] => Person Object
                (
                    [name:private] => Jamie
                )
            [2] => Person Object
                (
                    [name:private] => Johnny
                )
            [3] => Person Object
                (
                    [name:private] => Natalie
                )
        )
)

Conclusion

There it is, a minimal implementation of collections with sorting. We have defined two interfaces and an abstract collection class and created a uniform way to sort within our collection. By sacrificing the ability to use an indexer on our collection, we were able to encapsulate the sort functionality within the collection class itself, sparing ourselves the trouble of creating even more classes.

Having now implemented the worst possible sort algorithm and left an obvious re-factoring or two, the stage is set for the next installment of this article. In the next part we are going to actually implement a number of sort algorithms and benchmark them with a much larger data set. If you need to learn the fundamental sort algorithms (and all programmers do), be sure you check it out!

[gp-comments width="770" linklove="off" ]
antalya escort bayan antalya escort bayan