A Practical Approach to Object-Relational Mapping in PHP

by Kieran Mathieson

PHP applications use various technologies, like HTML, relational databases (RDB), JavaScript, object-oriented programming (OOP), and so on. Most of them work well together. For instance, PHP code can easily write JavaScript, JavaScript can use a browser's document object model (DOM) to change HTML, etc.

OOP and RDB don't fit together quite as well. When programmers write code mapping PHP classes to RDB tables, they need to make sure the classes do not become too dependent on each other. Otherwise, changes in one class will require changes in others, raising development and maintenance costs. If software changes become too expensive and take too much time, applications will restrict business innovation, rather than enabling it.

This paper offers a practical solution to the object-relational problem in PHP. "Practical" means an approach meeting the constraints that usually apply in application development:

We'll use the term "application utility tasks" to refer to the things listed in the last bullet point. They often consume significant resources, sometimes more than the rest of the application. Carefully managing application utility tasks is important to application stability and performance. It's usually best to start building them in at the very beginning of a project, rather than try to add them later.

This paper assumes you're familiar with PHP, OOP, and relational databases. If you're not, you might want to consult these tutorials:

Background

Suppose your company breeds hippos for the army. Some are used as beasts of burden. Others are equipped with weapons, like pie throwers, and spit projectors. Your company doesn't install the weapons, just the sockets the weapons plug into. Armed hippos can have more than one mount point. idintegerPrimary keynametext

Now some PHP. For simplicity, methods not directly related to the topic we're discussing have been omitted.

<?php
class Hippo
{
var $id = null;
var $name = null;
//Load record from DB
function load($id)
{
...
}
//Save record to DB
function save()
{
...
}
}
?>

Hippo is a "business classes," that is, a class describing something meaningful to nontechnical people in the business. Some classes do not correspond to business entities. For example, classes for database objects, like Table, Record, and Field, might not mean anything to the average sales representative. However, he or she will understand what a Hippo is, and will know that they have id numbers and names.

Now let's write Hippo::load() and Hippo::save(). (We're using the notation Hippo::load() to refer to the load method of Hippo. In this context, the notation has nothing to do with static methods.) Code for getting and setting properties, error checking, etc., has been omitted for simplicity.

<?php
//Hippo methods
function load($id)
{
$this->id = $id;
$conn = mysql_connect('localhost', 'hippo_user', 'hippo_user');
mysql_select_db('hippos');
$result = mysql_query("select name from hippos where id=$id");
$this->name = mysql_result($result, 0, 'name');
application_utility_tasks();
}
function save()
{
$conn = mysql_connect('localhost', 'hippo_user', 'hippo_user');
mysql_select_db('hippos');
if ( is_null($this->id) )
{
//New record
$query = "insert into hippos (name) values ('".$this->name."')";
mysql_query($query);
$this->id = mysql_insert_id();
}
else
{
//Update existing record
$query = "update hippos set name='".$this->name."' where id=".$this->id;
mysql_query($query);
}
application_utility_tasks();
}
?>

In Hippo:load(), Line 3 copies the method's parameter into the id property. Lines 4 to 7 open the database and retrieve the name of a hippo. Hippo::save() uses the id property to determine whether an object represents a new hippo, or one that is already in the database (line 13). If the id is null, the method creates a new record. Otherwise, it updates an existing one. The call to the function application_utility_tasks()(lines 8 and 25) are just to remind us that we need to do these things somewhere.

Let's create a subclass of Hippo called ArmedHippo. Hippo has the attributes id and name. ArmedHippo inherits these attributes, and adds one of its own: number of mount points.

<?php
class ArmedHippo extends Hippo
{
var $num_mount_points = null;
//Load record from DB
function load($id)
{
...
}
//Save record to DB
function save()
{
...
}
}
?>

The next step is to write ArmedHippo::load() and ArmedHippo::save(). It's in data storage methods in subclasses like this that object-relational mapping becomes a problem. If we're not cautious, we can make the OOP code hard to maintain. Further, unless we carefully prepare the groundwork now, future application updates will make things even worse.

We're going to look at several ways of doing the same three things:

  • Implementing ArmedHippo::load() and ArmedHippo::save()idintegerPrimary keynametext num_mount_pointsinteger

    Then we can cut-and-paste Hippo::load() and Hippo::save() into ArmedHippo, and add the num_mount_points field.

    <?php
    //ArmedHippo methods
    function load($id) {
    $this->id = $id;
    $conn = mysql_connect('localhost', 'hippo_user', 'hippo_user');
    mysql_select_db('hippos');
    $result = mysql_query("select name,
    num_mount_points from hippos where id=$id");
    $this->name = mysql_result($result, 0, 'name');
    $this->num_mount_points = mysql_result($result, 0, 'num_mount_points');
    application_utility_tasks();
    }
    function save()
    {
    $conn = mysql_connect('localhost', 'hippo_user', 'hippo_user');
    mysql_select_db('hippos');
    if ( is_null($this->id) )
    {
    //New record
    $query = "insert into hippos (name, num_mount_points)
    values ('".$this->name."', ".
    $this->num_mount_points.")";
    mysql_query($query);
    $this->id = mysql_insert_id();
    }
    else
    {
    //Update existing record
    $query = "update hippos set name='".$this->name."',
    num_mount_points=".$this->num_mount_points."
    where id=".$this->id;
    mysql_query($query);
    }
    application_utility_tasks();
    }
    ?>

    The save() method looks at the id property on line 15. If id is null, the method adds a new record, and gets the id number allocated to it by MySQL on line 21. If id is not null, save() uses it an a primary key value in an SQL UPDATE statement (line 25).

    This will work, but loses some of the advantages of OOP. If we add a weight property to Hippo, we'll also need to change ArmedHippo::load() and ArmedHippo::save(). If we add StealthHippo, we'll introduce more code that needs to be changed as Hippo changes. Further, all versions of the load() and save() methods call application_utility_tasks() separately. If we need to change the call, the code must be updated in every place.

    Option 2: Extra Queries for New Fields

    Another option is to add num_mount_points to the hippos table as above, but change ArmedHippo so it calls Hippo::load() to handle the id and name, and then deals with num_mount_points itself. For example, we can implement ArmedHippo::load() as:

    <?
    function load($id) {
    parent::load($id);
    $conn = mysql_connect('localhost', 'hippo_user', 'hippo_user');
    mysql_select_db('hippos');
    $result = mysql_query("select num_mount_points from hippos where id=$id");
    $this->num_mount_points = mysql_result($result, 0, 'num_mount_points');
    application_utility_tasks();
    }
    ?>

    Line 2 calls Hippo::load() to get id and name from the database. The method then opens the database, executes an SQL statement, and fetches num_mount_points.

    How will this work when we add the weight property? Quite well, since changes to the hippos table only affect Hippo's methods. However, this approach reduces application performance. ArmedHippo::load() calls Hippo::load(), which open the database and runs an SQL statement. Then ArmedHippo::load() connects to the database again, and runs another SQL statement. Add StealthHippo, and things get worse. Caching and connection pooling will reduce the performance hit, but not eliminate it. Further, application_utility_tasks() is still called all over the place.

    This illustrates a general problem with OOP. Classes hide implementation details, including their use of resources. Sometimes this can adversely affect system performance. And since you don't normally look at all the code in existing classes when you extend them, you might not know why performance has suddenly decreased.

    Option 3: New Tables for Subclasses

    A third option is to leave the hippos table as it is, and add a new table for the idintegerPrimary keynum_mount_pointsinteger

    The id field is a foreign key into the hippos table. For example, a record with an id of 25 in the hippos table will also have an id of 25 in the new table.

    This has the same disadvantages as option 2, and adds another all of its own. Decision makers often want to explore data, looking for ways to improve production, gain market share, reduce costs, etc. They want to use friendly tools to query databases, like Access and Excel. Introducing extra tables and links makes their task more difficult. It's not too bad for one inherited class, but a real application might have a dozen of them.

    Option 4: A New Base Class

    Fortunately, there's a way out. Suppose we introduce a new class, called BusinessBase, and make it the base class of all business classes. BusinessBase performs the application utility tasks (permissions, transaction logging, etc.). It also handles every SQL statement. None of the business classes ever connect to a database directly. Instead, they use a data structure to exchange information with BusinessBase.

    Here's the new Hippo:

    <?php
    class Hippo extends BusinessBase
    {
    var $id = null;
    var $name = null;
    //Constructor
    function Hippo()
    {
    $this->table_name = 'hippos';
    $this->addField('id', new DataField(DataField::TYPE_NUMERIC(), true) );
    $this->addField('name', new DataField(DataField::TYPE_STRING(), false) );
    }
    //Load record from DB
    function load($id)
    {
    parent::load($id);
    $this->id = $this->getDBValue('id');
    $this->name = $this->getDBValue('name');
    }
    //Save record to DB
    function save() {
    $this->setDBValue('id', $this->id );
    $this->setDBValue('name', $this->name );
    parent::save();
    //Get id number supplied by INSERT
    $this->id = $this->getDBValue('id');
    }
    }
    ?>

    The Hippo constructor creates the data structure that exchanges information with BusinessBase. First, it sets the name of the RDB table used to store hippo data (line 6). Then it specifies each field in the table, supplying the name of the field, and a DataField object (lines 7 and 8). A DataField object contains the data type of a field, and a flag showing whether or not it is the primary key.

    Hippo::load() passes a record id to BusinessBase::load() (line 12), then extracts the values it needs (id and name in this case, in lines 13 and 14). Hippo::save() stores the values it wants to save (lines 18 and 19), and calls BusinessBase::save() (line 20). If Hippo::save() is creating a new record, BusinessBase::save() supplies its id number, which Hippo::save() then retrieves (line 22).

    The code for ArmedHippo is quite simple:

    <?php
    class ArmedHippo extends Hippo
    {
    var $num_mount_points;
    //Constructor
    function ArmedHippo()
    {
    parent::Hippo();
    $this->addField('num_mount_points',
    new DataField(DataField::TYPE_NUMERIC(), true) );
    }
    //Load record from DB
    function load($id)
    {
    parent::load($id);
    $this->num_mount_points = $this->getDBValue('num_mount_points');
    }
    //Save record to DB
    function save()
    {
    $this->setDBValue('num_mount_points', $this->num_mount_points );
    parent::save();
    }
    }
    ?>

    The constructor calls Hippo's constructor (line 5), then adds a new data field, num_mount_points (line 6). ArmedHippo::load() calls Hippo::load() (line 11), then extracts the value for the field it manages (line 12). ArmedHippo::save() stores the value for num_mount_points (line 16), then calls Hippo::save() (line 17).

    Now we've got what we wanted. ArmedHippo can be added without disturbing Hippo. A weight property can be added to Hippo, without requiring changes to ArmedHippo. StealthHippo can inherit from ArmedHippo, with no changes to existing code. Further, no calls to functions that perform application utility tasks appear in Hippo, ArmedHippo, or StealthHippo. They are centralized in BusinessBase.

    BusinessBase starts like this:

    <?php
    class BusinessBase
    {
    var $table_name = null;
    var $record = null;
    var $_id_field = null;
    }
    ?>

    $table_name is, of course, the name of the RDB table for the class. $record is an associative array describing each record in the table. The array's index is the name of a field in the table (e. g., id). The value in each array element is a DataField object:

    <?php
    class DataField
    {
    function TYPE_STRING()
    {
    return 1;
    }
    function TYPE_NUMERIC()
    {
    return 2;
    }
    var $db_type = null;
    var $is_primary_key = null;
    var $value = null;
    function DataField($col_type, $is_pk)
    {
    $this->db_type = $col_type;
    $this->is_primary_key = $is_pk;
    }
    }
    ?>

    Lines 2 and 3 essentially define two constants, though they are really static methods and are called as such (see lines 7 and 8 in the Hippo code above, and line 7 in ArmedHippo). Declaring them this way subjects them to the same scope and inheritance rules as other methods.

    BusinessBase::addField() (called by Hippo and ArmedHippo) adds a field to $records:

    <?php
    function addField( $db_col_name, $field )
    {
    $this->record[$db_col_name] = $field;
    }
    ?>

    BusinessBase::load() looks like this:

    <?php
    function load($id)
    {
    $this->find_id_field();
    $conn = mysql_connect('localhost', 'hippo_user', 'hippo_user');
    mysql_select_db('hippos');
    $query = 'select * from '.$this->table_name.' where '.$this->_id_field." = $id";
    $fetched_record = mysql_fetch_assoc( mysql_query($query) );
    foreach ($this->record as $col_name=>$field)
    $this->record[$col_name]->value = $fetched_record[$col_name];
    application_utility_tasks();
    }
    ?>

    BusinessBase::load() calls find_id_field() in line 3 to locate the primary key field, and store its name in $this->_id_field for later use. The method then opens the database (line 4), builds a SELECT query (line 7), and executes it (line 9). Lines 11-13 move the data into $record, where it can be retrieved with a statement like this from Hippo:

    $this->id = $this->getDBValue('id');

    You can download all of the sample code here.

    Conclusion

    This paper shows one way to handle object-relational mapping. The technique has much to recommend it:

    • Business classes can be changed without disturbing each other.New classes can be added without requiring changes in existing code.Data storage methods in business classes (like Hippo::load()) are easy to write.Application utility tasks (checking permissions, logging, etc.) are centralized for easier management.The RDB retains its natural structure, so it can be used easily with query generators, data miners, etc.The BusinessBase class is general. Developers can use the same class in many different applications.

    The code above is simplified, of course, to focus on the object-relational problem. The code doesn't check for errors, like bad database connections, or repeated field names (e. g., ArmedHippo defining a field that Hippo has already defined). Application utility tasks haven't been fully specified. Collection classes (e. g., HippoHerd) haven't been discussed. Compound primary keys are not handled. A database abstraction layer (e. g., ADODB) should be used. Parameters like database user name and password should not be hard-coded. And so on.

    However, all of these problems can be addressed within the architecture outlined here. BusinessBase::save() could check for errors, log database changes, use an RDB abstraction layer, and so on. Adding these things to BusinessBase would yield a robust, capable, reusable class that simplifies business application development.