A Practical Approach to Object-Relational Mapping in PHP
by Kieran Mathieson
PHP applications use various technologies, like HTML, relational databases (RDB), JavaScript, object-oriented programming (OOP), and so on. Most of them work well together. For instance, PHP code can easily write JavaScript, JavaScript can use a browser's document object model (DOM) to change HTML, etc.
OOP and RDB don't fit together quite as well. When programmers write code mapping PHP classes to RDB tables, they need to make sure the classes do not become too dependent on each other. Otherwise, changes in one class will require changes in others, raising development and maintenance costs. If software changes become too expensive and take too much time, applications will restrict business innovation, rather than enabling it.
This paper offers a practical solution to the object-relational problem in PHP. "Practical" means an approach meeting the constraints that usually apply in application development:
We'll use the term "application utility tasks" to refer to the things listed in the last bullet point. They often consume significant resources, sometimes more than the rest of the application. Carefully managing application utility tasks is important to application stability and performance. It's usually best to start building them in at the very beginning of a project, rather than try to add them later.
This paper assumes you're familiar with PHP, OOP, and relational databases. If you're not, you might want to consult these tutorials:
Background
Suppose your company breeds hippos for the army. Some are used as
beasts of burden. Others are equipped with weapons, like pie throwers, and spit
projectors. Your company doesn't install the weapons, just the sockets the
weapons plug into. Armed hippos can have more than one mount point. style="text-align: center;">id style="text-align: center;">integer style="text-align: center;">Primary keyname style="text-align: center;">text style="text-align: center;">
Now some PHP. For simplicity, methods not directly related to the topic we're discussing have been omitted.
<?php
class Hippo
{
var $id = null;
var $name = null;
//Load
record from DB
function load($id)
{
...
}
//Save
record to DB
function save()
{
...
}
}
?>
Hippo is a "business classes," that is, a class describing something meaningful to nontechnical people in the business. Some classes do not correspond to business entities. For example, classes for database objects, like Table, Record, and Field, might not mean anything to the average sales representative. However, he or she will understand what a Hippo is, and will know that they have id numbers and names.
Now let's write Hippo::load()
and Hippo::save()
. (We're using the
notation Hippo::load()
to refer to the load method of Hippo. In this
context, the notation has nothing to do with static methods.) Code for getting
and setting properties, error checking, etc., has been omitted for simplicity.
<?php
//Hippo methods
function load($id)
{
$this->id = $id;
$conn = mysql_connect('localhost', 'hippo_user', 'hippo_user');
mysql_select_db('hippos');
$result = mysql_query("select name from hippos where id=$id");
$this->name = mysql_result($result, 0, 'name');
application_utility_tasks();
}
function save()
{
$conn = mysql_connect('localhost', 'hippo_user', 'hippo_user');
mysql_select_db('hippos');
if ( is_null($this->id) )
{
//New
record
$query = "insert
into hippos (name) values ('".$this->name."')";
mysql_query($query);
$this->id = mysql_insert_id();
}
else
{
//Update
existing record
$query = "update
hippos set name='".$this->name."'
where id=".$this->id;
mysql_query($query);
}
application_utility_tasks();
}
?>
In Hippo:load()
, Line 3 copies the method's parameter into the id
property. Lines 4 to 7 open the database and retrieve the name of a hippo. Hippo::save()
uses the id property
to determine whether an object represents a new hippo, or one that is already
in the database (line 13). If the id is null, the method creates a new record.
Otherwise, it updates an existing one. The call to the function application_utility_tasks()
(lines 8 and 25) are
just to remind us that we need to do these things somewhere.
Let's create a subclass of Hippo called ArmedHippo
. Hippo has the
attributes id and name. ArmedHippo
inherits these attributes,
and adds one of its own: number of mount points.
<?php
class ArmedHippo extends Hippo
{
var $num_mount_points = null;
//Load
record from DB
function load($id)
{
...
}
//Save
record to DB
function save()
{
...
}
}
?>
The next step is to write ArmedHippo::load()
and ArmedHippo::save()
. It's in data
storage methods in subclasses like this that object-relational mapping becomes
a problem. If we're not cautious, we can make the OOP code hard to maintain.
Further, unless we carefully prepare the groundwork now, future application
updates will make things even worse.
We're going to look at several ways of doing the same three things:
ArmedHippo::load()
and ArmedHippo::save() style="text-align: center;">id style="text-align: center;">integer style="text-align: center;">Primary keyname style="text-align: center;">text style="text-align: center;"> style="text-align: center;">num_mount_points style="text-align: center;">integer style="text-align: center;">
Then we can cut-and-paste Hippo::load()
and Hippo::save()
into ArmedHippo
, and add the
num_mount_points field.
<?php
//ArmedHippo methods
function load($id) {
$this->id = $id;
$conn = mysql_connect('localhost', 'hippo_user', 'hippo_user');
mysql_select_db('hippos');
$result = mysql_query("select name,
num_mount_points from hippos where id=$id");
$this->name = mysql_result($result, 0, 'name');
$this->num_mount_points = mysql_result($result, 0, 'num_mount_points');
application_utility_tasks();
}
function save()
{
$conn = mysql_connect('localhost', 'hippo_user', 'hippo_user');
mysql_select_db('hippos');
if ( is_null($this->id) )
{
//New
record
$query = "insert
into hippos (name, num_mount_points)
values
('".$this->name."',
".
$this->num_mount_points.")";
mysql_query($query);
$this->id = mysql_insert_id();
}
else
{
//Update
existing record
$query = "update
hippos set name='".$this->name."',
num_mount_points=".$this->num_mount_points."
where
id=".$this->id;
mysql_query($query);
}
application_utility_tasks();
}
?>
The save() method looks at the id property on line 15. If id is null,
the method adds a new record, and gets the id number allocated to it by MySQL
on line 21. If id is not null, save() uses it an a primary key value in an SQL
UPDATE statement (line 25).
This will work, but loses some of the advantages of OOP. If we add a
weight property to Hippo, we'll also need to change ArmedHippo::load()
and ArmedHippo::save()
. If we add
StealthHippo, we'll introduce more code that needs to be changed as Hippo
changes. Further, all versions of the load()
and save()
methods call application_utility_tasks()
separately. If we
need to change the call, the code must be updated in every place.
Option 2: Extra Queries for New Fields
Another option is to add num_mount_points to the hippos table as above,
but change ArmedHippo
so it calls Hippo::load()
to handle the id and
name, and then deals with num_mount_points
itself. For example, we can
implement ArmedHippo::load()
as:
<?
function load($id) {
parent::load($id);
$conn = mysql_connect('localhost', 'hippo_user', 'hippo_user');
mysql_select_db('hippos');
$result = mysql_query("select num_mount_points from hippos where id=$id");
$this->num_mount_points = mysql_result($result, 0, 'num_mount_points');
application_utility_tasks();
}
?>
Line 2 calls Hippo::load()
to get id and name from the
database. The method then opens the database, executes an SQL statement, and
fetches num_mount_points.
How will this work when we add the weight property? Quite well, since
changes to the hippos table only affect Hippo's methods. However, this approach
reduces application performance. ArmedHippo::load()
calls Hippo::load()
, which open the
database and runs an SQL statement. Then ArmedHippo::load()
connects to the
database again, and runs another SQL statement. Add StealthHippo
, and things get
worse. Caching and connection pooling will reduce the performance hit, but not
eliminate it. Further, application_utility_tasks()
is still called all
over the place.
This illustrates a general problem with OOP. Classes hide
implementation details, including their use of resources. Sometimes this can
adversely affect system performance. And since you don't normally look at all
the code in existing classes when you extend them, you might not know why
performance has suddenly decreased.
Option 3: New Tables for Subclasses
A third option is to leave the hippos table as it is, and add a new
table for the id style="text-align: center;">integer style="text-align: center;">Primary keynum_mount_points style="text-align: center;">integer style="text-align: center;">
The id field is a foreign key into the hippos table. For example, a
record with an id of 25 in the hippos table will also have an id of 25 in the
new table.
This has the same disadvantages as option 2, and adds another all of
its own. Decision makers often want to explore data, looking for ways to
improve production, gain market share, reduce costs, etc. They want to use
friendly tools to query databases, like Access and Excel. Introducing extra
tables and links makes their task more difficult. It's not too bad for one
inherited class, but a real application might have a dozen of them.
Option 4: A New Base Class
Fortunately, there's a way out. Suppose we introduce a new class,
called BusinessBase
, and make it the base class of all business
classes. BusinessBase performs the application utility tasks (permissions,
transaction logging, etc.). It also handles every SQL statement. None of the
business classes ever connect to a database directly. Instead, they use a data
structure to exchange information with BusinessBase.
Here's the new Hippo:
<?php
class Hippo extends BusinessBase
{
var $id = null;
var $name = null;
//Constructor
function Hippo()
{
$this->table_name = 'hippos';
$this->addField('id', new DataField(DataField::TYPE_NUMERIC(), true) );
$this->addField('name', new DataField(DataField::TYPE_STRING(), false) );
}
//Load
record from DB
function load($id)
{
parent::load($id);
$this->id = $this->getDBValue('id');
$this->name = $this->getDBValue('name');
}
//Save
record to DB
function save() {
$this->setDBValue('id', $this->id );
$this->setDBValue('name', $this->name );
parent::save();
//Get id
number supplied by INSERT
$this->id = $this->getDBValue('id');
}
}
?>
The Hippo constructor creates the data structure that exchanges
information with BusinessBase. First, it sets the name of the RDB table used to
store hippo data (line 6). Then it specifies each field in the table, supplying
the name of the field, and a DataField object (lines 7 and 8). A DataField
object contains the data type of a field, and a flag showing whether or not it
is the primary key.
Hippo::load()
passes a record id to BusinessBase::load()
(line 12), then
extracts the values it needs (id and name in this case, in lines 13 and 14). Hippo::save()
stores the values it
wants to save (lines 18 and 19), and calls BusinessBase::save()
(line 20). If Hippo::save()
is creating a new
record, BusinessBase::save()
supplies its id number, which Hippo::save()
then retrieves (line
22).
The code for ArmedHippo is quite simple:
<?php
class ArmedHippo extends Hippo
{
var $num_mount_points;
//Constructor
function ArmedHippo()
{
parent::Hippo();
$this->addField('num_mount_points',
new DataField(DataField::TYPE_NUMERIC(), true) );
}
//Load
record from DB
function load($id)
{
parent::load($id);
$this->num_mount_points = $this->getDBValue('num_mount_points');
}
//Save
record to DB
function save()
{
$this->setDBValue('num_mount_points', $this->num_mount_points );
parent::save();
}
}
?>
The constructor calls Hippo's constructor (line 5), then adds a new
data field, num_mount_points
(line 6). ArmedHippo::load()
calls Hippo::load()
(line 11), then
extracts the value for the field it manages (line 12). ArmedHippo::save()
stores the value for
num_mount_points
(line 16), then
calls Hippo::save()
(line 17).
Now we've got what we wanted. ArmedHippo
can be added without
disturbing Hippo
. A weight property can be added to Hippo
, without requiring
changes to ArmedHippo
. StealthHippo
can inherit from ArmedHippo
, with no changes to
existing code. Further, no calls to functions that perform application utility
tasks appear in Hippo
, ArmedHippo
, or StealthHippo
. They are centralized in BusinessBase
.
BusinessBase starts like this:
<?php
class BusinessBase
{
var $table_name = null;
var $record = null;
var $_id_field = null;
}
?>
$table_name
is, of course, the name of the RDB table for the
class. $record
is an associative array describing each record in the table. The
array's index is the name of a field in the table (e. g., id). The value in each
array element is a DataField object:
<?php
class DataField
{
function TYPE_STRING()
{
return 1;
}
function TYPE_NUMERIC()
{
return 2;
}
var $db_type = null;
var $is_primary_key = null;
var $value = null;
function DataField($col_type, $is_pk)
{
$this->db_type = $col_type;
$this->is_primary_key = $is_pk;
}
}
?>
Lines 2 and 3 essentially define two constants, though they are really
static methods and are called as such (see lines 7 and 8 in the Hippo code
above, and line 7 in ArmedHippo). Declaring them this way subjects them to the
same scope and inheritance rules as other methods.
BusinessBase::addField()
(called by Hippo and ArmedHippo) adds a field
to $records
:
<?php
function addField( $db_col_name, $field )
{
$this->record[$db_col_name] = $field;
}
?>
BusinessBase::load()
looks like this:
<?php
function load($id)
{
$this->find_id_field();
$conn = mysql_connect('localhost', 'hippo_user', 'hippo_user');
mysql_select_db('hippos');
$query = 'select
* from '.$this->table_name.' where
'.$this->_id_field." = $id";
$fetched_record = mysql_fetch_assoc( mysql_query($query) );
foreach ($this->record as $col_name=>$field)
$this->record[$col_name]->value = $fetched_record[$col_name];
application_utility_tasks();
}
?>
BusinessBase::load()
calls find_id_field()
in line 3 to locate the
primary key field, and store its name in $this->_id_field
for later use. The
method then opens the database (line 4), builds a SELECT query (line 7), and
executes it (line 9). Lines 11-13 move the data into $record
, where it can be
retrieved with a statement like this from Hippo:
$this->id = $this->getDBValue('id');
You can download all of the sample code here.
Conclusion
This paper shows one way to handle object-relational mapping. The
technique has much to recommend it:
- Business classes
can be changed without disturbing each other.New classes can
be added without requiring changes in existing code.Data storage
methods in business classes (like Hippo::load()) are easy to write.Application
utility tasks (checking permissions, logging, etc.) are centralized for
easier management.The RDB retains
its natural structure, so it can be used easily with query generators,
data miners, etc.The BusinessBase
class is general. Developers can use the same class in many different
applications.
The code above is simplified, of course, to focus on the
object-relational problem. The code doesn't check for errors, like bad database
connections, or repeated field names (e. g., ArmedHippo
defining a field
that Hippo
has already
defined). Application utility tasks haven't been fully specified. Collection
classes (e. g., HippoHerd) haven't been discussed. Compound primary keys are
not handled. A database abstraction layer (e. g., ADODB) should be used.
Parameters like database user name and password should not be hard-coded. And
so on.
However, all of these problems can be addressed within the architecture
outlined here. BusinessBase::save()
could check for errors, log database changes, use
an RDB abstraction layer, and so on. Adding these things to BusinessBase
would yield a
robust, capable, reusable class that simplifies business application
development.