Craig S. Mullins
Summer 1999

Mixing DB2 and Object Orientation?
With benefits like these, it is no wonder that
object-oriented programming and development are being embraced by some IT
organizations. Historically, one of the biggest problems faced by IT is the
large project backlog that has accumulated. Some estimates show that more than
70% of the work done by IT organizations is maintenance, further exacerbating
the project backlog. In many cases end users are forced to wait for long periods
of time for their new applications because the backlog is so great and the
requisite talent needed to tackle so many new projects is not available.
Sometimes, this backlog can result in some unsavory phenomena, such as business
people attempting to build their own applications or purchasing third-party
packaged applications (and all of the potential administrative burdens that
packages carry). So, it is very clear why the siren song of object orientation
lures organizations.

But what is object-oriented technology and how does it
differ from the relational world of DB2? Let’s examine some of the key problem
areas. For definitions of the OO terminology used, please refer to the Sidebar
on OO terminology.

OO technology is fundamentally based upon, what else, the object. Objects are
defined based on object classes that determine the
structure (variables) and behavior (methods) for the object. So, it can be seen
that true objects cannot be easily represented using a relational database. In
the RDBMS, a logical entity is transformed into a physical representation of
that entity solely in terms of its data characteristics. In DB2, you create a
table that can store the data elements (in an underlying VSAM data file
represented by a table space). The table contains rows that represent the
current state of that entity. The table does not store all of the encapsulated
logic necessary to act upon that data. By contrast, an object would define an
entity in terms of both its state and its behavior. In other words, an object
encapsulates both the data (state) and the valid procedures that can be
performed upon the object's data (behavior).

Increasingly, RDBMSs are adding the capability to store
more logic in the database. With triggers and user-defined functions and stored
procedures, more behavior can be encapsulated in relational tables. Of course,
this is not the same type of encapsulation espoused by OO purists. One
difference is that user-defined functions and stored procedures are not limited
to acting upon data in only one table. Methods in OO parlance are created to
manipulate the state only of the object in which the method is encapsulated.
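
As a hedged illustration of that last point, here is a minimal Python sketch of an object whose methods touch only its own state. The BankAccount class, its method names, and the amounts are hypothetical and chosen only for illustration; nothing here comes from DB2 or from any particular OO product.

```python
from decimal import Decimal


class BankAccount:
    """Hypothetical object: state (a balance) plus the methods that act on it."""

    def __init__(self, owner: str, balance: Decimal = Decimal("0.00")) -> None:
        self.owner = owner
        # State. The leading underscore marks it as internal by convention only;
        # Python does not enforce the restriction.
        self._balance = balance

    def increment_balance(self, amount: Decimal) -> None:
        """Method: changes the state of this object and of no other."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._balance += amount

    def balance(self) -> Decimal:
        """Read-only view of the current state."""
        return self._balance


account = BankAccount("hypothetical owner", Decimal("100.00"))
account.increment_balance(Decimal("153.29"))  # a message sent to the object
print(account.balance())  # 253.29
```

Unlike a stored procedure or a user-defined function, nothing in increment_balance reaches outside the object that owns it.
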
Triggers are closer to encapsulated methods in function, because they are
physically defined on a single table. However, triggers too are not limited in
the data that they can impact. For example, a trigger defined on TableA can
access data in TableB and TableC and modify data in TableD.

Realize, too, that there is more to object-oriented
technology than we have discussed so far. This discussion has been purposely
simplified to introduce the notion of encapsulation. The definition as stated
does, however, introduce the basic difference between OO and relational
methodologies.

A New Way of Thinking
Think for a moment in relational terms: many programs and
procedures are required to operate upon data in tables. Each procedure must
retrieve the data, operate upon it in some way, and possibly replace the data.
In the OO paradigm, messages are passed to objects invoking the
encapsulated methods. Because each object contains its own operations, or
methods, most of the procedural code is eliminated.

To truly be an object-oriented programmer, you must learn
to see the world in a different way. A traditional programmer using a 3GL
against a relational database sees the world in terms of verbs: my program does
this, then executes that iteratively, then writes those, updates these, and
exits. An OO programmer sees the world in terms of nouns: object bank account,
increment your balance by $153.29. It is a different way of thinking and
programming. To further demonstrate, consider the following code fragment:

    for (shape in wind)
        branchOn: typeOf (shape)
            circl:   drawCirc (shape)
            rectang: drawRect (shape)
            triangl: drawTrian (shape)

This pseudocode represents the traditional, procedural way of coding a program to draw geometric shapes. Based on the type of shape, a different procedure is invoked to draw the correct shape. Now let’s look at the same logic written as an object-oriented code fragment:

    wind forEach: shape
        shape drawYourself

You can see that this piece of code is simpler and easier to understand. The code is shorter, more comprehensive, and more stable. It need never change. When a new shape is introduced, the drawYourself method is part of the new object, so this code continues to function. The code shown earlier would have to be changed for each new shape that is introduced.

The biggest problem with this new way of thinking and
programming is that it is anathema to developing efficient relational databases
and applications. Fundamentally, object-oriented tenets state that the methods
that impact the state of the variables of an object are encapsulated within the
object. In many cases, organizations try to implement object-oriented
programming with a relational DB2 back end to store the persistent data. One of
the guiding principles of OO is that methods—the procedures that give objects
their behavior—should be coded against just one object. Translated, this means
that each SQL statement should access one and only one table. This might
simplify the design and development process. It also fosters adherence to the
principle of encapsulation. However, it is the wrong way to write relational
queries.

A relational database has a relational optimizer. And DB2
has the best optimizer technology in the business. The relational optimizer
analyzes complex SQL and determines the most efficient way to access the data
based on the request, the database objects, and the environment. To best
optimize relational applications, you should code the data access using all of
the features and functionality available to you in SQL. This includes inner
joins, outer joins, unions, and subselects: all features that act upon multiple
tables and would not be allowed in most OO-to-DB2 implementations. If your
adherence to an OO philosophy, methodology, or language prohibits accessing more
than one table per SQL statement, you are building inefficient relational
applications.

Synopsis
Object-oriented programming potentially offers some phenomenal benefits in terms of reduced development and maintenance time. However, if you are using DB2 for your databases, before implementing an OO methodology or language at your shop, be sure to analyze the impact of any trade-offs that would cause you to forsake good relational programming and development techniques. Unless your application is simple or performance doesn’t matter (and when is either of these true?), these trade-offs are probably detrimental to the performance of your DB2 applications.
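
The one-table-per-statement restriction discussed earlier can be made concrete with a small sketch. It uses Python's built-in sqlite3 module purely as a runnable stand-in; SQLite is not DB2, its optimizer is different, and the dept and emp tables with their columns and rows are hypothetical. The sketch shows only the shape of the two coding styles, not any performance measurement.

```python
import sqlite3

# In-memory SQLite stands in for DB2 so that the sketch can run anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER PRIMARY KEY, deptname TEXT);
    CREATE TABLE emp  (empno INTEGER PRIMARY KEY, lastname TEXT, deptno INTEGER);
    INSERT INTO dept VALUES (10, 'PLANNING'), (20, 'OPERATIONS');
    INSERT INTO emp  VALUES (1, 'HAAS', 10), (2, 'KWAN', 20), (3, 'STERN', 20);
""")


def one_table_per_statement(conn):
    """Each statement touches one table; the program performs the join itself."""
    result = []
    statements = 0
    emps = conn.execute(
        "SELECT lastname, deptno FROM emp ORDER BY empno"
    ).fetchall()
    statements += 1
    for lastname, deptno in emps:
        (deptname,) = conn.execute(
            "SELECT deptname FROM dept WHERE deptno = ?", (deptno,)
        ).fetchone()
        statements += 1
        result.append((lastname, deptname))
    return result, statements


def single_join(conn):
    """One statement; the database decides how to combine the two tables."""
    rows = conn.execute(
        "SELECT e.lastname, d.deptname "
        "FROM emp e INNER JOIN dept d ON e.deptno = d.deptno "
        "ORDER BY e.empno"
    ).fetchall()
    return rows, 1


rows_a, count_a = one_table_per_statement(conn)
rows_b, count_b = single_join(conn)
print(rows_a == rows_b, count_a, count_b)  # True 4 1
```

Both functions return the same rows, but the first issues one statement per employee plus one more, and the access path is fixed by the program rather than chosen by the optimizer.
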

Sidebar: OO Terminology

Abstract Data Type: a data type that is not built into the programming language but is defined by the programmer; usually, ADTs are used to build high-level, complex structures that model real-world objects

Behavior: the way that an object functions and changes over time

Class: a template for an object that defines the methods and variables for the object; all objects of a specific class have the same structure and exhibit the same behavior

Class Hierarchy: a tree structure that defines relationships among classes; each class hierarchy has a single top node and potentially many nodes under the top node, along different branches; a child class in the hierarchy inherits the parent class’s variables and methods

Encapsulation: the technique of combining data together with process in a single, common area; this creates an environment in which all of the operations for a given set of data are organized and maintained in one place, thereby reducing confusion, eliminating misuse, and simplifying maintenance

Inheritance: the mechanism whereby classes can make use of the structure and methods defined in all classes above them in the class hierarchy

Message: a request to an object to perform a method

Method: a process encapsulated within an object

Object: a representation of a real-world thing, encapsulating the data and all of its procedures (processes) within itself

Polymorphism: the ability to send the same message to objects of different classes and have each class perform a method in its own way

State: the "make up" of an object at a given point in time; the actual values stored in the variables
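
Several of the terms above (class, class hierarchy, inheritance, message, polymorphism, and state) can be seen together in one short Python sketch modeled on the article's shape-drawing fragment. The class names, the draw_yourself and describe methods, and the returned strings are hypothetical, chosen only for illustration.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Top node of a small class hierarchy."""

    def __init__(self, name: str) -> None:
        self.name = name  # state: the value held by this particular object

    def describe(self) -> str:
        """Inherited unchanged by every child class."""
        return f"shape:{self.name}"

    @abstractmethod
    def draw_yourself(self) -> str:
        """Each child class supplies its own method for this message."""


class Circle(Shape):
    def draw_yourself(self) -> str:
        return "drawing a circle"


class Rectangle(Shape):
    def draw_yourself(self) -> str:
        return "drawing a rectangle"


class Triangle(Shape):
    def draw_yourself(self) -> str:
        return "drawing a triangle"


window = [Circle("c1"), Rectangle("r1"), Triangle("t1")]

# Polymorphism: the same message goes to every object, with no branch on type.
drawn = [shape.draw_yourself() for shape in window]
print(drawn)
print(window[0].describe())  # inherited from Shape
```

Adding a new kind of shape means adding one more child class; the loop over the window does not change.
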

© 1999 Mullins Consulting, Inc. All rights reserved.