Craig S. Mullins |
|
April 2001 | |
Why Data Still Matters Over
the years, many technologies and marketers have claimed that data has
become irrelevant and that some “new and improved” technology,
technique, or ideology will replace data as the center of IT and data
processing. But it has yet to happen, and it never will happen,
either! One
pretender to the throne was (dare I say, is) object oriented
technology. The object proponents claim that objects, because they
encapsulate both data and the processes that manipulate the data, are
superior to data. This is pure fantasy. OO development works well for
pre-planned, recurring workloads. When the developer can plan and
implement all of the
methods required for the object, the OO way of doing things will work
fine. But is this really anyone’s idea of reality? Many
workloads and requests are unplanned. Ad hoc queries, OLAP, and data
mining require access to data in ways that were not originally devised
when the data (or object) was first created. A proper database design,
created from a logical data model, and implemented using a relational
database enables ad hoc access to data using SQL. If instead you had
to rely on the “OO way” you would have to develop new methods for
the objects to look at the data encapsulated therein in a different
way. The performance of the new methods is likely to be poor if the
data within the object must be accessed in very different ways. But
the matter gets further complicated if your unplanned data analysis
requires data encapsulated in multiple objects. The OO way would
require multiple methods to be invoked by each object that contained
the required data and then some program to cobble the results
together. The beauty of relational databases using SQL is their
flexibility and suitability to perform unplanned data gathering and
analysis, quickly and with minimal effort. And without redesigning the
database or writing complex new programs (methods). Furthermore,
with today’s relational database technology, code can reside in
multiple places: procedural code on the database server in the form of
stored procedures and user-defined functions, within active database
rules in the form of triggers and constraints, and on multiple
application tiers. Rigid conformance to encapsulated methods within
objects imposes a strict development methodology that may cause more
harm than good. The argument made by OO proponents goes something like this: First of all, an object is a more natural model for representing the real world. By adhering to OO tenets the potential for reuse is high. Furthermore, OO properties such as inheritance and polymorphism provide additional flexibility for application development, enabling the programmer to modify objects to suit the circumstances. The overall goal is to use OO development techniques to create reusable components that can be combined together to build applications. The
argument is compelling, but misleading. To whom is an object “more
natural,” and as compared to what? It is very natural for people to
relate to most data the way a relational database does, as rows and
columns – we do it every day on business forms, checkbook registers,
spreadsheets, phone books, and so on. Furthermore, it is quite
possible to create reusable components without adhering to the OO
philosophy or using OO development techniques. CBD (component-based
development) proponents are doing this very well today without rigidly
conforming to OO development rules. Now,
I don’t want to totally and completely dismiss object-oriented
technology, but it is within the realm of programming that it provides
benefit. Any benefits OO can provide to DBMS technology have already
been incorporated into most DBMS products (that is, extensible data
types). For OO to achieve any long-term success, a practical OO
development environment that interoperates with and maps to relational
databases must be established. And it must not compromise good data
modeling and relational database design practices in doing so. The
bottom line is that data has intrinsic value and it should not be
forced to co-exist with program logic (methods) before it can be
stored. Neither should limitations be placed on data that would
require complex object-oriented program logic to be written for every
data access requirement. Data
is a corporate asset and should be treated as such. There are myriad
benefits to modeling data. Persistent data stored in relational
databases is an elegant and useful way to support the planned and
unplanned needs of your organization. And it will continue to be so
for many years to come. From Database
Trends, April 2001. |
|