A New Day for the Database
By
Craig S. Mullins
Challenges include increased
complexity and the need to integrate Web and wireless functionality
Database management systems (DBMSs)
are the heart of the modern IT infrastructure. Enterprises expect the DBMS to be
reliable, secure and available. In short, the DBMS is what makes modern
applications sing.
Modern database systems, however, are
becoming increasingly large and complex. These days, databases store not only
characters, numbers and dates, but complex unstructured data types like audio,
video, images and compound documents. Then there's the rapid and sometimes
out-of-control growth of data warehouses spurred by the business intelligence
needs of many enterprises.
Besides housing data, databases store
processes that act upon that data. Stored procedures, triggers and user-defined
functions managed by the DBMS place new requirements on database administrators
(DBAs). In addition, more than ever before, databases are being placed on more
diverse platforms -- from mainframes to workstations, and even personal digital
assistants (PDAs). Finally, databases are increasingly being connected to the
Internet to enable e-business applications. All of these trends and new
technologies further complicate database management and impact the jobs of DBAs.
The Internet and Databases
In the world of e-business, full
functionality and non-stop availability are expected. Regardless of the time or
date, customers expect an e-business to be up and running, available to serve
their needs. Why should customers wait for your site to come back up when
numerous competitors are available just a mouse-click away?
If an e-business isn't prepared to
serve customers 24 hours a day, it risks losing business to more accessible
competitors. System outages, whether planned (for maintenance and tuning) or
unplanned (due to hardware failure, bugs or viruses), are the enemy of the
successful e-business.
Many administrative tasks for
Web-enabled databases differ from traditional DBA tasks. Let’s refer to these
expanded DBA duties for e-business as eDBA.
The eDBA's first duty is minimizing
downtime. Analysts estimate that as much as 80 percent of application downtime
is due to software failures and human error. Downtime is caused by problems such
as improperly entered transactions, improperly timed batch runs or running a
program using the wrong input files or parameters. DBAs can reverse the effects
of application failures using the database log and high-speed transaction
recovery solutions, which enable recovery from application failures without
taking a database outage at all – thereby bolstering the database for 24x7
e-business.
For the eDBA, the concept of downtime
requires a shift in thinking. Though it's important to plan for recovery from
unplanned outages, remember that planned outages occur more frequently and
therefore can have a greater impact on overall availability than unplanned
outages. Examples of planned outages include software upgrades, database changes
and regularly scheduled maintenance tasks like reorganization.
Of course, the best way to reduce
downtime is to avoid it. Modern database systems are gaining features to help
avoid downtime altogether by enabling change, optimization and management tasks
to be performed while databases remain online. One example is concurrent
database reorganization. Utilities can reorganize data to a mirror copy,
swapping the copies when the reorganization process completes. Because the
database can stay online during the process, downtime is eliminated. These
techniques require more disk space, but won't disrupt an online business.
Emerging technologies are becoming available to perform other tasks such as
backing up and loading databases or changing system parameters while the
databases remain online for read and update.
Another way to minimize downtime is to
automate routine maintenance tasks. For example, changing the structure of a
table can be an arduous task. The structure of relational databases can be
modified using the ALTER statement, but ALTER cannot be used for every type of
change. Database change-management tools allow you to make desired changes to a
relational database using an online interface. The tool automatically generates
scripts that understand the correct way to invoke database changes. When errors
are avoided using automation, downtime is diminished, resulting in greater
online availability.
Sometimes downtime is unavoidable. In
these cases, the DBMS tries to minimize downtime by staying current with DBMS
versions that feature better-optimized code and provide new data-management
techniques.
Java and XML
While downtime is the most important
issue faced when coupling databases to the Internet, eDBAs also need to be
prepared to work with key Internet-related technologies like Java* and XML.
As applications are moved to the Web,
Java gains in popularity. Two methods of accessing relational data from a Java
program are JDBC and embedded static SQL for Java (SQLJ). JDBC is an API that
enables Java to access relational databases. Similar to ODBC, JDBC consists of a
set of classes and interfaces that can be used to access relational data. JDBC
provides dynamic SQL access to relational databases. Using JDBC, theoretically
at least, you should be able to write an application for one platform, say
Oracle9i on a Sun* Solaris* platform, and then deploy it on another platform,
say DB2* UBD on a zSeries server. Simply by using the correct JDBC drivers for
the database platform, the application should be portable. With SQLJ, a
translator must process the Java program. For the DB2 literate, this is just
like precompiling a COBOL program. The translator strips the SQL from the Java
code so it can be optimized into a database request module. It also adds Java
code to the Java program, replacing the SQL calls. Now the entire program can be
compiled into bytecodes, and a bind can be run to create a package for the SQL.
So, should you use JDBC or SQLJ? Of
course, the answer is "it depends!" SQLJ potentially can enhance performance
using static SQL, which can be important for Java. SQLJ is similar to embedded
SQL programs; JDBC is similar to call-level interface programs. The familiarity
of your developers with either approach could make one method the best choice.
So, before your shop moves forward with Java development, be sure that your DBAs
have been trained in Java and understand the differences between JDBC and SQLJ.
Another significant trend in the
database market is XML. Like HTML, XML is based on standard generalized markup
language (SGML), which allows documents to be self-describing through the
specification of tag sets and the structural relationships between the tags.
HTML is a small, specifically defined set of tags and attributes, enabling users
to bypass the self-describing aspect for a document. XML, on the other hand,
retains the key SGML advantage of self-description, while avoiding the
complexity of full-blown SGML.
So what? Well, XML allows users to
define tags that describe the data in the document. With this capability, users
can describe the structure and nature of the data in the document. In essence,
the document becomes self-describing. XML's simple syntax is intended to make it
easy to process by machine while remaining understandable to users. In short,
because XML documents are self-describing, the data is easily understood and
used by the receiver. Thus, it's increasingly being used to transfer data.
A typical use of XML is to enable the
DBMS to parse XML and move it in and out of an SQL database. Another trend is
native XML database systems that operate on XML data without being transformed
into a relational structure. XQuery support is being added to database systems
to enable users to query native XML within the database.
Procedural DBA
Until recently, DBMSs solely stored,
managed and accessed data. Although these core capabilities are still required,
modern DBMS products also integrate procedural logic like triggers, stored
procedures and user-defined functions (UDFs).
Procedural database logic offers the
primary benefit of promoting reusability. Instead of replicating code within
multiple application programs, code can reside in a single place: the database
server. This is preferable to cannibalizing sections of program code for each
new application that must be developed. An additional benefit is increased
consistency. If every user and every database activity with the same
requirements is assured of using the database logic instead of multiple,
replicated code segments, then the organization can be assured that everyone is
running consistent code.
Although the functionality provided by
triggers, stored procedures and UDFs is unquestionably useful, these objects
pose major administration challenges. DBAs must administer, design, manage and
sometimes even code these objects, even though code testing and debugging isn't
a typical role for DBAs. Database experts cannot be expected to debug code
written in C or COBOL. Companies may rely on DBAs to be the SQL experts in the
company, but often they're not experts – at least not “programming” experts.
Simply because the DBA knows the best way to create a physical database design
doesn't mean he'll know how to best access that data.
A new type of DBA – a procedural DBA –
is required to accommodate procedural logic administration. Procedural DBAs
require both database and programming skills. They should be responsible for
database-related programming activities, with administering stored procedures,
triggers and UDFs among their primary responsibilities. Whether these objects
are actually programmed by the procedural DBA will depend on the size of the
shop, the number of DBAs available and the scope of implementation. Minimally,
the procedural DBA should lead code reviews and manage the use (and reuse) of
database procedural logic. Additionally, the procedural DBA must be on-call in
the event of a stored procedure, trigger or UDF failure.
Other procedural administrative
functions can be allocated to the procedural DBA, including application code
reviews, access path review and analysis, SQL debugging and complex SQL
analysis. These are areas in which many DBAs are inadequately trained. It's a
distinctly different skill to program than it is to create well-designed
relational databases. Yet, DBAs quickly learn that they must be able to
understand efficient application programming techniques. Offloading some of
these tasks to the procedural DBA will enable the traditional, data-oriented
DBAs to concentrate on the actual physical design and implementation of
databases. This should result in much better designed databases and better
overall performance.
The PDA DBA
Mobile workers are here to stay. And
DBAs must be ready to support them with a valid, shared data infrastructure.
While a PDA database shouldn't require
the in-depth tuning and administration required of enterprise databases, the DBA
must be involved in the initial design of PDA databases to ensure they're
optimal for the handheld environment. But design is a minimal concern. The big
impact is in planning for and managing the data synchronization from hundreds or
thousands of PDAs. When should synchronization be scheduled? How will it impact
applications that use large production databases that are involved in the
synchronization? How can you ensure that a mobile user will synchronize his data
reliably and on schedule?
These aren't minor issues. Before
enabling a large contingent of PDA database users, the DBA staff must be
prepared for the impact to enterprise databases. The primary impact will be
analyzing and understanding data synchronization technology, as well as
evaluating the need for remote database users within the organization. Which
application systems will be impacted first? Typically those with remote workers
such as sales systems or delivery tracking are likely to be first. Take some
time to review the data requirements of those applications and how a large
influx of remote connections might impact the current systems.
The major DBMS vendors offer small
footprint versions of their flagship products to run on PDAs and other handheld
devices. (IBM's offering, for example, is DB2 Everyplace.) They're geared for
storing a small amount of critical data that's later synchronized to long-term
data stores.
The Speed of Change
A final overriding database trend is
the overall speed of change. Most organizations cannot implement new DBMS
versions as fast as the vendors deliver them. DBMS vendors are on a 12- to
24-month cycle for releasing major new version of their products. And in between
there are a bevy of PTFs and fixes. Keeping up-to-date is a major struggle.
Indeed, many users are still running
old versions of their DBMS because of the migration and testing required to move
from one version to the next. This is so even when the vendors make an effort to
simplify the process. Programs need to be run to ensure that performance is
similar for the new release, sometimes data needs to be migrated to a new
format, and older features are increasingly being dropped from new releases,
which can require reprogramming functional programs and databases.
The Role of Automation
Database management is getting more
complex which in turn increases the skill-level required to successfully
administer databases. Today’s DBA must be skilled not just in database design
and tuning, but in Internet/Web technology, procedural logic, complex data
types, pervasive computing, change management and other areas.
Indeed, a successful DBA must possess
more skills than most technicians can master. However, intelligent automation
can help address this problem. IT professionals spend their career delivering
systems that automate everyone else’s job, but we have yet to intelligently
automate most of our DBA tasks. Automating some of our tedious day-to-day tasks
can free up time to learn about new DBMS features – and allow us to implement
them appropriately.
But simple automation is not
sufficient. The software should be able to intelligently monitor, analyze and
optimize applications using past, present and future analysis of collected data.
Simply stated, the software should work the way a consultant works – fulfilling
the role of a trusted advisor.
By deploying intelligent automation,
the underlying complexity of the database environment can be somewhat
simplified, enabling the DBA to successfully manage the modern database
environment. In this day and age, anything less is a recipe for failure.
From eServer Magazine, Mainframe Edition,
September 2002.
© 2002 Craig S. Mullins, All rights reserved.
Home.
|