|
A DB2 for z/OS Performance Roadmap
By
Craig S. Mullins
Assuring
optimal performance is one of a database administrator's (DBA's) biggest ongoing
challenges. The loudest complaints about poor performance come from the users
who must wait longer than they are used to waiting for their applications to
respond. No one likes waiting, especially one who's unaccustomed to waiting.
But
what causes those once-fast applications to stall and deliver sub-par
performance? If there were an easy answer to that question, many DBAs would be
out of work. There's no short answer to this question, either. But, as a
starting point, let’s examine the fundamentals of performance management and
optimization and come up with a roadmap for tuning DB2 databases and
applications.
Every
database application, at its core, requires three components in order to
operate: the system, the database and the application. To deliver performance,
the DBA must be able to monitor and tune each of these components. This is
easier said than done.
Tuning the System
The
system
consists of the system software and hardware required for the application to
provide service. This includes the computer itself, its disk subsystems, network
connections and all peripherals. From a software perspective, the system
includes the operating system, the file system, the DBMS itself, networking
protocols and any related middleware such as transaction processors or message
queues.
To
deliver consistent system performance, the DBA must have the resources to
monitor, manage and optimize the performance of these disparate pieces of
hardware and software. Some of the tasks required for system tuning include the
proper allocation and management of memory structures (e.g., buffer pools,
program cache area, etc.), storage management, integration of the DBMS with
other system software, proper usage of database logs, and coordination of the
operating system resources used by the DBMS. Additionally, the DBA must control
the installation, configuration and migration of the DBMS software. If the
system isn't performing properly, everything that uses the system will perform
poorly. In other words, a poorly performing system impacts every database
application.
When managing DB2 system
performance, the DBA's first job is ensuring that all of the allied agent
address spaces are tuned and connected properly. This includes, but is not
limited to, CICS, IMS/TM, TSO, WebSphere, and DB2 Connect. If not configured and
tuned appropriately, all of these system software components interact with, and
can affect the performance of, DB2 applications.
For
example, when DB2 data is accessed using CICS, multiple threads can be active
simultaneously, giving multiple users concurrent access to a DB2 subsystem of a
single CICS region. A mechanism named the CICS Attach Facility connects CICS
with DB2. Using the CICS Attach Facility, you can connect each CICS region to
only one DB2 subsystem at a time. You can connect each DB2 subsystem, however,
to multiple CICS regions simultaneously. DB2 provides services to CICS via MVS
TCBs. All of these TCBs reside in the CICS address space and perform
cross-memory instructions to execute the SQL code in the DB2 database services
address space (DSNDBM1).
Furthermore,
the resource control table (RCT) must be configured for each DB2 program that
will run under CICS. The RCT applies only to CICS transactions that access DB2
data; it defines the manner in which DB2 resources will be used by CICS
transactions. In particular, the RCT defines a plan for each transaction that
can access DB2. Additionally, it defines parameters detailing the number and
type of threads available for application plans and the DB2 command processor.
DB2
DBAs also must ensure that appropriate DB2 system parameters are set using DB2
commands and DSNZPARMs. One of the most important areas for tuning here is
memory usage. DB2 uses memory for buffer pools, the EDM pool, RID pool and sort
pools to cache data and structures in memory. The better memory is allocated to
these structures, the better DB2 will perform.
When
allocating DB2 buffer pools, keep these rules of thumb in mind:
-
Don't
allocate everything to a single buffer pool (e.g., BP0); use a multiple
buffer pool strategy.
-
Explicitly
specify a buffer pool for every table space and index.
-
Isolate
the DB2 Catalog in BP0; put user and application DB2 objects into other
buffer pools.
-
Consider
separating indexes from table spaces with each in their own dedicated buffer
pools.
-
Consider
isolating heavily hit data into its own buffer pool to better control
performance.
-
Consider
isolating sorts into a single buffer pool and tuning for mostly sequential
access (e.g. BP7).
-
Consider
separating DB2 objects into separate buffer pools that have been configured
for sequential verses random access.
Forget
about trying to follow a cookie-cutter approach to buffer pool management. Every
shop must create and optimize a buffer pool strategy for its own data and
application mix. DB2 offers the following buffer pool tuning “knobs” that
can be used to configure virtual buffer pools to the type of processing they
support:
-
DWQT
–this value is the deferred write threshold; it is expressed as a
percentage of the virtual buffer pool that might be occupied by unavailable
pages. When this threshold is reached DB2 will start to schedule write I/Os
to externalize data. The default is 50%, which is likely to be too high for
most shops.
-
VDWQT
– this value is the vertical deferred write threshold; it is basically
the same as DWQT, but for individual data sets. The default is 10%, which
once again is quite likely to be too high for many shops.
-
VPSEQT
– this value is the sequential steal threshold; it is a expressed as a
percentage of the virtual buffer pool that can be occupied by sequentially
accessed pages. Tune buffer pools for sequential access such as scans and
sorting by modifying VPSEQT to a larger value. The default is 80%.
-
VPPSEQT
– this value is the sequential steal threshold for parallel
operations; the default value is 50%.
-
VPXPSEQT
– this value is assisting parallel sequential threshold; it is
basically the VPPSEQT for operations from another DB2 subsystem in the data
sharing group.
These
parameters can be changed using the ALTER BUFFERPOOL command. Additionally,
hiperpools can be created to back up DB2 virtual buffer pools with additional
memory. DB2 provides several tuning knobs for hiperpools, too, including HPSIZE
to adjust the size of hiperpools and HPSEQT to adjust the hiperpool sequential
steal threshold.
The
EDM pool is used for caching internal structures used by DB2 programs. This
includes DBDs, SKCTs, CTs, SKPTs, and PTs. It also includes the authorization
cache for plans and packages, as well as the cache for dynamic SQL mini-plans.
As a general rule, you should shoot for an 80 percent hit rate with the EDM
pool; this means that only one out every five times should a structure need to
be loaded from disk into the EDM pool.
Finally,
remember that buffer and EDM pool tuning are in-depth subjects that cannot be
adequately covered in a high-level article such as this. Additionally, there is
much more to proper DB2 system performance tuning than allied agent and memory
tuning. Other system elements requiring attention include locking, logging, and
Parallel Sysplex configuration and management for DB2 data-sharing shops.
Tuning
the Databases
The
second component of DB2 performance tuning is making sure the database
is optimally created and maintained. The database stores the data that is used
by the application. When the application needs to access data, it does so
through DB2 to the database of choice. If the database is not optimally
organized or stored, the data it contains will be difficult or slow to access.
The performance of every application that requires this data will be negatively
impacted.
The
first component of database optimization is assuring an optimal database design.
Database design is the process of transforming a logical data model into a
physical database design and then implementing the physical model as an actual
database. Proper database design requires up-front data modeling and
normalization – in other words, a logical data model is required before you
can even begin to design a physical database. Assuming a logical model exists,
it should be used as the basis for creating the physical database. The physical
database is created by transforming the logical data model into a physical
implementation based on an understanding of the DBMS to be used for deployment.
But a one-to-one mapping of logical entity to physical table is unlikely to
result in an optimal physical database design.
Successfully
developing a physical database design requires a good working knowledge of DB2's
features. More precisely, the DBA must possess in-depth knowledge of physical
database objects supported by DB2 as well as the physical structures and files
required to support those objects. This must include knowledge of the manner in
which DB2 supports indexing, referential integrity, constraints, data types and
other features that augment the functionality of database objects. Armed with
the correct information, the DBA can create an effective and efficient database
from a logical data model.
The
DBA creating the physical DB2 database implementation should keep these rules of
thumb in mind when building the databases:
-
Keep
the physical database as normalized as possible. However, performance should
win out over aesthetics. In other words, don’t let data modelers dictate
“physical” design. If you need to reorder columns in the table to
optimize logging, do it.
-
In
general, put each table in its own table space. Exceptions can be made for
very small tables such as code and reference tables.
-
In
general, favor partitioned and segmented table spaces over simple table
spaces. And don’t specify a DSSIZE greater than 4 GB unless you really
need a large table space (doing so will waste space).
-
Do
not create base table views.
-
Use
NULL sparingly.
-
Use
appropriate DB2 data types (e.g., use DATE for dates instead of CHAR or
numeric data types).
-
Consider
using DB2 compression instead of using VARCHAR columns. With compression,
there is less overhead and no programmatic handling is required.
-
Favor
using DB2 declarative RI instead of programmatic RI. It will usually perform
better and is easier to administer. However, it's generally a bad idea to
use RI for lookup and reference tables.
-
Avoid
the DDL defaults – they are usually wrong. Explicitly code every single
DDL option when creating DB2 objects.
-
Calculate
the appropriate amount of free space for both PCTFREE and FREEPAGE based on
the frequency of modification. Do not just default every table space to 10
percent free space. DB2 DBAs
should always keep in mind that a proper indexing strategy can be the
primary factor to ensure optimal performance of DB2 applications.
To
create appropriate indexes, the DBA must have the SQL that is to be used against
the DB2 objects in order to understand the access patterns. But many times the
DBA is called upon to create indexes well in advance of the creation of any SQL.
This can be a challenging dilemma – how to create indexes to optimize SQL
performance without any knowledge of the SQL. DBAs can develop some basic rules
of thumb though. These steps can be used as a basic roadmap for DB2 index
creation:
-
First,
take care of unique and primary key constraints by creating unique indexes
on those columns. Then consider creating indexes for each foreign key to
optimize RI.
-
The
next step is to examine the most heavily used queries. Look at the
predicates and build indexes to support these queries. Consider overloading
indexes by adding columns to encourage index only access.
-
Next,
examine queries that invoke sorts. Look for ORDER BY, GROUP BY, UNION, and
SELECT DISTINCT. Consider building indexes to support these clauses so DB2
can avoid initiating an expensive sort. Look at your indexes and make sure
you've chosen the first column wisely. In general, the first column of the
index should have a high cardinality.
-
In
general, consider avoiding indexes on variable columns because DB2 will
expand the VARCHAR column to its full length in the index. After coming up
with a “first stab” indexing strategy consider the insert, update, and
delete implications of those indexes. If the columns of those indexes must
be modified in any way, DB2 will incur additional overhead keeping the
indexes up-to-date.
Over
time, as data is modified and updated, DB2 will have to move the data around
within the database. Such activity causes the data to become fragmented and
inefficiently ordered. The longer the database remains online and the more
changes made to the data, the more inefficient database access can become. To
overcome disorganized and fragmented databases, the DBA can run a reorganization
utility to refresh the data and make the database efficient once again. But the
key to successful reorganization is to reorganize only
when the database requires it; instead, some companies over-reorganize by
scheduling regular database reorganization jobs to be run whether the database
is fragmented, or not. This wastes valuable CPU cycles.
But
reorganization is only one of many database performance tasks performed by the
DBA. Others include data set placement, partitioning for parallel access,
managing free space, and assuring optimal compression.
Tuning
the Applications
The
third, and final, component of database performance is the application itself. Indeed, as much as 80 percent of all
database performance problems are caused by inefficient application code. The
application code consists of two parts: the SQL code and the host language code
in which the SQL is embedded.
SQL
is simple to learn and easy to start using. But SQL tuning and optimization is
an art that takes years to master. Some general rules of thumb for creating
efficient SQL statements include:
-
Let
SQL do the work instead of the application program. For example, code an SQL
join instead of two cursors and a programmatic join.
-
Simpler
is generally better, but complex SQL can be very efficient.
-
Retrieve
only the columns required, never more.
-
Retrieve
the absolute minimum number of rows by specifying every WHERE clause that is
appropriate.
-
When
joining tables, always provide join predicates. In other words, avoid
Cartesian products.
-
Favor
using Stage 1 and Indexable predicates.
-
Avoid
sorting if possible by creating indexes for ORDER BY, GROUP BY, and DISTINCT
operations.
-
Avoid
black boxes – that is, avoid I/O routines that are called by programs
instead of using embedded SQL.
-
Avoid deadlocks by
updating tables in the same sequence in every program.
-
Issue
data modification statements (INSERT, UPDATE, DELETE) as close as possible
to the COMMIT statement as possible.
-
Be
sure to build a COMMIT strategy into every batch program that changes data.
Failing to COMMIT can cause locking problems.
Every
DBMS provides a method of inspecting the actual access paths that will be used
to satisfy SQL requests. The DBA must thoroughly understand the different types
of access paths, and know which ones are best in a given situation. For DB2, the
DBA must be able to interpret the output of the access path explanation produced
by EXPLAIN. This information is encoded in the PLAN_TABLE and must be
interpreted. To make matters more difficult, the PLAN_TABLE doesn't contain 100
percent of the information required to determine if the SQL will perform
efficiently. The DBA (and, indeed, programmers too) must be able to read the
PLAN_TABLE in conjunction with the SQL code, host language code, and information
from the DB2 catalog to judge the efficiency and effectiveness of each SQL
statement.
Host
language code refers to the application programs written in C, COBOL, Java,
Visual Basic or the programming language du jour. SQL statements are usually
embedded into host language code and it is quite possible to have finely tuned
SQL inside of inefficient host language code. And, of course, that would cause a
performance problem.
The Bottom Line
DBAs
must understand all three aspects of database performance management. The
system, the databases, and the applications all must be monitored and tuned to
assure efficient DB2 performance.
Furthermore,
the DBA must be able to identify performance problems as they occur. Once
identified, each problem must be analyzed to determine its cause. And only then
can a proper tuning strategy be deployed to rectify the problem. The guidelines
in this article can help, but remember – each performance management situation
is different and general rules of thumb may not apply.
This
article only touches the tip of the iceberg when it comes to DB2 performance
tuning and management. Database performance management is a complex and
ever-changing discipline that requires rigorous attention and focus to master.
And then, just when you seem to understand things, that new version of DB2 will
come out that changes everything! But this is good. It might make life more
difficult as you learn the latest DB2 nuances, but new releases invariably help
to make DB2 applications more efficient.
Good
luck with DB2 for z/OS and happy performance tuning!
From DB2 Update, December 2002.
© 2002 Craig S. Mullins, All rights reserved.
Home.
|