When speaking about distributed DB2 data, it is necessary to speak about DRDA. DRDA stands for Distributed Relational Database Architecture. It is an architecture, developed by IBM, that enables relational data to be distributed among multiple platforms. Both like and unlike platforms can communicate with one another. For example, one DB2 subsystem can communicate to another DB2 subsystem (like). Alternately, a DB2 subsystem can communicate with a third-party RDBMS (unlike). The platforms need not be the same. As long as they both conform to the DRDA specifications, they can communicate. DRDA can be considered a sort of “universal, distributed data protocol.”
This article will describe DRDA. Keep in mind that no vendor, not even IBM, has implemented a RDBMS that fully supports all DRDA functionality.
What is DRDA?
DRDA is a set of protocols, or rules, that enable a user to access distributed data regardless of where it physically resides. It provides an open, robust heterogeneous distributed database environment. DRDA provides methods of coordinating communication among distributed locations. This allows applications to access multiple remote tables at various locations and have them appear to the end-user as if they were a logical whole.
A distinction should be made, however, between the architecture and the implementation. DRDA describes the architecture for distributed data and nothing more. It defines the rules for accessing the distributed data, but it does not provide the actual application programming interfaces (APIs) to perform the access. So DRDA is not an actual program, but is more like the specifications for a program.
When a DBMS is said to be DRDA-compliant, all that is implied is that it follows the DRDA specifications. DB2 is a DRDA-compliant RDBMS product.
Benefits of DRDA
DRDA is only one protocol for supporting distributed RDBMS. Of course, if you are a DB2 user, it is probably the only one that matters.
The biggest benefit provided by DRDA is its clearly stated set of rules for supporting distributed data access. Any product that follows these rules can seamlessly integrate with any other DRDA-compliant product. Furthermore, DRDA-compliant RDBMSes support full data distribution including multi-site update. The biggest advantage, however, is that it is available today, and many vendors are jumping on the DRDA-compliance bandwagon.
An alternative to DRDA is to utilize a gateway product to access distributed data. Gateways are comprised of at least two components: one for each distributed location. These parts communicate with one another. As far as DB2 is concerned, a host-based gateway component is necessary. It functions as another mainframe DB2 application. Most gateway products that access DB2 execute using CICS (and sometimes VTAM). Gateways, however, typically support dynamic SQL only.
Therefore, two more advantages of DRDA surface in the performance arena:
What About RDA?
Although DRDA is the distributed architecture utilized by DB2, it is not the only architecture in the industry. RDA (Remote Database Access) is a competing set of protocols developed by the ISO and ANSI standard committees.
As a DB2 developer, DRDA will be the method you use to implement distributed data with DB2. However, knowing a bit about RDA can not hurt.
-
RDA was built to work with a standard subset of SQL that is available from DBMS to DBMS. DRDA was built to function with platform-specific extension to SQL.
-
Static SQL can be used with DRDA; with RDA only dynamic SQL is available.
DRDA Functions
Three functions are utilized by DRDA to provide distributed relational data access:
These three functions inter-operate with on another to enable distributed access. Let’s further examine these three functions.
Application Requester
The DRDA application requester (AR) function enables SQL and program preparation requests to be requested by application programs. The AR accepts SQL requests from an application and sends them to the appropriate application server (or servers) for subsequent processing. Using this function, application programs can access remote data.
In theory, if all of the data that you are interested in is physically located somewhere else (i.e., remote), there may be no need for a local RDBMS, and DRDA does not require the requester to run on a system with a local RDBMS.
For the DB2 Family, the DRDA AR function is implemented using DB2 Connect.
Application Server
The DRDA application server (AS) function receives requests from application requesters and processes them. These requests can be either SQL statements or program-preparation requests. The AS acts upon the portions that can be processed and forwards the remainder to DRDA database servers for subsequent processing. This is necessary if the local RDBMS cannot process the request.
The AR is connected to the AS using a communication protocol called the Application Support Protocol. The Application Support Protocol is responsible for providing the appropriate level of data conversion. This is only necessary when different data representations are involved in the request. An example of this is the conversion of ASCII characters to EBCDIC (or vice versa).
Database Server
The DRDA database server (DS) function receives requests from application servers or other database servers. These requests can be either SQL statements or program preparation requests. Like the application server, the database server will process what it can and forward the remainder on to another database server.
It is important to note that a database server request may be for a component of an SQL statement. This would occur when data is distributed across two subsystems and a join is requested. The join statement is requesting data from tables at two different locations. As such, one portion must be processed at one location; the other portion at a different location.
Because the database servers involved in a distributed request need not be the same, the Database Support Protocol is used. It exists for the following reasons:
Like the Application Support Protocol, the Database Support Protocol is used to ensure compatibility of requests between different database servers.
What is Returned
When a request is completely processed, the application server must inform the requesting process, the application requester. How is this accomplished?
The AS passes a return code and a result set (if one was produced) back to the AR. The return code is the SQLSTATE (or SQLCODE). A result set is not generated under the following circumstances:
This protocol is used unless a cursor is employed. When rows are fetched from a read-only cursor limited block protocol can be used. Limited block protocol passes multiple rows across the network at a time, even though one fetch can process only a single row at a time. Limited block protocol enhances overall performance by minimizing network traffic. If the cursor is not read-only (that is, rows can be updated) limited block protocol is not employed.
DRDA Architectures and Standards
In order for DRDA to exist, it relies upon other established protocols. These architectures are examined in the following sections.
Advanced Program to Program Communication (APPC)
Advanced Program-to-Program Communication provides peer-level communication support based on LU 6.2 protocols. LU 6.2 is an advanced communication architecture that defines the formats and protocols for communication between functionally equivalent logical units.
APPC/LU 6.2 provides communication and transaction processing facilities needed for cooperative processing and distributed transaction processing.
Distributed Data Management (DDM)
The Distributed Data Management architecture defines facilities for accessing distributed data across a network using APPC and LU 6.2. With DDM, the distributed data to be accessed can reside in either files or relational databases. An RDBMS is implied, however, within the context of DRDA.
Formatted Data: Object Content Architecture (FD:OCA)
FD:OCA is an architecture that provides for the distribution and exchange of field-formatted data. Using FD:OCA, both the data and its description are packaged together so that any DRDA-compliant DBMS can understand its structure and content.
Character Data Representation Architecture (CDRA)
Character Data Representation Architecture is the architecture utilized to ensure that any symbol or character used on any SAA relational DBMS has the same meaning regardless of the underlying coded character set. CDRA provides a method of unambiguously identifying data from any SAA platform.
CDRA is necessary particularly when data is transferred between a PC workstation (using ASCII code) and a mainframe (using EBCDIC code). Theoretically, CDRA can be extended to support other codes (such as Unicode, a new character encoding scheme gaining support).
The Five DRDA Levels
There are five levels within DRDA. Each level represents an increasing level of distributed support. Additionally, the levels reflect:
In order of increasing complexity, the five DRDA levels are:
-
User-Assisted Distribution
-
Remote Request
-
Remote Unit of Work (RUW)
-
Distributed Unit of Work (DUW)
-
Distributed Request
Table 1 offers a synopsis of the DRDA levels.
Table 1. The Five DRDA Levels
DRDA Level |
SQL stmts
per UOW |
DBMSs
per UOW |
DBMS per
SQL stmt |
User-Assisted |
- |
- |
- |
Remote Request |
1 |
1 |
1 |
Remote Unit of Work |
>1 |
1 |
1 |
Distributed Unit of Work |
>1 |
>1 |
1 |
Distributed Request |
>1 |
>1 |
>1 |
The result of moving up the levels is additive. For example, distributed request capability implies distributed unit of work (which in turn implies remote unit of work). The reverse, however, is not implicitly true.
Let's move on to discuss these levels at greater length.
User-Assisted Distribution
User-assisted distribution is the simplest form of data distribution. However, under this DRDA level, the end user is aware of the distribution and participates in accomplishing the distributed access. To accomplish user-assisted distribution, the user must:
This is an intensive procedure that should not be taken lightly. As it involves replicated data, care must be taken to document the system of record and the date of extraction in case future modification is permitted.
Even given its many limitations, user-assisted distribution is useful for producing snapshot tables and satisfying one-time requests. However, to many, user-assisted distribution is not truly distributed data access. I tend to agree with them.
Often-times, user-assisted distribution is not even included in a formal discussion of DRDA. However, I include it here for completeness.
Remote Request
Remote request is the first level of true distribution within DRDA. When a DBMS support DRDA remote request capability, a single SQL statement can be issued to read or modify a single remote RDBMS within a single unit of work.
Simply stated, remote request enables developers to be operating within one RDBMS, and refer to a different RDBMS. Furthermore, it is possible to utilize remote request capability to access a remote RDBMS, even if a local RDBMS is not being used.
DRDA remote request provides the capability of issuing only one SQL request per unit of work, and only one RDBMS per SQL request.
Remote Unit of Work
The remote unit of work (RUW) DRDA level adds to the functionality of remote request. RUW allows multiple SQL statements. However, the SQL can only read and/or modify a single remote RDBMS within a single a unit of work.
To clarify, within the scope of a commit, RUW can access only one RDBMS.
So, DRDA remote unit of work provides the capability of issuing multiple SQL requests per unit of work, but still can access only one RDBMS per SQL request.
Distributed Unit of Work
Distributed unit of work (DUW) builds onto the functionality of remote unit of work. More than one RDBMS can be accessed per unit of work.
Simply stated, DRDA DUW enables multiple SQL statements to read and/or modify multiple RDBMSs within a single unit of work. However, only one RDBMS can be specified per SQL statement.
As with any unit of work, all of the SQL statements within the commit scope either succeed or fail. This requires a two-phase commit protocol to be established. Distributed two phase commit is functionally equivalent to the two phase commit DB2 performs when executing under CICS or IMS/TM. When a DUW program issues a COMMIT, the two-phase commit protocol must synchronize the commit across all affected platforms.
Distributed Request
DRDA distributed request capability enables complete data distribution. Using distributed request, the DUW restriction of one RDBMS per SQL statement is removed. Additionally, multiple SQL requests, both distributed and non-distributed can be contained within a single unit of work.
Simply stated, distributed request enables a single SQL statement to read and/or update multiple RDBMSs at the same time.
There are no RDBMS products that currently provide DRDA distributed request capability.
Putting It All Together
Consider a scenario where three remote processing locations are set up, each with a RDBMS: Pittsburgh, Chicago, and Jacksonville. Let’s examine how each of the four DRDA options could access distributed data from these locations.
Consider a situation whereby we need to access specific columns from tables at each remote location. There are four tables total: two in Pittsburgh and one each in Chciago and Jacksonville. Furthermore, assume that the requests are emanating from Chicago. In a remote request scenario, we can access only a single RDBMS from a single location in a single unit of work. The request to the Chicago table is a local request; the Pittsburgh and Jacksonville requests are remote. Each request is within a single unit of work (indicated by a COMMIT). There are four total UOWs, one each for Jacksonville and Chicago, and two for Pittsburgh, one for each SELECT to the two table that exist in Pittsburgh.
Now contrast remote unit of work functionality with remote request. Instead of a single statement per unit of work, multiple statements can be issued. So, the Pittsburgh request can now consist of both SELECT statements (one for each table) within the same unit of work. With RUW we have reduced the total UOWs to three (from four).
Next, distributed unit of work enables multiple RDBMSes per unit of work. All four tables from all three locations can be accessed within one unit of work using DRDA DUW functionality. We now are down to a single UOW.
Finally, we have distributed request. Using distributed request, multiple RDBMSes from multiple locations can be accessed using a single SQL statement. In this scenario, the application requester sends a request to the Chicago application server, which in turn sends the request to the Chicago database server. It processes what it can and passes it to one of the other database servers (in, say, Pittsburgh), and so on. With distributed request not only do we have only one UOW, we can have one SQL statement joining tables across locations.
Distributed DB2
Keep in mind, though, that this excerpted chapter covers the DRDA framework only. It has not discussed actual implementation in DB2, which only implements partial DRDA support.
|