Craig S. Mullins |
|
December 2000 |
|
|
Using TIMESTAMP Columns Versus DATE and TIME Columns By Craig S. Mullins A frequent requirement of applications is to store date and time information in the of a DB2 table. The date and time data may be required for any number of reasons, such as:
Of course, this list is not exhaustive. There could be many other causes for requiring date and time data in DB2 tables. But the purpose of this article is really not to discuss why the date and time data is required, but to discuss the methods for storing this information in a DB2 table and the tradeoffs for choosing one method over another. DATE, TIME, and TIMESTAMP It is possible to store date and time data in the row of a DB2 table using two columns or one. The first method is to create one column of DATE data type and one column of TIME data type. For example: CREATE TABLE T1 (. . . NEW_DATE DATE NOT NULL WITH DEFAULT, NEW_TIME TIME NOT NULL WITH DEFAULT, . . .); The second method is to use a single TIMESTAMP column. For example: CREATE TABLE T1 (. . . NEW_TS TIMESTAMP NOT NULL WITH DEFAULT, . . .); Which is the better solution? Of course, the answer is, "it depends." Let’s examine "what" it depends on. Keeping It Simple With DATE and TIME you must use two columns. TIMESTAMP uses one column, which might simplify data modification. If the entire date and time is always changed each time the column is updated, then the TIMESTAMP solution should be easier to implement from a modification perspective. However, just the reverse may be true for data access or if the data and time ever need to be changed independent of one another. If you need to access the time component separately from the date component, it will be necessary to use functions to break these components apart from one another in a TIMESTAMP column. For example, SELECT DATE(NEW_TS) FROM T1; SELECT TIME(NEW_TS) FROM T1; The DATE() and TIME() functions will return only the date and time components of a timestamp column or expression. This is a bit more complicated than simply selecting two columns that are already in the right format. And you will lose the microseconds component of the timestamp value if you convert it using the TIME() function. Storage Requirements The combination of DATE and TIME columns requires 7 bytes of storage. A DATE column will use 4 bytes and a TIME column will use 3 bytes of storage. A TIMESTAMP column always requires 10 bytes of storage. This is true even though the external representation of a timestamp uses 26 bytes, the external representation of time data uses 8 bytes and the external representation of a date uses 10 bytes. So, using the combination of DATE and TIME columns will save space. Storage requirements usually should not dictate database design issues,
though. It is more important to ensure that the database design will satisfy the
requirements of the application and users of the data. Therefore criteria such
as precision and ease of use are usually more important than the amount of
storage required. However, in certain circumstances when there are severe
storage shortages at your shop, a savings of 3 bytes per row might be an
influencing factor in your database implementations. Accuracy The TIMESTAMP approach will provide greater time accuracy than the DATE and
TIME approach. The DB2 TIMESTAMP data type stores time information to the
microsecond level. The TIME data type provides accuracy only to the seconds. So,
if precision is important, use TIMESTAMP. Use TIME if you want to ensure that
the actual time is NOT stored down to the microsecond level. Date and Time Arithmetic There are additional considerations if you need to perform date and time arithmetic on the columns. Date and time arithmetic may be easier to implement using TIMESTAMP data. This is the case because you might be able to get by with a single calculation using TIMESTAMP data versus a combination of DATE and TIME calculations that will then need to be combined. Subtracting one TIMESTAMP from another results in a timestamp duration. A timestamp duration is a number represented in DECIMAL(20,6) format. It consists of a number of years, months, days, hours, minutes, seconds, and microseconds. The number will be expressed in the following format: yyyyxxddhhmmsszzzzzz where yyyy is the number of years, xx is the number of months, dd is the number of days, hh is the number of hours, mm is the number of minutes, ss is the number of seconds and zzzzzz is the number of microseconds. For example: 2000-07-31-10.30.15.610208 – 2000-06-20-09.25.08.508928 = 00000111010507.101280 The result indicates that there were 0000 years, 01 month, 11 days, 01 hour, 05 minutes, 07 seconds, and 101280 microseconds between the two timestamps. To calculate a duration using DATE and TIME columns, two subtraction operations must occur: one for the DATE column and one for the TIME column. This will result in a date duration and a time duration that may need to be combined together (depending on the further processing required by the application for the date and time durations). To be clear, let’s define date and time durations. A date duration is a DECIMAL(8,0) that consists of a number of years, months, and days. It is of the format: yyyymmdd with yyyy representing the number of years, mm the number of months, and dd the number of days. A time duration is a DECIMAL(6,0) number of the format: hhmmss with hh representing the number of hours, mm the number of minutes, and ss the number of seconds. Formatting If formatting the date and time data is a major consideration for your application then DATE and TIME may be a better solution. DB2 provides formatting options for DATE and TIME columns via local DATE and TIME exits. This means that you can specify a DB2 subsystem-wide default for the way that date and time values should be displayed. Likewise the CHAR function and DATE and TIME precompiler options exist for formatting DATE and TIME data. If the date and time information is to be extracted and displayed on a report or by an online application, the availability of these DB2-provided facilities for DATE and TIME columns should be considered when making your decision. These facilities are not available for TIMESTAMP columns. There is a single format for a timestamp value in DB2, and it looks like the sample shown previously. A timestamp value always begins with the date component, in the format YYYY-MM-DD combined using a hyphen ("-") to the time component in the format: hh.mm.ss.zzzzzz. Of course, you can use the DATE() and TIME() functions to separate the components of the timestamp value and then display them the same as DATE and TIME columns. Of course, once again, you will lose the microseconds component of the timestamp value when you convert it using the TIME() function. Primary Key Considerations Sometimes DB2 databases rely on timestamps to automatically generate primary key values. This is a useful tactic when there is no obvious primary key or when multiple column primary keys would be difficult to implement effectively. To use timestamps for primary key values you can simply create a column and assign it the TIMESTAMP data type with the specification NOT NULL WITH DEFAULT. When inserting rows do not provide a value for the column and DB2 will automatically assign the column a value equal to the current timestamp at the time the row is inserted. This approach can work well unless you are inserting new values so rapidly that the two rows are inserted at the same exact time (down to exactly the same microsecond). Be sure to provide retry logic in your application program in case duplicate key values are inserted. Check for SQLCODE –803, and if received, simply retry the INSERT. It is possible to get duplicate key values even when the insert rate is low if two transactions try to insert to the table at exactly the same time. The DATE and TIME approach does not work well for primary keys. Since TIME value provide accuracy only to seconds, the frequency of duplicate inserts would be much, much greater. Therefore, do not use the DATE and TIME approach if you are going to use the date/time values for primary keys. Use a TIMESTAMP column instead. As of DB2 V7 (and with a PTF, DB2 V6) you can use the new IDENTITY feature to automatically generate sequential values for primary keys. This approach is even better than the TIMESTAMP approach and should be used if it is available on your DB2 subsystem. It is better because DB2 will automatically generate a unique number and you will not need to worry about duplicates. It should also be more efficient than the TIMESTAMP approach. Summary So, there is no clear-cut answer as to which is better: TIMESTAMP or a combination of DATE and TIME columns. You will need to examine your application design requirements and modify your database design accordingly. Favor DATE and TIME if storage is at a premium, you only require time accuracy to seconds, and you have strict date and time formatting requirements. However, if your application requires complicated date and time arithmetic, time accuracy greater than seconds, or automatic timestamp primary key values, then favor the TIMESTAMP solution. If your application does not fall into either of these broad categories, then favor TIMESTAMP over the DATE and TIME solution because it is usually simpler to implement.
From DB2 Update (Xephon) December 2000. © 2000 Craig S. Mullins, All rights reserved. |