Craig S. Mullins Database Performance Management
December 1999
XML Marks The Spot

By Craig S. Mullins

XML is getting a lot of publicity these days. If you believe everything you read, then XML is going to solve all of our interoperability problems, completely replace SQL, and possibly even deliver world peace. Okay, that last one is an exaggeration, but you get the point.

In actuality, XML stands for eXtensible Markup Language. The need for extensibility, structure, and validation is the basis for the evolution of the web towards XML. XML, like HTML, is based upon SGML (Standard Generalized Markup Language), which allows documents to be self-describing through the specification of tag sets and the structural relationships between the tags. HTML is a small, specifically defined set of tags and attributes, which lets users bypass the self-describing aspect of a document. XML, on the other hand, retains the key SGML advantage of self-description while avoiding the complexity of full-blown SGML.

So What?

XML allows users to define tags that
describe the data in the document. This capability provides users a
means to describe the structure and nature of the data in the
document. In essence, the document becomes self-describing.

The simple syntax of XML makes it easy to process by machine while remaining understandable to humans. HTML uses tags to describe the appearance of data on a page. For example, the tag “<b> text </b>” specifies that the “text” data should appear in boldface. XML uses tags to describe the data itself, instead of its appearance. For example, consider the following XML describing a customer address:

<CUSTOMER>

XML is actually a meta-language for defining other markup
languages. The vocabularies of these languages are defined in dictionaries called Document Type Definitions (DTDs). A DTD stores definitions of tags for a specific industry or field of knowledge.

So, the meaning of a tag must be defined in a DTD, which is introduced by a document type declaration, such as:

<!DOCTYPE CUSTOMER [

The DTD for an XML document can be either part
of the document or stored in an external file. The XML code samples shown are meant to be examples only. By examining them you can quickly see how the document itself describes its contents. For data management professionals, this is beneficial because it removes the trouble of trying to track down the meaning of data elements. One of the biggest problems associated with database management and processing is tracking down and maintaining the meaning of stored data. If the data can be stored in documents using XML, the documents themselves will describe their data content.

The important thing to remember about XML is
that it solves a different problem than HTML. HTML is a markup language, but XML is a meta-language. In other words, XML is a language for defining other languages. The idea is to use XML to define a language specifically tailored for each requirement you encounter. You must understand this paradigm shift to understand the power of XML.

Some Skepticism

However,
there are some problems with XML. For example, standard web browsers
do not currently understand the descriptive tags. This problem will be
alleviated in time as XML-capable web browsers come to market. Another
problem with XML is not really the fault of XML, but of market hype.
There is a lot of confusion surrounding XML in the industry. Some
folks believe that XML will provide metadata where none currently
exists or that XML will replace SQL as a data access method for
relational data. Neither of these assertions is true.

There is no way that any technology, XML included, can conjure up information that does not exist. Humans must create the metadata tags in XML for the data to be described. XML enables self-describing documents. It does not describe your data for you.

And
XML does not do what SQL does. Hence, XML cannot replace SQL. SQL is
the standard access method for relational data. It is used to
“tell” a relational DBMS what data is to be retrieved. XML is a document description language. It describes the contents of data. XML may be useful for defining databases, but not for accessing them.

Summary
But skepticism aside, XML is definitely the wave of the immediate future. The future of the web will be defined using XML. The benefits of self-describing documents are just too many for XML to be ignored. Furthermore, being able to use XML to generate an application-specific language is powerful. This capability will drive XML to the forefront of computing.
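
To make the distinction between describing data and accessing it more concrete, here is a minimal Python sketch using only the standard library. Everything in it is an assumption made for illustration: the customer document, its child tag names, and the table layout are hypothetical and are not the article's original samples. The first half shows that the tags alone tell a program what each value means. The second half shows that retrieving a subset of the data is still a job for SQL.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical customer document. The tag names are illustrative
# assumptions, not a reconstruction of the article's original listing.
CUSTOMER_XML = """
<CUSTOMER>
  <NAME>Jane Doe</NAME>
  <ADDRESS>
    <STREET>100 Main Street</STREET>
    <CITY>Springfield</CITY>
    <STATE>IL</STATE>
    <ZIP>62701</ZIP>
  </ADDRESS>
</CUSTOMER>
"""


def describe(element, path=""):
    """Return (tag path, text) pairs for every leaf element.

    The tag path is the "self-description": it names each value
    without any outside data dictionary.
    """
    here = f"{path}/{element.tag}"
    children = list(element)
    if not children:
        return [(here, (element.text or "").strip())]
    pairs = []
    for child in children:
        pairs.extend(describe(child, here))
    return pairs


def load_customer(conn, root):
    """Copy a few values from the parsed document into a relational table."""
    conn.execute("CREATE TABLE customer (name TEXT, city TEXT, state TEXT)")
    conn.execute(
        "INSERT INTO customer VALUES (?, ?, ?)",
        (
            root.findtext("NAME"),
            root.findtext("ADDRESS/CITY"),
            root.findtext("ADDRESS/STATE"),
        ),
    )


# XML side: the document describes its own contents.
root = ET.fromstring(CUSTOMER_XML)
fields = describe(root)
for tag_path, value in fields:
    print(f"{tag_path}: {value}")

# SQL side: the query tells the DBMS which data to retrieve.
conn = sqlite3.connect(":memory:")
load_customer(conn, root)
rows = conn.execute(
    "SELECT name FROM customer WHERE state = ?", ("IL",)
).fetchall()
print(rows)
```

Note that nothing in the XML half selects or filters data, and nothing in the SQL half explains what a column means beyond its name. The two technologies complement each other in this sketch rather than compete.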
From Database Trends, December 1999.