XML and databases
XML is a standard language used to create extensible data structures (XML schemas) and documents compatible with these structures (XML documents). The scope for applying XML is almost unlimited: it can be used anywhere that data is found - that is, anywhere in an information system, and particularly in databases.
Databases play an essential role within information systems. They fall into three main categories: document, object, and relational databases. After the advent of XML technology came the arrival of native XML databases, along with a new topic of debate: should there be a new type of database to manage XML?
In this article we look at the key points that determine the role of XML with regard to databases, and identify the cases in which acquiring a native XML database makes sense.
It is useful to approach this issue from the point of view of the type of data that an application requires. This makes it easier to analyze the context in which XML is used and whether a native XML database is appropriate. To do this, we need to distinguish between the data-oriented approach and the document-oriented approach.
A data-oriented XML file follows a precise structure, with fine-grained elements, which are often subject to strict constraints. These elements might be invoices, accounting entries or orders, for example. For an application handling this type of data, the advantage of representing the data in XML is that this will facilitate communication with other applications.
However, conversions must be performed between the internal format of the data (relational, object, etc.) and the hierarchical structure of the XML file used. The life cycle of the XML file is therefore limited to the transfer time between applications. Using XML as a storage format offers little benefit in this case, and can even compromise the design and architecture of applications.
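As an illustration of this kind of conversion, here is a minimal sketch in Python that serializes relational rows into a hierarchical XML file and reads them back on the receiving side. The `orders` table, its columns, and the element names are hypothetical and chosen only for the example; the frameworks used in practice do considerably more (schema mapping, validation, type handling).

```python
import sqlite3
import xml.etree.ElementTree as ET

def orders_to_xml(conn):
    """Serialize rows of a hypothetical `orders` table as an XML string."""
    root = ET.Element("orders")
    cursor = conn.execute("SELECT id, customer, amount FROM orders ORDER BY id")
    for order_id, customer, amount in cursor:
        # Each flat row becomes one nested <order> element.
        order = ET.SubElement(root, "order", id=str(order_id))
        ET.SubElement(order, "customer").text = customer
        ET.SubElement(order, "amount").text = f"{amount:.2f}"
    return ET.tostring(root, encoding="unicode")

def xml_to_rows(xml_text):
    """Convert the XML back into flat tuples on the receiving side."""
    root = ET.fromstring(xml_text)
    return [
        (int(o.get("id")), o.findtext("customer"), float(o.findtext("amount")))
        for o in root.findall("order")
    ]

# An in-memory database stands in for the application's internal storage.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Acme", 120.5), (2, "Globex", 75.0)],
)

xml_text = orders_to_xml(conn)   # exists only for the duration of the transfer
rows = xml_to_rows(xml_text)     # receiving application restores its own format
```

The XML text here lives only between the two function calls, which mirrors the short life cycle described above: the data is stored relationally at both ends and XML serves purely as the exchange format.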
There are frameworks on the market that enable object and relational data to be transformed into XML. Some are marketed by database vendors (IBM DB2 XML extender, Sybase ASE or Microsoft SQL Server XML extensions, etc.), while others come from transactional middleware vendors (Exolab Castor JDO, JAXB & JAXP implementations, Apache Axis, BEA WebLogic Workshop, and so on).
Native XML databases also have their advantages for new applications, as they avoid the effort required to convert XML files and can deliver good performance.
A document-oriented XML file has little structure and its elements are more coarse-grained. Unlike a data-oriented file, a document-oriented XML file must be preserved in its original form and has a long life cycle. Its existence may be governed by a workflow. It must be indexed, categorized or even semantically annotated in order to be found and used. It should also be possible to associate access security attributes with its structure.
Content Management Systems (CMS) fulfill these functions particularly well, and offer operational XML extensions. A CMS infrastructure is generally based on a file system, relational or object database or a proprietary tool. This infrastructure is then "hidden" by a framework that provides the notion of document and related aspects (workflow, security, semantics, and so on).
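To make two of these document-oriented requirements concrete (conservation of the original form, and indexing), here is a minimal sketch in Python. The `DocumentStore` class, its method names, and the sample memos are hypothetical; a real CMS adds workflow, security and categorization on top of this kind of core.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

class DocumentStore:
    """Hypothetical store: keeps XML documents verbatim and indexes their text."""

    def __init__(self):
        self.originals = {}            # doc_id -> XML text, exactly as received
        self.index = defaultdict(set)  # lowercase word -> set of doc_ids

    def add(self, doc_id, xml_text):
        # The original text is stored untouched; parsing is only for indexing.
        self.originals[doc_id] = xml_text
        root = ET.fromstring(xml_text)
        text = " ".join(root.itertext()).lower()
        for word in re.findall(r"\w+", text):
            self.index[word].add(doc_id)

    def search(self, word):
        """Return the sorted ids of documents whose text contains `word`."""
        return sorted(self.index.get(word.lower(), set()))

store = DocumentStore()
store.add(
    "memo-1",
    "<memo><title>Budget review</title>"
    "<body>The budget is <b>approved</b>.</body></memo>",
)
store.add("memo-2", "<memo><title>Holiday schedule</title></memo>")

hits = store.search("budget")
```

Note that the index is built from the element text only, so the markup can remain loosely structured without affecting retrieval.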
Native XML databases may also be a solution, offering the advantages of native support for XML document standards (DOM, SAX, XQuery and XPath) and the resulting good performance.
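To give an idea of what querying a document through XPath looks like, here is a small sketch using the limited XPath subset supported by Python's standard `xml.etree.ElementTree` module. It runs against an in-memory document rather than a native XML database, and the `report` document and its `audience` attribute are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document-oriented file: loosely structured, coarse-grained.
DOCUMENT = """
<report>
  <section audience="public"><title>Overview</title><para>Summary text.</para></section>
  <section audience="internal"><title>Costs</title></section>
  <section audience="public"><title>Roadmap</title></section>
</report>
"""

def public_titles(xml_text):
    """Return the titles of sections marked audience='public', in document order."""
    root = ET.fromstring(xml_text)
    # Path expression with an attribute predicate, evaluated from the root element.
    matches = root.findall("./section[@audience='public']/title")
    return [title.text for title in matches]

titles = public_titles(DOCUMENT)
```

A native XML database would evaluate expressions of this kind directly against its stored documents, typically with index support, instead of parsing the whole file on every query as this sketch does.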
So is it worth acquiring a native XML database?
It is not simply the use of the XML language that should guide the choice of database, but the way in which it is used.
Vendors did not wait for XML to appear on the scene in order to implement structured data storage systems. Object and relational databases are comfortably established in this field and do not suffer from any shortcomings that would justify their being replaced. Application servers and database XML extensions enable applications to be opened up via XML interfaces - the notion of Web Services.
The argument of enhanced performance put forward by native XML database vendors is a fragile one, as the stakes are rarely high: applications are often decoupled and communicate in asynchronous mode. Furthermore, although it is valid to argue the importance of avoiding XML conversion stages, this does not appear to justify the overheads and risks incurred by the technological leap required.
Object/XML and relational/XML conversion tools are well positioned to resolve the issue and their impact on architectures is less significant. Indeed, the market is a dynamic one and the productivity of tools is increasing.
The content management issue is also well covered by CMS tools, which offer additional XML extensions. Native XML databases do offer enhanced support for the XQuery and XPath standards, although these are still relatively young. Native XML database vendors also offer workflow and indexing capabilities, but these tend to be less proven than those of CMSs.
Justifying the advantages of a native XML database is not an easy task, given the number of tools that allow XML to be integrated with minimum impact on existing databases. The native XML database market is struggling to take off, and it seems likely to become a niche market.
Copyright 2002 TechMetrix Research. TechMetrix is a technology-oriented analyst firm focused on e-business application development needs. TechMetrix is also backed by its parent company, a European global system integrator - SQLI - with more than 800 developers in the field.