Modularization of XHTML in XML Schema
Just in time for Christmas (on December 19, 2001), the W3C released a working draft that recasts the XHTML Modularization specification from its original DTD basis into a version based on XML Schema. True, this working draft is not a Christmas gift that most children (or, to be honest, many adults other than die-hard XML Schema enthusiasts) would dream of fondly. But the document offers a fascinating opportunity to explore, at just about any conceivable level, the differences between XML document descriptions based on DTDs and those based on XML Schema.
In my brief perusal of this document -- and while I may not qualify as an XML Schema enthusiast, I confess quite cheerfully to being an "interested party" -- I've observed lots of useful ways to compare DTD-based document definitions with those based on XML Schema, and they may be of interest to other readers as well. Here are a few of my most important observations:
- Compactness and intelligibility: While DTDs clearly win the comparison based purely on the number of characters required to define document elements, content models, and so forth, anyone with a knowledge of XML can read and understand XML Schema definitions. The same is not true for DTDs, which also require at least some working knowledge of SGML (and to be frank, more knowledge usually improves the level of understanding).
- Convenience: A toss-up. There's no denying the convenience that ubiquitous use of parameter entities provides when building DTDs, but the ease and power of creating datatypes and imposing value constraints on data in XML Schema is an equal and entirely offsetting advantage (IMHO, anyway).
- Strong typing: Computer language experts have strong opinions on the merits of strong typing mechanisms, and I have no desire to relaunch earlier "holy wars" in other venues. Suffice it to say on this point that XML Schema's strong typing mechanisms make defining and typing document elements a breeze, and add welcome structure and controls to related XML documents. DTDs really have no equivalent capability.
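To make the first and third observations a little more concrete, here is a minimal sketch in Python. Both fragments are hypothetical illustrations written for this article, not excerpts from the W3C working draft, and the names `Inline.mix` and `Align` are assumptions chosen only for the example. The sketch shows that the DTD version is shorter, that the XML Schema version can be read with an ordinary XML parser (while the DTD fragment cannot), and that the schema carries a named type whose permitted values can be pulled out with standard XML tooling.

```python
import xml.etree.ElementTree as ET

# Hypothetical DTD fragment: a parameter entity reused in a content model,
# plus an attribute restricted to three enumerated values.
DTD_FRAGMENT = """
<!ENTITY % Inline.mix "#PCDATA | em | strong">
<!ELEMENT p (%Inline.mix;)*>
<!ATTLIST p align (left | center | right) #IMPLIED>
"""

# Hypothetical XML Schema fragment with roughly the same intent:
# a named model group, a named simple type, and an element that uses both.
XSD_FRAGMENT = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:group name="Inline.mix">
    <xs:choice>
      <xs:element ref="em"/>
      <xs:element ref="strong"/>
    </xs:choice>
  </xs:group>
  <xs:simpleType name="Align">
    <xs:restriction base="xs:NMTOKEN">
      <xs:enumeration value="left"/>
      <xs:enumeration value="center"/>
      <xs:enumeration value="right"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="p">
    <xs:complexType mixed="true">
      <xs:group ref="Inline.mix" minOccurs="0" maxOccurs="unbounded"/>
      <xs:attribute name="align" type="Align"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="em" type="xs:string"/>
  <xs:element name="strong" type="xs:string"/>
</xs:schema>
"""

# Clark-notation prefix for elements in the XML Schema namespace.
XS = "{http://www.w3.org/2001/XMLSchema}"


def is_well_formed_xml(text):
    """Return True if the text parses as a standalone XML document."""
    try:
        ET.fromstring(text.strip())
        return True
    except ET.ParseError:
        return False


def enumerated_values(schema_text, type_name):
    """List the enumeration facet values of a named simple type."""
    root = ET.fromstring(schema_text.strip())
    for simple_type in root.iter(XS + "simpleType"):
        if simple_type.get("name") == type_name:
            return [e.get("value") for e in simple_type.iter(XS + "enumeration")]
    return []


if __name__ == "__main__":
    print("DTD characters:", len(DTD_FRAGMENT.strip()))
    print("XSD characters:", len(XSD_FRAGMENT.strip()))
    print("DTD parses as XML:", is_well_formed_xml(DTD_FRAGMENT))
    print("XSD parses as XML:", is_well_formed_xml(XSD_FRAGMENT))
    print("Values allowed by Align:", enumerated_values(XSD_FRAGMENT, "Align"))
```

Note that Python's standard library only parses the schema here; it does not validate documents against it. Actual validation would need a schema-aware processor, which is outside the scope of this sketch.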
Some of the best information in this document appears in section 2, entitled "Schema Modularization Framework." Here, you'll find the most direct information on how DTD constructs and definition techniques map into equivalent or near-equivalent XML Schema constructs and definition techniques. The "Mapping Summary" subsection is particularly useful because it quickly and clearly explains how DTD entities and elements map into Schema equivalents.
Although this document is not an especially quick or easy read, if you're interested in understanding the differences between XML Schema and SGML DTDs or simply want to understand how XML Schema works in defining a well-known XML application (XHTML, that is), you'll find time spent with this document well repaid. Enjoy.
Have questions, comments, or feedback about this or other XML-related topics? Please e-mail me in care of firstname.lastname@example.org; I'm always glad to hear from my readers!
Ed Tittel is a principal at LANWrights, Inc., a wholly owned subsidiary of LeapIt.com. LANWrights offers training, writing, and consulting services on Internet, networking, and Web topics (including XML and XHTML), plus various IT certifications (Microsoft, Sun/Java, and Prosoft/CIW).