
Efficient XML Interchange tackles data verbosity

Growing out of the W3C's work on binary XML, Efficient XML Interchange compresses the notoriously verbose data format and makes it more efficient to process.

An XML working group has been quietly laboring away under the auspices of the W3C since December 2005, and will continue through the end of this year. Its subject is Efficient XML Interchange, inevitably abbreviated as EXI, which is part of the XML Binary Characterization work. EXI seeks to define a compact and portable format for binary-encoded (rather than character-encoded) XML content. That matters because a long-standing Achilles' heel of XML has been its verbose, if not downright prolix, encapsulation of the data and markup it seeks so rigorously to convey.

In July of this year (2007), the work of this group became something more than one of many works in progress at the W3C when the first public working draft of the EXI Format Specification was released. At around the same time, a "measurements note" was also released (and most recently updated on July 25). In keeping with work underway at the parallel XML Binary Characterization Working Group, the note reports three kinds of measurements: compactness (the size of in-memory and other stored representations of the format), processing efficiency (the speed with which the format can be generated or consumed for processing, as compared to text-based XML), and roundtrip support (the ability to convert XML to the format, and then to convert back to XML with output equivalent to the original input).
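To make those three measurements concrete, here is a minimal sketch in Python. It is not the working group's test harness and it does not produce EXI: the standard library's zlib stands in for a binary XML encoder, and the sample document and the measure function are hypothetical names invented for illustration.

```python
import time
import zlib
import xml.etree.ElementTree as ET

# Hypothetical sample document; any well-formed XML would do.
SAMPLE_XML = (
    "<orders>"
    + "".join(
        f"<order id='{i}'><customer>Customer {i}</customer>"
        f"<quantity>{i % 7}</quantity></order>"
        for i in range(200)
    )
    + "</orders>"
)


def measure(xml_text):
    """Report compactness, timing and roundtrip fidelity for one document.

    zlib is only a stand-in here; a real test would call an EXI codec.
    """
    raw = xml_text.encode("utf-8")

    start = time.perf_counter()
    encoded = zlib.compress(raw)        # "convert XML to the format"
    decoded = zlib.decompress(encoded)  # "convert back to XML"
    elapsed = time.perf_counter() - start

    # Roundtrip support: compare canonical forms, so differences that do
    # not change the XML's meaning (such as quote style) are ignored.
    same = ET.canonicalize(xml_text) == ET.canonicalize(decoded.decode("utf-8"))

    return {
        "compactness": len(encoded) / len(raw),  # smaller is better
        "seconds": elapsed,
        "roundtrip_ok": same,
    }


if __name__ == "__main__":
    print(measure(SAMPLE_XML))
```

A generic compressor such as zlib gets its savings from repetition in the text alone; it knows nothing about XML structure or schemas, which is the gap a purpose-built binary format aims to close.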

For the entire, incredibly detailed story, please consult the measurements note cited in the preceding paragraph. To shorten this saga as much as possible: Efficient XML did well in the testing, in large part because it is schema-aware. It can use information from one or more schemas to compress and decompress XML data, and to speed conversion between text- and binary-based representations. For some interesting insights into the whys and wherefores, see Santiago Pericas-Geertsen's blog on the EXI draft, which explains what gives EXI an edge over Fast Infoset, its primary competitor for representing XML in binary form.
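Why schema awareness helps can be shown with a toy example. The sketch below is not the EXI format and makes no claim about how EXI lays out its bits; it only illustrates the general idea that when sender and receiver share schema knowledge in advance, element names never need to travel with the data. The tag list, the byte layout and the encode and decode functions are all assumptions made for illustration, and the toy ignores attributes and handles at most 255 children and 255 bytes of text per element.

```python
import xml.etree.ElementTree as ET

# Hypothetical "schema": the element names both sides agree on in advance.
SCHEMA_TAGS = ["order", "customer", "quantity"]


def encode(elem, out):
    """Append a toy binary form of elem (and its children) to out."""
    text = (elem.text or "").encode("utf-8")
    children = list(elem)
    out.append(SCHEMA_TAGS.index(elem.tag))  # one byte replaces the tag name
    out.append(len(children))                # child count, one byte
    out.append(len(text))                    # text length, one byte
    out.extend(text)
    for child in children:
        encode(child, out)
    return out


def decode(data, pos=0):
    """Rebuild an element from data starting at pos; return (element, next_pos)."""
    tag = SCHEMA_TAGS[data[pos]]             # look the name up in the shared list
    n_children = data[pos + 1]
    text_len = data[pos + 2]
    pos += 3
    elem = ET.Element(tag)
    elem.text = data[pos:pos + text_len].decode("utf-8") or None
    pos += text_len
    for _ in range(n_children):
        child, pos = decode(data, pos)
        elem.append(child)
    return elem, pos


if __name__ == "__main__":
    xml_text = "<order><customer>Ada</customer><quantity>3</quantity></order>"
    encoded = encode(ET.fromstring(xml_text), bytearray())
    decoded, _ = decode(bytes(encoded))
    print(len(xml_text), "characters of text XML ->", len(encoded), "bytes")
    print(ET.tostring(decoded, encoding="unicode"))
```

Most of the text form of this small document is markup, so replacing each tag name with an index into a shared list removes most of its bulk, and the decoder can still reproduce the original text exactly.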

As to why developers should care about binary representations for XML, the question could easily be recast as why they should care about binary representations for any of the code they write. The same factors come into play: a small resource footprint, optimized processing and transmission of data, and more efficient engines to handle XML when it is in binary format. (It is also worth observing that XML is always in binary format of one kind or another when it lives on a computer anyway.)

This may not be a technology that many developers need to confront directly, but once a standard and vetted implementation becomes available, it will surely have a significant impact on how XML is produced, consumed and stored. Hence, this material is well worth investigating.

About the author

Ed Tittel is a full-time writer and trainer whose interests include XML and development topics, along with IT certification and information security topics. Among his many XML projects are XML For Dummies, 4th edition (Wiley, 2005) and Schaum's Easy Outline of XML (McGraw-Hill, 2004). E-mail Ed with comments, questions or suggested topics or tools for review.
