XML Developer Tip
Try Bill's XML fragment reader
Anybody who deals with XML data on a regular basis must also often deal with chunks of XML data that are not available as complete, well-formed XML documents. Most off-the-shelf XML parsers need complete documents, so tools that can read incomplete documents containing XML-formatted data or records can be extremely handy. Such a tool might, for example, support an interim analysis of survey data while the survey is still underway and the resulting XML document file cannot yet be closed or completed.
My colleague, co-author, and XML programming whiz, Dr. Bill Brogden, has developed a tool that lets him read and interpret streams of XML data that aren't complete or fully fleshed out. This solution is covered in Bill's recent article for XML.com, entitled "An XML Fragment Reader". Essentially, Bill's tool permits multiple XML-formatted character streams to be combined and the result fed to an XML parser. He builds on the XML parser classes included in the Java 1.4 SDK, in which an InputSource object from the org.xml.sax package can feed a character stream to either a SAX or a DOM parser.
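As a minimal sketch of the InputSource mechanism described above (not code from Bill's article), the following hands a character stream to a SAX parser and counts the elements it reports. The class name ReaderToSaxSketch, the countElements helper, and the sample survey markup are assumptions made for illustration; a StringReader stands in for any Reader.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; the name is not from the original article.
class ReaderToSaxSketch {

    /** Parses whatever the Reader supplies and returns the number of start tags seen. */
    static int countElements(Reader chars) throws Exception {
        final int[] count = {0};
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        // InputSource wraps the character stream so the parser can pull from it.
        parser.parse(new InputSource(chars), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample data: one root element with two child records.
        String xml = "<survey><response id=\"1\"/><response id=\"2\"/></survey>";
        System.out.println("elements seen: " + countElements(new StringReader(xml)));
    }
}
```

Because the parser only sees a Reader, the characters can come from a string, a file, or a custom Reader subclass that stitches several sources together.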
By creating a class that extends Reader, the basic Java class for streams of characters, Bill can supply characters to the parser from various character-stream sources. His example, a class called XMLfragmentReader, uses String and File objects, but any object that can create a Reader will do the job. The class performs a couple of interesting tricks we've grown to appreciate in the last few years of writing XML-based applications:
- When any of the class's reading methods finds that the current source of input is exhausted, it calls a nextReader method to open the next pending source. If no sources are pending, it has finished reading, so it cleans up and shuts down.
- Any time a method reads character data, it looks for end-of-line characters and maintains a line count as it reads the data. This makes it much easier to associate parsing errors with specific lines of input, and really helps debugging.
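The two tricks above can be sketched roughly as follows. This is a simplified stand-in written for this tip, not Bill's XMLfragmentReader: the class name ChainedLineCountingReader, the readAll helper, and the sample fragments are assumptions. Only the nextReader method name comes from the description above. The sketch counts only '\n' characters, so a bare carriage return would not be counted as a line ending.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the article's XMLfragmentReader; not Bill's actual code.
class ChainedLineCountingReader extends Reader {
    private final Deque<Reader> pending = new ArrayDeque<>();
    private Reader current;
    private int lineCount = 0;

    ChainedLineCountingReader(Reader... sources) {
        for (Reader r : sources) pending.add(r);
    }

    /** Closes the exhausted source and opens the next one; false when none remain. */
    private boolean nextReader() throws IOException {
        if (current != null) current.close();
        current = pending.poll();
        return current != null;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;
        if (current == null && !nextReader()) return -1;
        while (true) {
            int n = current.read(buf, off, len);
            if (n != -1) {
                // Maintain the line count as characters pass through.
                for (int i = off; i < off + n; i++) {
                    if (buf[i] == '\n') lineCount++;
                }
                return n;
            }
            // Current source is exhausted: move on, or report end of stream.
            if (!nextReader()) return -1;
        }
    }

    @Override
    public void close() throws IOException {
        if (current != null) current.close();
        current = null;
        for (Reader r : pending) r.close();
        pending.clear();
    }

    int getLineCount() { return lineCount; }

    /** Drains a Reader into a String, then closes it. */
    static String readAll(Reader in) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] buf = new char[256];
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) out.append(buf, 0, n);
        in.close();
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample: a root element wrapped around two record fragments.
        ChainedLineCountingReader r = new ChainedLineCountingReader(
            new StringReader("<survey>\n"),
            new StringReader("<response id=\"1\"/>\n"),
            new StringReader("<response id=\"2\"/>\n"),
            new StringReader("</survey>"));
        System.out.println(readAll(r));
        System.out.println("line endings seen: " + r.getLineCount());
    }
}
```

Supplying the opening and closing root tags as extra sources is one way to turn a run of record fragments into something a standard parser will accept as a complete document.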
Bill performs some equally interesting tricks with event handling: he extends the DefaultHandler class in the org.xml.sax.helpers package to report problems that the parser encounters as it works through the various character streams. He also provides the capability to create a complete document from the XML fragments in the incoming streams, which makes it easy to pass output from the fragment reader to other software for post-processing.
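To illustrate the kind of error reporting a DefaultHandler subclass makes possible, here is a small sketch under our own assumptions. It is not Bill's handler: the class name ProblemCollectingHandler, the check helper, and the message format are invented for this example. It records each problem the parser reports together with the line number carried by the SAXParseException.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration; not the handler from Bill's article.
class ProblemCollectingHandler extends DefaultHandler {
    final List<String> problems = new ArrayList<>();

    private void record(String level, SAXParseException e) {
        problems.add(level + " at line " + e.getLineNumber() + ": " + e.getMessage());
    }

    @Override
    public void warning(SAXParseException e) { record("warning", e); }

    @Override
    public void error(SAXParseException e) { record("error", e); }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        record("fatal", e);
        throw e; // well-formedness errors are not recoverable, so stop parsing
    }

    /** Parses the stream and returns every problem the parser reported. */
    static List<String> check(Reader chars) throws Exception {
        ProblemCollectingHandler handler = new ProblemCollectingHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(chars), handler);
        } catch (SAXException e) {
            // The fatal error was already recorded by fatalError above.
        }
        return handler.problems;
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample: the <b> element is never closed.
        for (String p : check(new StringReader("<a>\n<b>\n</a>"))) {
            System.out.println(p);
        }
    }
}
```

When the characters come from several stitched-together sources, a line count kept by the reader itself can complement the parser's own line numbers when tracking a problem back to its origin.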
All in all, XML developers who write applications that produce growing accumulations of data will find this tool helpful in working with that data while the data collection process is still underway. Because we've found it useful in our own work, we think you'll probably find it worthwhile as well.
About the Author
Ed Tittel is a 20-plus year veteran of the computing industry, who's worked as a programmer, manager, systems engineer, instructor, writer, trainer, and consultant. He's also the series editor of Que Certification's Exam Cram 2 and Training Guide series, and writes and teaches regularly on Web markup languages and related topics.