
The Roots of SGML -- A Personal Recollection (Part 2)

Dr. Charles F. Goldfarb, the father of XML technology, continues his perspective on the roots of generalized markup.

By Dr. Charles F. Goldfarb


IBM's Document Composition Facility: Industrial-Strength GML

In 1975 I took a position as a market planner for IBM's printer products in San Jose, CA. The move accomplished two long-held goals: Linda got to give our sons' snowsuits to charity, and I got a chance to prove there was a business case for a GML-based document composition product. The product was officially called the "Document Composition Facility" (DCF), but everyone called it "Script". It was derived from the language, designed by Stewart Madnick in the late 1960s, that was used in the Integrated Text Processing project.

I developed a cost-justification model, based on market surveys and case studies, that showed the enormous value of generalized markup over the procedural markup that was common at the time. On the strength of this, GML support was added to Script. Geoff Bartlett developed a macro language with built-in GML functions, including controls for delimiter assignment and association of element types with processing procedures.

Peter Huckle, DCF's Chief Programmer, designed and implemented a notable "starter set" application, the precursor of the "General Document" in ISO 8879. The implementation was done entirely in the macro language, which was also available to the product's users. The application design was driven by the needs of IBM publishing, as chiefly articulated by Truly Donovan, the first professional document type designer. Truly was also the leader of what was surely the first multi-site, multinational, generic markup project.

Here's a markup example:

:h1.Chapter 1: Introduction
:p.GML supported hierarchical containers, such as
:ol
:li.Ordered lists (like this one),
:li.Unordered lists, and
:li.Definition lists
:eol.
as well as simple structures.
:p.Markup minimization (later generalized and formalized in SGML), allowed the end-tags to be omitted for the "h1" and "p" elements.
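
The end-tag omission described in the example can be illustrated with a minimal Python sketch that makes the implied end-tags explicit. Everything in it is an assumption for illustration rather than DCF's actual implementation: the function name `normalize`, the small element set, the `CLOSED_BY` rules that decide which start-tag implicitly ends which open element, the treatment of `:e` plus an element name as an end-tag, and the angle-bracket notation used only to display the result. The sample text is written in the style of the example, with an explicit `:ol` ... `:eol.` list wrapper.

```python
import re

# Assumptions for this sketch only (not taken from DCF or ISO 8879):
ELEMENTS = {"h1", "p", "ol", "li"}
# An open element (key) is implicitly ended by a start-tag of any type in its set.
CLOSED_BY = {"h1": {"h1", "p"}, "p": {"h1", "p"}, "li": {"li"}}
# A tag is ":" + name at the start of a line, optionally followed by "." and text.
TAG = re.compile(r"^:([a-z][a-z0-9]*)\.?(.*)$")


def normalize(source):
    """Return a flat list of events with every omitted end-tag made explicit."""
    events, stack = [], []

    def close_top():
        events.append("</%s>" % stack.pop())

    for line in source.strip().splitlines():
        if not line.strip():
            continue  # ignore blank lines
        match = TAG.match(line)
        if not match:
            events.append(line)  # plain text continues the open element
            continue
        name, text = match.groups()
        if name in ELEMENTS:
            # Start-tag: first end any open elements that this tag implies closed.
            while stack and name in CLOSED_BY.get(stack[-1], ()):
                close_top()
            stack.append(name)
            events.append("<%s>" % name)
        elif name[0] == "e" and name[1:] in ELEMENTS:
            # Explicit end-tag: end anything still open inside it, then the element.
            target = name[1:]
            if target not in stack:
                raise ValueError("end-tag without matching start-tag: " + line)
            while stack[-1] != target:
                close_top()
            close_top()
        else:
            raise ValueError("unknown tag: " + line)
        if text:
            events.append(text)
    while stack:  # end of document ends whatever is still open
        close_top()
    return events


SAMPLE = """
:h1.Chapter 1: Introduction
:p.GML supported hierarchical containers, such as
:ol
:li.Ordered lists (like this one),
:li.Unordered lists, and
:li.Definition lists
:eol.
as well as simple structures.
:p.Markup minimization allowed end-tags to be omitted.
"""

if __name__ == "__main__":
    print("\n".join(normalize(SAMPLE)))
```

In this sketch the first paragraph stays open across the list, so the text after `:eol.` still belongs to it; only the next `:p` start-tag ends it. A real processor would take such rules from the document type description rather than from a hard-coded table.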

The DCF GML User's Guide (IBM SH20-9160), which I wrote in 1978, includes the first published formal document type "descriptions" (DTDs), for this "General Document" and also for a "GML Markup Guide" document type. The General Document example, except for the delimiter strings, should look very familiar. It was not only the source for the homonymous DTD in ISO 8879, but also, thanks to Anders Berglund's championing of DCF at CERN, it was the source for the World Wide Web's HTML document type as well. The User's Guide itself became the first working paper of the ANSI SGML committee (X3J6/78/33-01).

Before DCF, sophisticated GML applications existed only in a research environment. DCF was a commercial product, subject to all the constraints of what was then the largest and highest-quality software development organization in the world. And it was designed to support the requirements of the world's second-largest publisher. Though not technical in nature, these considerations proved vital for SGML. The World Wide Web, for example, succeeded commercially while many nobler, more technically interesting hypermedia systems proved only of academic interest, because of the Web's artful compromise in connecting technology to the needs of a real user community. DCF and GML succeeded for the same reason. Chuck Cooper was the product planner who made that vital connection for DCF.

DCF/GML, which is still widely used today, has probably produced more pages of output than any other single generalized markup product. It established beyond doubt the viability of generalized markup, and initiated the major change (still going on) in the way that large enterprises view their document assets. The SGML community owes a real debt to IBM and to the many talented and dedicated (present and former) IBM people who made it possible.

Conclusion: 30 Years of Generalized Markup

This memoir has focused on the roots of SGML: the people and activities that directly influenced the invention of the language and, ultimately, the development of the standard (two very different things). Those roots were solidly planted in the industrial sector, but it is worth noting that there were other descriptive markup activities going on in the academic world.

Brian Reid's Scribe system, for example, begun at Carnegie-Mellon in 1976, had independently arrived at several of the key concepts of GML, though many years later. Brian, however, personally influenced SGML by encouraging me to write "A Generalized Approach to Document Markup" for SIGPLAN Notices in June 1981. That paper eventually became -- after a global change from "GML" to "SGML" -- Annex A of ISO 8879.

I like to think of the history of SGML as -- what else -- a tree structure. One root -- from Rice to GML to my basic SGML invention -- is joined at the base of the trunk by the other -- Tunnicliffe to Scharpf and GenCode. The trunk, of course, is the extraordinary 8-year effort to develop ISO 8879, involving hundreds of people from all over the world. The products and tools that came after are the branches, the many applications the leaves, and they are all still growing.

And for all these 30 years, while the technologies of both computers and publishing have undergone overwhelming and unpredictable changes, the tree has continued to bear the fruit that I described in 1971:

The principle of separating document description from application function makes it possible to describe the attributes common to all documents of the same type. ... [The] availability of such 'type descriptions' could add new function to the text processing system. Programs could supply markup for an incomplete document, or interactively prompt a user in the entry of a document by displaying the markup. A generalized markup language then, would permit full information about a document to be preserved, regardless of the way the document is used or represented.


Copyright 2001, Dr. Charles F. Goldfarb

