RE: Help with Docbook

Subject: RE: Help with Docbook
From: "Mark Baker" <mbaker -at- ca -dot- stilo -dot- com>
To: "TECHWR-L" <techwr-l -at- lists -dot- raycomm -dot- com>
Date: Thu, 24 Jul 2003 12:19:10 -0400

Jan Henning wrote:

> - HTML is not strict.

In the sense that it is well defined, HTML is perfectly strict. HTML is a
well-defined application of SGML. There is not the slightest ambiguity as to
its structure.

In the sense that it allows a wide variety of different combinations of
elements, HTML is not strict, but neither is DocBook.

> That means that you cannot be sure that any HTML
> you receive follows given formal standards;

Browser error correction features mean that some incorrect HTML displays as
intended, so people do create and distribute non-compliant HTML. However,
you can easily tell if your HTML is correct or not using freely available
SGML validators.

> it also means that you
> cannot be sure how your HTML will be interpreted.

You cannot be sure how any document in any format will be interpreted.
That's up to the processing application.

> XML was specifically
> constructed to avoid that problem, which means that it is much easier
> to construct automated tools for processing XML (and, by extension,
> DocBook).

There is simply no difference between an SGML-based language and an
XML-based language in this regard. If browsers apply the same error correction
behavior to XHTML as they do to HTML, then you will see lots of invalid
XHTML floating around (supposing anybody ever bothers to switch).
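
As a rough illustration (not part of the original exchange), the Python sketch below feeds one malformed fragment to two standard-library parsers. The fragment, the `TagCollector` class, and the helper names are made up for the example, and `html.parser` is only a lenient tokenizer, not a browser or a validator. The point it tries to show is that acceptance or rejection is a property of the processing application, not of the markup language.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Illustrative fragment: <b> and <i> overlap, and <p> is never closed.
BROKEN = "<p>Some <b>bold <i>text</b> here</i>"


class TagCollector(HTMLParser):
    """Lenient consumer: records start tags and never complains."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def lenient_tags(markup):
    """Return the start tags a forgiving parser sees in the markup."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def strict_accepts(markup):
    """Return True only if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


print(lenient_tags(BROKEN))    # the forgiving parser reports tags anyway
print(strict_accepts(BROKEN))  # the strict parser refuses the same input
```

If every widely used consumer behaves like the first function, authors never find out that their markup is wrong, whichever syntax it is written in.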

> - HTML is almost purely concerned with appearance. It does not provide
> for more than the most rudimentary semantics.

Actually, HTML was designed to mark up the document structures of scientific
papers, just as DocBook is designed to mark up the document structures of
technical documents. HTML suffered the fate of all document structure
languages, however. People started using its document structure elements,
like <blockquote>, to achieve the formatting effects they wanted from the
most popular display applications (browsers). The exact same thing happens
with DocBook and with all document structure languages.

> - HTML is not extensible in a well-defined manner. XML and DocBook are.

XML is to DocBook and HTML as the Latin alphabet is to English and French.
Both English and French are extensible by adding new words. The Latin
alphabet is extensible by adding new letters, but this happens much more
slowly, and it no longer happens at all because all the significant
applications of the alphabet have stabilized on the set of letters already

HTML and DocBook are both extensible in that you can add new tags. However,
existing applications will ignore those tags (or choke on them) and so you
have to write your own tools to handle them.
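
A minimal sketch of what that looks like in practice, assuming a hypothetical tool that only knows two elements. The `<widgetid>` element, the `KNOWN` set, and the `render` function are invented for illustration and are not part of DocBook or of any real toolchain.

```python
import xml.etree.ElementTree as ET

# Elements this hypothetical "existing tool" was written to understand.
KNOWN = {"para", "emphasis"}

# <widgetid> is a made-up extension element, not part of DocBook.
DOC = "<para>Order <widgetid>X-100</widgetid> <emphasis>today</emphasis>.</para>"


def render(elem, extra=None):
    """Flatten an element to text, dropping elements the tool does not know.

    `extra` maps additional element names to handler functions; it stands in
    for the tooling you have to write yourself once you add new tags.
    """
    extra = extra or {}
    if elem.tag in extra:
        body = extra[elem.tag](elem)
    elif elem.tag in KNOWN:
        body = (elem.text or "") + "".join(render(child, extra) for child in elem)
    else:
        body = ""  # unknown element: silently ignored, content and all
    return body + (elem.tail or "")


root = ET.fromstring(DOC)

# The stock tool loses the new element entirely.
print(render(root))

# Only a handler you supply yourself makes the new tag mean anything.
print(render(root, {"widgetid": lambda e: "[part " + (e.text or "") + "]"}))
```

Adding the tag to the document was trivial in both cases; the cost is in the second call, where someone had to write the handler.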

> Adopting DocBook means adopting XML, specifically, a particular
> convention based on XML.

But, as you say, DocBook is a particular convention. When you adopt it you
are adopting that convention. You can choose to adopt that convention using
the SGML syntax or using the XML syntax. You could choose to adopt that
convention using ASN.1 syntax, though you would then have to roll your own
tools or convert the ASN.1 representation into an XML or SGML representation
to use the existing DocBook tools.

If you decide to create your own tagging language to support your own
semantics and your own applications, then you are adopting XML. Otherwise
you are just adopting a particular set of semantics that happens to use XML
to express its syntax.

> DocBook offers semantic structure for
> documents because of its XML roots.

No, DocBook describes document structure semantics because it has a set of
elements that describe document structures. You do not need XML to do this.
Any convention that is capable of expressing hierarchical constructions will
do just as well. Similarly, XML syntax can be used to express any kind of
semantics you like: formatting semantics, document structure semantics, or
subject matter semantics.
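
To sketch the separation between syntax and semantics, the Python example below writes the same small document structure twice: once in XML syntax and once as nested tuples. The tuple notation, the sample content, and the function names are assumptions made up for this illustration, and the converter deliberately ignores mixed content.

```python
import xml.etree.ElementTree as ET

# The structure expressed in XML syntax.
XML_FORM = "<section><title>Setup</title><para>Plug it in.</para></section>"

# The same structure as nested (name, children-or-text) tuples.
# This notation is invented for the example; any hierarchical syntax would do.
TUPLE_FORM = ("section", [("title", "Setup"), ("para", "Plug it in.")])


def from_xml(elem):
    """Convert an element to the tuple notation (mixed content is ignored)."""
    children = list(elem)
    if children:
        return (elem.tag, [from_xml(child) for child in children])
    return (elem.tag, elem.text or "")


def outline(node, depth=0):
    """Produce indented outline lines from the tuple notation."""
    name, content = node
    indent = "  " * depth
    if isinstance(content, str):
        return [indent + name + ": " + content]
    lines = [indent + name]
    for child in content:
        lines.extend(outline(child, depth + 1))
    return lines


# Both syntaxes carry the same hierarchy, so one tool serves both.
tree = from_xml(ET.fromstring(XML_FORM))
print(tree == TUPLE_FORM)
print("\n".join(outline(tree)))
```

The meaning of `section`, `title`, and `para` lives in the code that processes them, not in the angle brackets.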

It is really important not to get these two things confused. Far too many
people read that you can do single sourcing with XML, then they read that
such and such a format uses XML, and conclude that if they adopt that format
they will automatically get single sourcing. It just does not work that way,
but people spend millions of dollars chasing that pipe dream.

Mark Baker
Stilo Corporation
1900 City Park Drive, Suite 504 , Ottawa, Ontario, Canada, K1J 1A3
Phone: 613-745-4242, Fax: 613-745-5560
Email mbaker -at- ca -dot- stilo -dot- com

