RE: Help with Docbook

Subject: RE: Help with Docbook
From: "Mark Baker" <mbaker -at- ca -dot- stilo -dot- com>
To: "TECHWR-L" <techwr-l -at- lists -dot- raycomm -dot- com>
Date: Thu, 24 Jul 2003 11:08:45 -0400

Christopher Gooch wrote:

> However I had assumed
> that it would make sense for me to adopt Simplified DocBook
> and then add my further tags, as I see no need to re-invent wheels.

An object marked up in XML essentially consists of three layers.

1. The macro-semantic layer that identifies the object, its subject matter,
and its connections to other topics and/or subjects.

2. The text structure layer, which consists of those elements needed to
describe the necessary structures of text like paragraphs, lists, and

3. The micro-semantic layer that identifies the relationships of items
mentioned in the text to other topics or subjects.
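
To make the three layers concrete, here is a rough sketch in Python. The
sample document and every element name in it (procedure, subject, related,
body, command, product) are invented for illustration and do not come from
DocBook or any real DTD; the assignment of names to layers is likewise an
assumption. The script simply sorts the elements it finds by layer.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: all element names are invented for illustration.
DOC = """
<procedure id="install-widget">
  <subject>widget installation</subject>
  <related topic="uninstall-widget"/>
  <body>
    <p>Run <command>setup</command> and then open the
       <product>Widget Console</product>.</p>
    <ul>
      <li>Check the log.</li>
    </ul>
  </body>
</procedure>
"""

# Assumed assignment of element names to the three layers.
LAYERS = {
    "macro-semantic": {"procedure", "subject", "related"},
    "text structure": {"body", "p", "ul", "li"},
    "micro-semantic": {"command", "product"},
}


def elements_by_layer(xml_text):
    """Group the element names found in xml_text by layer, in document order."""
    found = {layer: [] for layer in LAYERS}
    for elem in ET.fromstring(xml_text).iter():
        for layer, names in LAYERS.items():
            if elem.tag in names:
                found[layer].append(elem.tag)
    return found


if __name__ == "__main__":
    for layer, tags in elements_by_layer(DOC).items():
        print(layer, "->", tags)
```

Note that the text structure names (p, ul, li) are the only ones here that
could plausibly be borrowed from an existing format; the other two layers
are specific to the subject matter.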

Basically, you want your own macro-semantics and your own micro-semantics,
and you want a text structure layer that is as simple as possible
and as easy to use as possible. You can borrow the text structure layer from
DOCBOOK if you like, but I like to borrow it from HTML (with additional
restrictions not found in HTML). There is an immediate usability advantage
in borrowing from HTML, both because of its simplicity and because of its

However, the usefulness of your formats is going to depend on the
appropriateness of your macro- and micro-semantic layers. And since you want
all of your DTDs to be small, specific, and strict, you will probably need to
define several different small DTDs for different purposes.

> PS. The advantage of having explicit logical / structural
> markup (provided by DocBook) and the ability to
> check for well-formedness (provided by XML parsers)
> should not be understood --- whilst people can
> still make mistakes with mark up, or abuse tags,
> forget to apply tags, at least a certain set of mistakes
> can be weeded out at this stage rather than in the
> proofreading

Assuming that the word "understood" above should be "underestimated", notice
that with Word and Frame it is simply impossible to enter anything that is
not valid Word or Frame format. The power of a dedicated format is that you
can have a semantically aware editor that does not allow deviations from the
format to be entered. Thus catching such deviations at proofreading is a
non-issue -- there never are any.

If you use Word as an ersatz XML editor by using styles to represent
elements, you can have the problem of failure to follow the format. However,
the solution here is to convert to XML and validate against the DTD, just as
it is if you work in an XML editor or a text editor. An XML editor can
enforce the DTD as you type, but that can be more of a hindrance than a
help, because it forces you to do all composition in DTD order.

In both Word and DOCBOOK you can have in-house policies about what subset of
available features of the format should be used in what order, and in both
cases people will cheat or forget, and the errors will have to be caught in
proofreading. Being so big and so general, DOCBOOK has lots of opportunities
for creating things that are valid DOCBOOK but not what you want.

The only way that you can make the validation features of XML really work
for you is to create languages that are small, specific, and strict.
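
As a minimal sketch of what "small, specific, and strict" buys you, consider
the Python below. It is not a DTD validator (the Python standard library
does not include one); it hand-rolls a check against a hypothetical content
model in which each element lists the only children it may contain. The
element names and the ALLOWED_CHILDREN table are assumptions made up for
the example. The parser itself rejects anything that is not well-formed, and
the check then rejects anything well-formed that the format does not permit.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: each element lists the only children it may have.
ALLOWED_CHILDREN = {
    "procedure": {"title", "step"},
    "title": set(),
    "step": {"command"},
    "command": set(),
}


def violations(xml_text):
    """Return messages for every child element the model does not allow.

    ET.fromstring raises ET.ParseError if xml_text is not well-formed,
    so only well-formed input reaches the content-model check.
    """
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag not in ALLOWED_CHILDREN:
        problems.append(f"unknown root <{root.tag}>")
    for parent in root.iter():
        # Elements outside the model are allowed no children at all.
        allowed = ALLOWED_CHILDREN.get(parent.tag, set())
        for child in parent:
            if child.tag not in allowed:
                problems.append(f"<{child.tag}> not allowed in <{parent.tag}>")
    return problems


if __name__ == "__main__":
    good = "<procedure><title>T</title><step>Run <command>x</command></step></procedure>"
    bad = "<procedure><title>T</title><step><table/></step><note/></procedure>"
    print(violations(good))
    print(violations(bad))
```

With a model this small, nearly anything an author should not do is simply
not allowed; with a large general model, the same table and note would
likely pass, and the problem would be left for proofreading.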

All this said, please remember that my original point was not that DOCBOOK
is good or bad but that adopting DOCBOOK is not the same thing as adopting
XML. Whether DOCBOOK is good or bad, it is good or bad on its specific
merits as a format and an associated set of tools, and not on the merits of

What I find particularly puzzling about the tendency to equate DOCBOOK with
XML is that people have the exact opposite prejudice with regard to HTML.
People insist, to an exaggerated degree, that if you use HTML you are
not using XML. HTML and DOCBOOK are both specific applications of SGML. Both
of them have been translated from SGML syntax to XML syntax. Both have their
specific properties as formats. Both have specific tool sets designed to
work with them. Adopting either one of them is not the same thing as
adopting XML.

> someone once said on comp.text.tex when
> a newbie asked about translating a Word doc to
> LaTeX, "you can't make a cow out of a hamburger".

I've always heard it as "making pigs out of sausages". In the case of Word
and LaTeX, however, it might be more like making cows out of pigs -- or
hamburger out of sausages.

Mark Baker
Stilo Corporation
1900 City Park Drive, Suite 504 , Ottawa, Ontario, Canada, K1J 1A3
Phone: 613-745-4242, Fax: 613-745-5560
Email mbaker -at- ca -dot- stilo -dot- com

