Re: XML & Technical Writers...

Subject: Re: XML & Technical Writers...
From: Mark Baker <mbaker -at- OMNIMARK -dot- COM>
Date: Tue, 14 Apr 1998 19:01:17 -0400

Deborah Ray wrote
>Following is a (VERY brief and cryptic!)
>rundown of what XML offers tech writers.

While acknowledging the disclaimer, I think this message requires some
substantial clarification.

>In addition to the data definition and exchange stuff that
>you mentioned, XML
>* Combines SGML's power and scope with HTML's ease of use,
> giving you the best of both technologies. You can think of
> XML as being "SGML-lite" or "HTML on steroids."

This is a frequently quoted description, but it is very misleading. SGML and
XML are meta-languages. You use them to describe tagging languages. HTML is
a tagging language described by SGML.

XML is SGML lite in the sense that it allows fewer variations of syntax in
the markup languages it describes (it cannot, for instance, describe HTML)
but it is just as general as SGML. Any data representation problem you can
solve in SGML you can solve in XML.

Saying that XML is HTML on steroids, however, misses the point. HTML is a
tagging language. XML is used to create tagging languages. It is rather like
saying that the Latin alphabet is English on steroids. You can use the Latin
alphabet to create your own language (as the developers of Esperanto did)
but the alphabet is not a better language than English.

As to ease of use, tagging languages are easier to use than meta-languages
because tagging information is easier than inventing tagging languages. XML
is easier to learn than SGML because there is less of it. On the other hand,
you can use the subset of SGML that is equivalent to XML just as easily as you
can use XML. XML was designed to be easier for developers of XML processing
software. It takes months, if not years, to develop a half decent SGML
parser. It takes about a week to create an XML parser. XML has no particular
ease of use advantage for writers compared to SGML. Indeed, since SGML can
create more sophisticated syntax in a tagging language, an XML based
language may be harder to use than an SGML based one. A minor case in point
is HTML, which allows you to omit paragraph end tags. An XML based language
would force you to put them in.
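
The end tag point can be seen with any XML parser. Here is a minimal sketch,
assuming Python's standard xml.etree.ElementTree module stands in for "an XML
parser"; the <doc> and <p> element names and the helper is_well_formed are made
up for illustration.

```python
import xml.etree.ElementTree as ET

# HTML-style markup: the <p> elements are never closed (illustrative sample).
html_style = "<doc><p>First paragraph<p>Second paragraph</doc>"

# XML-style markup: every <p> has a matching </p>.
xml_style = "<doc><p>First paragraph</p><p>Second paragraph</p></doc>"


def is_well_formed(text):
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(html_style))  # the omitted end tags are rejected
print(is_well_formed(xml_style))   # the fully tagged version is accepted
```

The writer of the first sample has done nothing an HTML author would consider
wrong, yet an XML parser refuses the whole document.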

>* Makes reusing information relatively easy and inexpensive--that is,
> you can develop information to include in a product's documentation
> and easily reuse it for training materials, product descriptions,
> or whatever. You don't have to deal with recreating information,
> changing formats, or addressing platform compatibility issues.

This is rather optimistic. XML can be part of a content management system that
provides these benefits, but there are many other parts. Simply tagging
documents in a tagging language based on XML will not give you easy reuse,
any more than doing the same with an SGML based tagging language does. And
you do have to deal with changing formats, because your XML based language
will not be directly displayable in any medium. You will need to create an
application to process your tagging language into the format appropriate for
each medium you address. You will probably also need to create applications
to translate between different XML based tagging languages used by different
parts of your company. And it is you that has to create these applications,
because it is you who created the tagging languages and so only you know
what the tags mean.
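
To make the "one application per medium" point concrete, here is a rough
sketch in Python using only the standard library. The <procedure>, <title> and
<step> tags are an invented tagging language, and to_html and to_plain_text are
hypothetical converters; a real system would need far more than this.

```python
import xml.etree.ElementTree as ET
from html import escape

# A document in a made-up, company-specific tagging language.
SOURCE = """<procedure>
  <title>Replace the filter</title>
  <step>Switch off the pump.</step>
  <step>Unscrew the housing &amp; remove the old filter.</step>
</procedure>"""


def to_html(root):
    """Convert a <procedure> element to an HTML fragment for a browser."""
    lines = ["<h2>%s</h2>" % escape(root.findtext("title", default="")), "<ol>"]
    for step in root.findall("step"):
        lines.append("  <li>%s</li>" % escape(step.text or ""))
    lines.append("</ol>")
    return "\n".join(lines)


def to_plain_text(root):
    """Convert the same <procedure> element to plain text, e.g. for email."""
    lines = [root.findtext("title", default="").upper()]
    for number, step in enumerate(root.findall("step"), start=1):
        lines.append("%d. %s" % (number, step.text or ""))
    return "\n".join(lines)


root = ET.fromstring(SOURCE)
print(to_html(root))
print(to_plain_text(root))
```

Neither converter comes with XML. Both exist only because someone who knew
what <procedure> and <step> were supposed to mean sat down and wrote them.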

>* Lets you create documents that meet very specific needs--
> you create your own document structure and rules as well
> as your own tags and attributes.

This is right on, but consider the implications. The more specific to your
particular business needs your XML based language is, the more you will need
to provide your own processing application to process that language. The key
to successful and useful applications of XML or SGML lies in the quality and
sophistication of the processing applications that act on the markup.

>* Can help you develop consistent document sets. XML not only
> lets you organize documents and information very precisely, but
> it also can force developers to comply with the structure
> and organization outlined in the DTD you create.

True, with reservations. For one thing, a DTD (the XML statement which
defines your tagging language) can't express all the business rules you are
likely to need in order to achieve meaningful consistency. This again is the
province of the processing application that processes the markup created by
developers. Secondly, DTDs get big in a hurry and you will quickly find
that the number of possible combinations of things that are legal in your
DTD will make your consistency more technical than real. This is why we
promote the use of Microdocument Architectures(tm) to keep DTDs simple.
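
As one illustration of a business rule that falls outside a DTD, consider the
format of a part number or the requirement that a quantity be a positive whole
number. A DTD can require that the attribute and element exist, but it has no
way to constrain what the text inside them looks like. The sketch below is
Python with an invented <partslist> language; the part number pattern and the
check_parts function are assumptions made up for the example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical house rule: part numbers look like "AB-1234".
PART_NUMBER = re.compile(r"[A-Z]{2}-\d{4}")
# Hypothetical house rule: quantities are positive whole numbers.
QUANTITY = re.compile(r"[1-9][0-9]*")


def check_parts(text):
    """Return a list of business-rule violations found in a <partslist>."""
    problems = []
    root = ET.fromstring(text)
    for part in root.findall("part"):
        number = part.get("number", "")
        quantity = (part.findtext("quantity") or "").strip()
        if not PART_NUMBER.fullmatch(number):
            problems.append("bad part number: %r" % number)
        if not QUANTITY.fullmatch(quantity):
            problems.append("bad quantity for %s: %r" % (number, quantity))
    return problems


SAMPLE = """<partslist>
  <part number="AB-1234"><quantity>2</quantity></part>
  <part number="1234"><quantity>none</quantity></part>
</partslist>"""

for problem in check_parts(SAMPLE):
    print(problem)
```

Both <part> elements in the sample have the same structure, so a DTD that
accepts the first would accept the second. Only the processing application
catches the second one.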

>* Is great for creating large document sets, especially ones that
> are developed by teams of people or developed over months or years.
> As mentioned above, XML can force compliance with the DTD (so even
> long, drawn-out projects can include consistent documents), and
> it's text-based, meaning that the information will be readily
> transferrable in the long term.

It has a role to play in this kind of project, but unless you commit to a
full and well designed content management system XML alone will not get you
very far.

>* Lets you identify contexts for words on the page--that is,
> this is a figure caption, this is a figure reference, etc.
> Because you create your own tags and attributes (and are not forced
> to use ones that the W3C specifies), you can specify exactly how
> words are used.

This is misleading. Neither XML, nor the tagging languages you create with
it, let you specify how a tagged element is used. All an XML or SGML based
tagging language gives you is a way to recognize tags, a list of valid tag
names, and rules about what order the tags can occur in. They say nothing
about what the tags mean. The meaning of tags can only be defined by a
processing application that acts on the tags in the way you want it to. You
can define a tag called <figcap> if you like, but as far as a browser is
concerned it might just as well be called <zxcvbn>. The fact that one name
is readable to people and the other is not has nothing to do with whether or
not the browser knows what to do with the tag when it sees it. If you create
a tag, you have to create an application that recognizes the tag and knows
what to do with it. It is the application that gives the tag meaning.
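
The <figcap> versus <zxcvbn> point can be sketched in a few lines of Python.
The render function and its handler table are hypothetical stand-ins for "a
processing application"; nothing here is how any particular browser works.

```python
import xml.etree.ElementTree as ET


def render(text, handlers):
    """Render each child of the root using whatever handler is registered."""
    root = ET.fromstring(text)
    output = []
    for element in root:
        handler = handlers.get(element.tag)
        if handler is None:
            # The application has no idea what this tag is for.
            output.append("[no rule for <%s>]" % element.tag)
        else:
            output.append(handler(element.text or ""))
    return output


def caption(text):
    """What this particular application decides a figure caption means."""
    return "Figure: " + text


doc_a = "<doc><figcap>Pump assembly</figcap></doc>"
doc_b = "<doc><zxcvbn>Pump assembly</zxcvbn></doc>"

print(render(doc_a, {}))                    # readable name, but no rule
print(render(doc_a, {"figcap": caption}))   # the rule supplies the meaning
print(render(doc_b, {"zxcvbn": caption}))   # same result with a nonsense name
```

With no handler registered, the readable name <figcap> gets the application
nowhere. With a handler registered, <zxcvbn> works exactly as well.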

>* Lets you address a potentially large audience using a
> variety of platforms. As XML tools and technologies develop,
> XML should help eliminate cross-platform, cross-software,
> and browser-specific issues.

This too may be misleading. In itself, XML does not eliminate the
proliferation of formats, it contributes to it. (Everybody and his dog can
now create their own tagging language.) If browsers are to display XML
tagged pages, they will need a processing application to interpret the tags
into a form they know how to display (that is, HTML and CSS). You could
write such an application as a Java applet or an XSL script, but you have to
write it. The browser then becomes simply a host for the processing
applications that the creators of XML pages write. This will eliminate
browser specific issues in so far as the browser manufacturers agree to
create identical Java and XSL interpreters (anyone taking bets?). And, of
course, their implementations of HTML and CSS will have to be identical as
well for the processing applications that target them to perform
identically. My bet is that your Java or XSL apps will have to do browser
detection and adjust their output accordingly.

I think there will be very few cases where it will make sense to ship an XML
file and a processing application to the browser rather than converting the
file on the server and sending HTML/CSS. The tools available for server side
XML conversion are far more powerful and easier to use for this purpose than
Java or XSL (okay, guess what my company makes).

>At this point, all signs indicate that XML offers the potential
>of being an ideal tool for tech writers to learn and use. In fact,
>tech writers are ideal candidates for using this technology because
>we already have the information development, design, and presentation
>skills necessary to develop these structured document formats.

This needs to be taken with a big grain of salt. Tagging languages are
useless without processing applications to define what the tags mean and to
actually do something useful with the data. Content management systems which
use XML can be very useful, but you have to look at the system as a whole.
XML alone will not do much for you.

Mark Baker
Manager, Corporate Communications
OmniMark Technologies Corporation
1400 Blair Place
Gloucester, Ontario
Canada, K1J 9B8
Phone: 613-745-4242
Fax: 613-745-5560
Email: mbaker -at- omnimark -dot- com
