Fw: XML & Technical Writers...

Subject: Fw: XML & Technical Writers...
From: Mark Baker <mbaker -at- OMNIMARK -dot- COM>
Date: Thu, 16 Apr 1998 09:42:16 -0400

I forward this for John for the reasons he explains...

>Could you please post this for me? I am a member but the server has
>problems with my mail.
>--John Mackin
>Hi there! I would like to add my $0.02 worth (I apologize to our
>international participants for the use of this local idiom and an even more
>idiomatic way of writing it) to the XML discussion.
>Mark Baker is making some very important points about understanding both
>scope and "type" of languages we are discussing.
>1. Type of languages
>Starting from the ground up, there are basically two kinds of languages:
>natural and "artificial"; but within the latter comes the key type of
>"formal languages."
>Natural languages find their existence only in spoken or written
>"instances." Linguists then abstract from these instances to define an
>"instance-creating language" which we then study in school under the label
>of English, French, Japanese, or what have you. But the definitions are
>always instance-driven. Plato would say that they were one level "removed"
>from the instance. So here we have real instances and
>theoretical/abstracted definitions.
>Formal languages first find their existence in their formal definition; they
>*may* then find further existence in instances. But they gain value only by
>how much they are "instanced" (thus the low value ascribed to Esperanto).
>Here too we have definitions and instances. But there is a difference.
>Formal languages are not limited to languages that humans speak. And the
>definitions are not restricted to first-level definitions of instances. It
>is possible to conceive of definitions of definitions (second level removed)
>or even definitions of definitions of definitions (third level removed) and
>so on.
>Take COBOL, FORTRAN, C++ for instance. These are first-level formal
>languages that are instanced in a "program." But other languages exist that
>are used to define the definitions in COBOL, FORTRAN, C++, and so on. Such
>languages that define languages are second-level formal languages and are
>often called meta-languages to distinguish them from first-level languages.
>Examples are Backus-Naur Form (BNF) and the "train-track/trolley-car"
>format often used in computer manuals.
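The distinction between levels can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `GRAMMAR` dictionary stands in for a BNF-style definition, and the toy language of sums over the digits 0 to 2, along with the names `matches` and `is_instance`, is invented for the example. The dictionary plays the role of the definition; a token list such as `["1", "+", "2"]` is an instance of the language it defines.

```python
# A BNF-like definition written as data (hypothetical toy language):
#   sum    ::= number "+" sum | number
#   number ::= "0" | "1" | "2"
GRAMMAR = {
    "sum": [["number", "+", "sum"], ["number"]],
    "number": [["0"], ["1"], ["2"]],
}


def matches(symbol, tokens, pos=0):
    """Return every position reachable after matching `symbol` at `pos`."""
    if symbol not in GRAMMAR:
        # Terminal: it must equal the token at this position.
        if pos < len(tokens) and tokens[pos] == symbol:
            return {pos + 1}
        return set()
    ends = set()
    for alternative in GRAMMAR[symbol]:
        positions = {pos}
        for part in alternative:
            positions = {e for p in positions for e in matches(part, tokens, p)}
        ends |= positions
    return ends


def is_instance(tokens):
    """True if the whole token list is an instance of the language."""
    return len(tokens) in matches("sum", tokens)


print(is_instance(["1", "+", "2"]))  # a valid instance
print(is_instance(["1", "+"]))       # not an instance
```

Changing the dictionary changes the language being defined without touching the checking code, which is the sense in which the definition sits one level above the instances.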
>In our case, SGML/XML are second-level meta-languages. They define another
>first-level "language" (often mistakenly called SGML/XML) that is used to
>create instances/documents. HTML is an example of such a first-level
>language. In our field of documentation we may find the term "tagging
>language" suitable to describe such a first-level language (as Mark
>suggests). SGML/XML are not tagging languages: they define those tagging
>languages (in a DTD). So if we write a DTD for meeting minutes, we cannot
>call the resulting language SGML/XML. We who created it must give it a
>name, perhaps "minutes tagging language" just as HTML could be called a
>"HTTP-convention based homepage tagging language."
>So please, let's be clear whether we are describing a first-level or
>second-level language.
>2. Scope of language
>In documentation we want to control five aspects of our documents: content,
>structure, formatting, interrelationships, and (sometimes) version/level
>control.
>The scope of SGML/XML is restricted to defining the structure of documents
>(and assigning some arbitrary "attributes"). The semantics of the tags and
>the attributes is outside the scope of the SGML/XML languages. This
>restriction adds to the flexibility and power of the two languages, but
>tends to confuse people because they naturally add meaning to
>character-strings that match "words" in their language. BUT SUCH SEMANTICS
>The designers of a DTD may arbitrarily assign "meaning" to the tags and
>attributes but those meanings will not be understood by a program that
>checks whether a given instance complies with the DTD (such an error checker
>is called a "parser"). Those meanings can only be realized by "application
>programs" that come after the parse phase. And only those who write the DTD
>or want to use that DTD can write the application, because of the
>arbitrarily assigned "meanings." What are examples of such "meanings"?
>Formatting information such as placement, spacing, and size of font;
>content-type information such as "warning" or "error messages."
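The split between the structure check and the application can be sketched in Python. This is a minimal illustration, not a real DTD validator: the "minutes" element names, the `CONTENT_MODEL` table, and the functions `structure_errors` and `render` are all hypothetical, and the check only tests which children may appear inside which parent (a real DTD content model also constrains order and repetition). Only the standard library's `xml.etree.ElementTree` is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model for a "minutes tagging language":
# each element name maps to the child element names it may contain.
CONTENT_MODEL = {
    "minutes": {"topic"},
    "topic": {"decision", "action"},
    "decision": set(),
    "action": set(),
}


def structure_errors(element):
    """Structure check only: are the tags nested as the model allows?

    Nothing here knows what a "decision" or an "action" means.
    """
    if element.tag not in CONTENT_MODEL:
        return [f"unknown element <{element.tag}>"]
    errors = []
    for child in element:
        if child.tag not in CONTENT_MODEL[element.tag]:
            errors.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        else:
            errors.extend(structure_errors(child))
    return errors


def render(root):
    """Application step: this is where "meaning" is assigned to the tags."""
    lines = []
    for topic in root.iter("topic"):
        for child in topic:
            prefix = "TODO: " if child.tag == "action" else "Decided: "
            lines.append(prefix + (child.text or "").strip())
    return lines


DOC = (
    "<minutes><topic>"
    "<decision>Adopt XML</decision>"
    "<action>Write the DTD</action>"
    "</topic></minutes>"
)
root = ET.fromstring(DOC)
print(structure_errors(root))  # structure check
print(render(root))            # application-assigned meaning
```

A different application could treat `<action>` as a formatting cue rather than a to-do item, and the structure check would accept the same instance either way.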
>HTML is a first-level tagging language with arbitrarily assigned meanings.
>All of the meanings are "format meanings," not "content meanings" except
>the two sections HEAD and BODY. But the meanings are not what the humans
>think they are. <P> does not mean "paragraph." It means "new line + skip
>one line + go to left margin" (for all the application programs/browsers
>that I have seen). <H3> does not mean "header, level-3." It means "change
>font to font specified for <H3>." Homepage designers know that the
>meaning is not the real meaning and use this knowledge to create all those
>"cute" homepages that would make a TW person sick if he or she were to read
>the real coding.
>To summarize, HTML is a tagging language that defines text and graphics and
>interrelationships, with little structure, no content. It has formatting
>conventions implemented (sometimes differently) in a number of popular
>application programs (browsers). There are certain documents that beg for
>such functionality, while others would be severely constrained by it. In
>other words, for certain circumstances, HTML and a browser are all you need
>to provide users with documents.
>SGML needs other standard languages to create a totally "international
>standard" environment. (You don't have to buy this concept of international
>standards. You can lock yourself into a proprietary standard. But that is
>another topic.) SGML is grouped with DSSSL for formatting and with HyTime
>for flexible linking of all kinds. No international standards presently
>exist for content control and version/level control.
>When SGML was invented, the Internet did not exist. Documents were not
>updated or viewed in realtime. So SGML was designed for a batch
>environment, not a realtime environment.
>XML is a variation of SGML (a few rules were eliminated and a few were
>changed) designed to work in a realtime environment. It is grouped with a
>different set of standards (W3C, not ISO). XLL is proposed for linking (it
>is a variation of HyTime) and XSL for formatting (and probably action
>scripting). No standards are planned for content control or version/level
>control.
>If you keep in mind the differences in type and scope of languages, the XML
>discussion will proceed faster and more accurately.
>--John Mackin
>Scope of languages
>At 11:32 98/04/15 -0400, you wrote:
>> Deborah Ray wrote:
>> >If you're wondering about whether browsers will end up
>> >offering competing XML extensions (or some other makes-
>> >it-hard-to-develop-usable-pages garbage)...probably
>> >not. Because XML is completely customizable, you (not
>> >the browser companies) are in charge of what features
>> >are available.
>> A word of clarification on extensibility. XML is not a tagging language, it
>> is a language for defining tagging languages. XML itself is not user
>> extensible. Only the XML working group can extend XML.
>> Tagging languages written in XML are extensible. All tagging languages are
>> extensible. It is a myth that HTML is not extensible. The browser wars were
>> all about extensions to HTML. HTML is now at or approaching version 4.0 --
>> impossible if it was not extensible.
>> The point is, tagging languages are extensible by the people who create
>> them. More specifically, they are extensible by the people who write the
>> applications that process them (browsers, for example). (HTML 3.0 was
>> proclaimed by W3C, but it was a dud because the browsers did not support it.
>> HTML 3.2 was a reduced language that defined the tagging that the browsers
>> actually supported. The processing application is king.) So yes you can
>> customize your XML, because you write your XML based tagging language, and
>> your own processing applications.
>> ---
>> Mark Baker
>> Manager, Corporate Communications
>> OmniMark Technologies Corporation
>> 1400 Blair Place
>> Gloucester, Ontario
>> Canada, K1J 9B8
>> Phone: 613-745-4242
>> Fax: 613-745-5560
>> Email mbaker -at- omnimark -dot- com
>> Web: http://www.omnimark.com
>* John Mackin, Fujitsu Learning Media, Limited *
>* <CALS, Technical Communication, Translation> *
>* jmackin -at- flm -dot- se -dot- fujitsu -dot- co -dot- jp *
>* TEL:+81-3-5762-8086 FAX:+81-3-5762-8074 *
