Re: The sky is starting to fall again

Subject: Re: The sky is starting to fall again
From: keith -at- soltys -dot- ca
To: "TECHWR-L" <techwr-l -at- lists -dot- techwr-l -dot- com>
Date: Wed, 27 Apr 2005 13:24:31 -0600

As far as I know, WordML is a functionally complete description of the
Word document format in XML. It has been stated on the word-pc mailing list
(by someone from Microsoft, if I remember correctly) that saving a Word
file as XML and then converting it back to .doc format is a good way of
removing corruption in a Word document (bad binary indexes and the like)
without losing anything in the file. I have done this with fairly
complex documents and have not found anything to be lost.

MS have published the spec for WordML on the web, and it's pretty well
documented in books like Office 2003 XML. I suppose you could write a DTD
or schema if you wanted to, and you could probably write documents in
native WordML in a text editor too, but it is a very verbose format, even
for XML.
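
To give a feel for what hand-written WordML involves, here is a minimal
sketch in Python that emits a bare-bones document. It assumes the Word 2003
namespace and the mso-application processing instruction are all Word
needs to open the file; build_wordml is a name made up for this example,
and real files saved by Word carry far more markup (styles, fonts,
document properties).

```python
import xml.etree.ElementTree as ET

# Namespace used by Word 2003 XML (WordprocessingML) documents.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"


def build_wordml(paragraphs):
    """Return a minimal WordML document string, one run per paragraph.

    Hypothetical helper for illustration only; it writes plain text
    paragraphs with no styles or formatting.
    """
    ET.register_namespace("w", W_NS)
    root = ET.Element(f"{{{W_NS}}}wordDocument")
    body = ET.SubElement(root, f"{{{W_NS}}}body")
    for text in paragraphs:
        # Even a plain paragraph needs three nested elements: p > r > t.
        p = ET.SubElement(body, f"{{{W_NS}}}p")
        r = ET.SubElement(p, f"{{{W_NS}}}r")
        t = ET.SubElement(r, f"{{{W_NS}}}t")
        t.text = text
    xml_body = ET.tostring(root, encoding="unicode")
    # The processing instruction tells Windows to open the file in Word.
    return (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
        '<?mso-application progid="Word.Document"?>\n' + xml_body
    )


if __name__ == "__main__":
    print(build_wordml(["Hello from a text editor.", "Second paragraph."]))
```

Three levels of elements for every plain paragraph, before any formatting
is applied, is what I mean by verbose.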

Unfortunately, WordML has limitations regarding nesting of elements and
the like which make it difficult, if not impossible, to convert a Word
document to another XML-based schema, such as DocBook. I had a post on my
weblog some time ago that links to a Usenet article which goes into
detail about this.

Essentially what it boils down to is that WordML was created to preserve
the format of the document, not the structure.
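
As a rough illustration of the problem, the sketch below (Python) reads
the flat run of w:p elements and tries to rebuild DocBook-style nested
sections from the paragraph style names alone. It assumes headings are
tagged with style IDs of the form "Heading1", "Heading2" and so on, which
only holds if the author used those styles consistently; the function
names are made up for this example, and lists, tables and manual
formatting are ignored entirely.

```python
import re
import xml.etree.ElementTree as ET

# Word 2003 XML namespace in ElementTree's {uri} notation.
W = "{http://schemas.microsoft.com/office/word/2003/wordml}"


def flat_paragraphs(wordml):
    """Yield (style, text) per w:p; style is None if no w:pStyle is set."""
    root = ET.fromstring(wordml)
    for p in root.iter(W + "p"):
        style_el = p.find(f"{W}pPr/{W}pStyle")
        style = style_el.get(W + "val") if style_el is not None else None
        text = "".join(t.text or "" for t in p.iter(W + "t"))
        yield style, text


def to_sections(paragraphs):
    """Guess nested <section> elements from a flat paragraph sequence.

    Assumption: a style ID like "Heading2" marks a level-2 heading.
    Everything else becomes a <para> in the innermost open section.
    """
    article = ET.Element("article")
    stack = [(0, article)]  # (heading level, container element)
    for style, text in paragraphs:
        match = re.fullmatch(r"Heading([1-9])", style or "")
        if match:
            level = int(match.group(1))
            # Close any open sections at the same or a deeper level.
            while stack[-1][0] >= level:
                stack.pop()
            section = ET.SubElement(stack[-1][1], "section")
            ET.SubElement(section, "title").text = text
            stack.append((level, section))
        else:
            ET.SubElement(stack[-1][1], "para").text = text
    return article
```

The hierarchy here is inferred from style names, not read from the
markup, so any document that fakes its headings with bold and a larger
font defeats it.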


> eric -dot- dunn -at- ca -dot- transport -dot- bombardier -dot- com wrote:

> The author never quite gets around to the thing everyone wants to
> know: does WordProcessingML constitute the opening of the MS Word
> file format? He *does* say that saving native MS Word to
> WordProcessingML creates crazy-quilt XML, but is hopeful this will
> improve. What's not said is whether users can -- perhaps with
> difficulty -- sort-out a de facto DTD for WordProcessingML and
> create fully-featured MS Word documents with a text editor or
> programmatically, as can be done with OpenOffice files. That would
> mean universal interchange is/will be possible, but it's not clear
> to me that that's even a Microsoft goal. I suspect it's not.
> The test is whether there'll be a public namespace and DTD(s).


