Clean up Word tags in HTML / Dreamweaver?

Subject: Clean up Word tags in HTML / Dreamweaver?
From: Geoff Hart <ghart -at- videotron -dot- ca>
To: "TECHWR-L" <techwr-l -at- lists -dot- raycomm -dot- com>
Date: Thu, 29 Jul 2004 08:46:47 -0400

Christina Lamkin reports: <<I am trying to reduce the tag/dross bloat that results from saving a doc as HTML, in Word.>>

You didn't mention which version of Word you're using, but it's worth having a look at the Microsoft site to see whether there's an updated HTML filter for your version. Microsoft introduced at least one improved version to fix the worst of the export problems, but even with the filter, Word still exports ugly HTML. Newer versions of Word may also have an Options/Preferences setting that lets you customize how the export works (e.g., not specifying font tags).

That being the case, on to step 2:

<<The Dreamweaver Clean Up Word HTML function is not successfully removing mso tags and others. (I am getting an error msg, in DW, when I start the conversion: "Word file version not determined..." -- but DW selects the correct Word version anyway.) Who knows how to either get DW back on track or (in any editor) cleanly remove Word-only tags? The reason is that the file is slow-to-open in a browser. --thanks.>>

There are two approaches. With small documents, I tend to copy the text out of Word and paste it into a new document in Dreamweaver, then manually reapply the formatting. Works fine, and not too time-consuming.

For longer documents, use Dreamweaver's own built-in editor: Find the first offensive tag (e.g., a FONT tag, often with a lengthy description of the typeface etc. following the word "FONT"), then under the Edit menu, select Replace. Copy the tag details (e.g., "FONT face=Arial size=200" etc.) into the Find What field, and in the Replace With field, delete anything already in that field (i.e., leave it entirely blank). Click the Replace All button.

Dreamweaver now helpfully highlights all cases where you've removed only one of a pair of tags, leaving the other tag unaccompanied. Delete those tags the same way you deleted the original tag. Continue scrolling through the document looking for tags you no longer need, and repeat this global replace until you're done. A bit time-consuming, but works quite well on the whole because there are usually only a dozen or two tags you need to eliminate, and thus only a dozen or two search and replace operations required.
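
If you'd rather script the job than repeat it by hand, the same global search and replace can be sketched in a few lines of Python. This is only a rough illustration, not something Dreamweaver does for you: the function names, the list of tags, and the sample line are all made up for the example, and a regular expression is not a real HTML parser, so check the result before trusting it on a long document. Because each pattern matches both the opening and the closing tag, it shouldn't leave unpaired tags behind.

```python
import re

# Hypothetical list of tags to strip; adjust it to whatever your Word export contains.
UNWANTED_TAGS = ["font", "o:p"]


def strip_tags(html, tag_names):
    """Delete the opening and closing tags for each name, keeping the text between them."""
    for name in tag_names:
        # Matches e.g. <FONT face=Arial size=2> and </FONT>, in any letter case.
        pattern = re.compile(r"</?" + re.escape(name) + r"(?:\s[^>]*)?>", re.IGNORECASE)
        html = pattern.sub("", html)
    return html


def strip_mso_classes(html):
    """Delete Word-specific class attributes such as class=MsoNormal."""
    return re.sub(r'\s+class="?Mso\w*"?', "", html, flags=re.IGNORECASE)


def clean_word_html(html):
    """Apply both clean-up passes in sequence."""
    return strip_mso_classes(strip_tags(html, UNWANTED_TAGS))


if __name__ == "__main__":
    # Made-up sample line in the style of Word's HTML export.
    sample = "<P class=MsoNormal><FONT face=Arial size=2>Hello <o:p></o:p>world</FONT></P>"
    print(clean_word_html(sample))
```

To extend the sketch, add more tag names to UNWANTED_TAGS as you spot them in the source, much as you would repeat the Replace All step for each new tag.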

--Geoff Hart ghart -at- videotron -dot- ca
(try geoffhart -at- mac -dot- com if you don't get a reply)

