Clean HTML from Word?

Subject: Clean HTML from Word?
From: "Sandy Harris" <sharris -at- dkl -dot- com>
To: techwr-l -at- lists -dot- raycomm -dot- com
Date: Thu, 11 May 2000 12:06:37 -0400

I've got some HTML files that work fine as HTML, but I've been told to put them
into Word for compatibility with the client's other documentation and so that
they'll print more attractively. Fine so far.

The docs will ship in two formats: PDF for printing and HTML for online use.
The company wants them maintained in Word. I can live with that, and Word-to-PDF
is not a problem.

My problem is that Word 2000's "save as HTML" gives me things like:

<h1 style="margin-left:0in;text-indent:0in"><a name="_Toc481815121"></a>
<a name="intro"></a><a name="_Toc482432224">
<span style="mso-bookmark:intro">
<span style="mso-bookmark:_Toc481815121">Section1.
<span style='font:7.0pt "Times New Roman"'></span>Overview</span></span></a>
<span style="mso-bookmark:intro"></span></h1>

when the original HTML was:

<h1><a name="intro">Overview</a></h1>

I can tolerate Word inserting an extra <a name="_Toc..."> since it seems not
to be smart enough to use my "intro" label, but I desperately want to get rid
of the rest of this crud so HTML style sheets can control how <h1> displays.
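
A rough post-processing sketch of that cleanup, in Python, might look like the
following. It is an assumption about one workable approach, not a Word feature:
the function name strip_word_markup is made up for illustration, and the regular
expressions only target the patterns visible in the sample above (span wrappers
and inline style attributes). It leaves the extra _Toc anchors and the
"Section1." numbering text alone.

```python
import re

# Opening and closing <span> tags, with whatever attributes Word attached.
SPAN_TAG = re.compile(r"</?span\b[^>]*>", re.IGNORECASE)

# A style="..." or style='...' attribute inside any remaining tag.
STYLE_ATTR = re.compile(r"""\s+style\s*=\s*(?:"[^"]*"|'[^']*')""", re.IGNORECASE)


def strip_word_markup(html: str) -> str:
    """Drop span wrappers and inline styles so a style sheet can take over.

    Hypothetical helper: regex-based, so it assumes no '>' characters
    appear inside attribute values.
    """
    html = SPAN_TAG.sub("", html)    # unwrap mso-bookmark and font spans
    html = STYLE_ATTR.sub("", html)  # remove margin/indent overrides
    return html
```

Since this is pattern matching rather than real HTML parsing, it would need
checking against every kind of markup Word actually emits in the documents.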

Similarly, for a list item, Word gives:

<p class="MsoNormal" style="margin-left:.5in;text-indent:-.25in;mso-list:
l0 level1 lfo7">
<strong><span style="font-family:Symbol;font-weight:normal">·
<span style='font:7.0pt "Times New Roman"'>        </span></span>
</strong><strong>empowering users</strong>
to better control their own applications.

where the original was:

<li>
<strong>empowering users</strong> to better control their own applications.
</li>

Here it not only inserts extraneous crud, but also loses the list structure
information in the original tagging.
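
The list structure could perhaps be recovered afterwards with something like the
Python sketch below. It rests on assumptions drawn only from the sample above:
that list paragraphs carry "mso-list" in their style attribute, that each one is
closed with </p> (the excerpt does not show the closing tag), and that the bullet
glyph sits in a Symbol-font span containing exactly one nested spacer span. The
name paragraphs_to_list_items is hypothetical.

```python
import re

# A whole <p>...</p> whose opening tag mentions mso-list in its attributes.
LIST_PARA = re.compile(
    r"<p\b[^>]*\bmso-list\s*:[^>]*>(.*?)</p>",
    re.IGNORECASE | re.DOTALL,
)

# The Symbol-font span holding the bullet glyph plus one nested spacer span.
BULLET_SPAN = re.compile(
    r"<span\b[^>]*font-family\s*:\s*Symbol[^>]*>.*?</span>\s*</span>",
    re.IGNORECASE | re.DOTALL,
)

# The empty <strong></strong> pair left behind once the bullet is removed.
EMPTY_STRONG = re.compile(r"<strong>\s*</strong>", re.IGNORECASE)


def paragraphs_to_list_items(html: str) -> str:
    """Rewrite Word's mso-list paragraphs as <li> elements (hypothetical helper)."""

    def to_li(match):
        body = BULLET_SPAN.sub("", match.group(1))  # drop the fake bullet
        body = EMPTY_STRONG.sub("", body)           # drop its empty wrapper
        body = " ".join(body.split())               # collapse line breaks
        return "<li>" + body + "</li>"

    return LIST_PARA.sub(to_li, html)
```

This only retags the items; wrapping consecutive items in <ul> or <ol> would
still have to be done separately.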

Is there some piece of Word magic that will make it export clean HTML? I can
fix most of the mess with other tools, like HTML Tidy (w3.org), but I'd much
rather not have to.



