RE: converting word to html

Subject: RE: converting word to html
From: Jean Weber <jean -at- wrevenge -dot- com -dot- au>
To: "TECHWR-L" <techwr-l -at- lists -dot- raycomm -dot- com>
Date: Mon, 10 Dec 2001 07:12:19 +1000

Darren Barefoot wrote about

a plug-in from the MS site which allows them to generate vanilla (I think MS
calls it "plain") HTML.

The plug-in he mentions creates much cleaner HTML than the "round-trip" save-as-html supplied with Word 2000, but the results are still not very clean (typically containing, for example, many extraneous font tags). For many purposes, this result suffices; but if you want really clean HTML, here's the secret (extracted from my technical editors' newsletter of 21 January 2001, http://www.jeanweber.com/news/tenews45.htm ):

The secret to keeping junk out of Word-created HTML is to not
convert from a DOC file to an HTM file. Instead, create a file
based on the HTML.DOT template, which is supplied with Office 97
(at least it was in my copy). The easiest way to do this is to
choose File > New, which opens the New dialog box that lists all
the templates. On the Web Pages tab, choose Blank Web Page.

Make sure you use ONLY the H1, H2, H3 etc styles for headings
(NOT Heading 1, Heading 2, etc) and Normal for everything else.
For a list, first type the list items, then select all of them
and click either the Number or Bullet button on the toolbar.
For bold, italic, or underlined words or phrases, select the
word or phrase and click the relevant toolbar button. Do NO
manual font/character formatting except bold, italic, or
underline. Do NO manual layout formatting at all. I tried
some simple tables and they worked fine; I haven't tested this
method with complex tables.

When you save the document, it is saved as an HTM file, not a
DOC file. Open the HTM file in Notepad and you'll see that
it's clean (except for some META junk at the top of the file,
easily stripped out).
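
If you have many files, stripping the META lines by hand gets tedious. Here is a minimal Python sketch of one way to do it with a regular expression. The function name and the sample page (including the Generator line) are my own illustration, not actual Word output, and a regex is only a rough tool for HTML, so check the result before publishing.

```python
import re

# Matches a whole <meta ...> tag, in upper or lower case, plus any
# whitespace that follows it. [^>]* also matches newlines, so a tag
# whose attributes wrap onto several lines is still caught.
META_TAG = re.compile(r"<meta\b[^>]*>\s*", re.IGNORECASE)


def strip_meta_tags(html: str) -> str:
    """Return the HTML text with every <meta ...> tag removed."""
    return META_TAG.sub("", html)


# Hypothetical sample page, made up for this example.
sample = (
    "<HTML>\n<HEAD>\n"
    '<META NAME="Generator" CONTENT="Microsoft Word 97">\n'
    "<TITLE>Test</TITLE>\n</HEAD>\n"
    "<BODY>\n<H1>Heading</H1>\n<P>Some <B>bold</B> text.</P>\n</BODY>\n</HTML>\n"
)

cleaned = strip_meta_tags(sample)
print(cleaned)
```

Note that this removes every META tag, including any that declares the character set; if your pages rely on one, add it back after cleaning.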

-----
As for stability, Robyn - I'm not quite sure what you mean. Whether links work or not depends on several factors, none of which (AFAIK) has to do with how the file is saved to HTML. The page itself may display differently in different browsers (sometimes trivially, sometimes dramatically), and some Microsoft-specific features may not work in all browsers, but again AFAIK that has nothing to do with how the file is saved. If I'm wrong about that, I'm ready to be enlightened differently by other members of this list.

Regards, Jean
Jean Hollis Weber
mailto:jean -at- jeanweber -dot- com
The Technical Editors' Eyrie http://www.jeanweber.com/

> -----Original Message-----
> From: Robyn Schell [mailto:robyn_schell -at- dccnet -dot- com]
>
> Hello everyone, are files converted from Word to HTML using
> Save As>Web Page (2000 version) stable on the web? Are links
> made from this new web page using FrontPage stable on the
> web? Or is there another better and relatively quick way to
> do this? I'm coming up with a system for non-technical types.



