Subject: Re: Word Count Question
From: Tom Brophy <tom -at- TCRAFT -dot- COM>
Date: Thu, 25 Feb 1999 23:16:28 -0000
> The most probable reason for the difference is that I am just
> running Microsoft Word's Tools>Word Count function, and I am
> not counting words in footnotes (this is for help projects).
Your translator does have reasonable grounds for complaint. There can be quite an amount of text in $ and K footnotes (especially if you've been diligent in your indexing).
> Do any of you have any suggestions for counting
> words in footnote codes--and even as specific as counting
> only words with specific footnote codes? I certainly don't
> want to count words by hand, but I do need to feel comfortable
> about the differences.
From the HelpQA 2.0 manual ...
The Word counts section contains word counts for the CNT file and the source RTF files, both including and excluding footnotes, in two "varieties": the Microsoft Word-compatible count and the HelpQA translatable word count.
Microsoft Word uses a generic word counting algorithm, identical to that used in UNIX's "wc", which is quite suitable for most purposes. However, when word-counting text for translation, a more accurate means is required. Specifically, there are two problems with this algorithm:
· It counts the number of word breaks in a stream of text, not the number of words. For example, a typical keyword footnote might contain the text "file: open;OpenFile;open: file". When word counting this, Word finds two spaces and so returns a word count of three. HelpQA will scan the text stream until it encounters a "letter" (a normal unaccented alphabetic or numeric character, an underscore, an ampersand or a character whose ANSI code is greater than 192), scans forward until it reaches a non-letter and then repeats, counting one word for each block of "letters" read. In the above string, HelpQA will return the correct word count of five.
· The "wc" algorithm also counts the words in the non-localizable footnotes, as well as the footnote marker characters in the body text and footnote window. HelpQA ignores both.
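
To make the first point concrete, here is a rough Python sketch of the two counting approaches as the manual excerpt describes them. It is not HelpQA's or Word's actual code. The function names are made up for illustration, and using Python's ord() in place of the ANSI code is an assumption that only holds for characters in the Latin-1 range.

```python
def wc_style_count(text: str) -> int:
    """Count whitespace-separated tokens, in the manner of UNIX wc."""
    return len(text.split())


def is_letter(ch: str) -> bool:
    """A 'letter' as described in the manual excerpt: an unaccented
    alphabetic or numeric character, an underscore, an ampersand, or a
    character whose code is greater than 192.
    Assumption: ord(ch) stands in for the ANSI code."""
    return (ch.isascii() and ch.isalnum()) or ch in "_&" or ord(ch) > 192


def letter_run_count(text: str) -> int:
    """Count one word for each unbroken block of 'letters'."""
    count = 0
    in_word = False
    for ch in text:
        if is_letter(ch):
            if not in_word:
                # First letter of a new block starts a new word.
                count += 1
                in_word = True
        else:
            in_word = False
    return count


keyword_footnote = "file: open;OpenFile;open: file"
print(wc_style_count(keyword_footnote))    # 3
print(letter_run_count(keyword_footnote))  # 5
```

On the keyword footnote from the manual's example, the whitespace-based count gives three and the letter-block count gives five, which matches the figures quoted above. Skipping non-localizable footnotes and footnote markers (the second point) would need RTF parsing and is not attempted here.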