How do I index thee? Let me count the ways.

Subject: How do I index thee? Let me count the ways.
From: Sandy Harris <pashley -at- storm -dot- ca>
To: "TECHWR-L" <techwr-l -at- lists -dot- raycomm -dot- com>
Date: Fri, 02 Aug 2002 12:50:55 -0400


The recent discussion of single-sourcing and related
issues brought this to mind, but the questions I'll
ask are also valid outside that context.

I think it quite important to provide multiple paths
to information. Users may have different approaches,
styles, needs, ... Different problems may require
different solutions and at write-the-manual time, we
may not even know the problems.

So what are our options?

The obvious, standard ones for printed docs are
table of contents and index.

For online help, we also have context-sensitivity.
The application directs the user to some part of
the help text, depending on what screen he's on or
some such.

In any environment, documentation for anything
more complex than a flint axe will probably use
cross-references.

In my experience, we often get parts of these
basic things wrong.

Indexes must include entries for synonyms,
so a user who is used to different terminology
can find things.
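
A minimal sketch of the synonym idea, in Python. The terms,
page numbers and the build_index name are all invented for
illustration; a real indexing tool will have its own way of
expressing "see" entries.

```python
def build_index(entries, synonyms):
    """Return sorted index lines, adding a 'see' entry for each synonym.

    entries:  maps a preferred term to the pages where it is discussed.
    synonyms: maps an alternative term to the preferred term.
    """
    lines = []
    for term, pages in entries.items():
        page_list = ", ".join(str(p) for p in sorted(pages))
        lines.append(f"{term}, {page_list}")
    for alt, preferred in synonyms.items():
        # Only point at terms that really are in the index.
        if preferred in entries:
            lines.append(f"{alt}, see {preferred}")
    return sorted(lines, key=str.lower)


# Hypothetical data, not taken from any real manual.
entries = {"directory": [12, 40], "permissions": [57]}
synonyms = {"folder": "directory", "access rights": "permissions"}

for line in build_index(entries, synonyms):
    print(line)
```

A reader who looks under "folder" is sent to "directory"
instead of concluding the topic is not covered.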

'help' text needs to be well organised, and
delivered with at least a good table of
contents (rare), and preferably an index.

Any online documentation should be amenable
to full-text search, if only as a last resort
for desperate users. If your combination
of format and tools doesn't allow that, you
should change them.

However, the interesting question is: once
we've got the basics right, what else is
there to consider?

We may want to provide tools to make text
searches convenient for our users, and/or
to build some sort of index to make searches
faster and cheaper. If one delivery format
is HTML, those tools are free.
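
To make "some sort of index" concrete, here is a rough Python
sketch of an inverted index: a table from each word to the set
of documents containing it, built once so that later searches
need not rescan the text. The file names and contents are made
up, and this is not how any particular search tool works.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words and numbers."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical doc set.
docs = {
    "install.html": "Install the server and set file permissions.",
    "backup.html": "Back up every file before you upgrade the server.",
}
index = build_inverted_index(docs)
print(sorted(search(index, "file permissions")))
```

The cost of reading every document is paid once, when the
index is built; each query is then a few set lookups.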

Tagging chunks of documentation with keywords
can be a big win. Searching or indexing by
keyword is far more efficient than full text.
Depending on the problem at hand and on choice
of keywords, it may also be more effective.

In a large shop, you may need to develop a
list of standard keywords, and have editors
checking for consistent usage. Inconsistently
used keywords are not nearly as useful.
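
Checking for consistent usage need not be done entirely by
eye. A small sketch in Python, assuming the standard list is
a plain collection of strings and each chunk carries its own
list of keywords; the chunk IDs and keywords are hypothetical.

```python
def check_keywords(tagged_chunks, standard_keywords):
    """Return {chunk_id: sorted keywords that are not on the standard list}."""
    standard = {k.lower() for k in standard_keywords}
    problems = {}
    for chunk_id, keywords in tagged_chunks.items():
        unknown = sorted(k for k in keywords if k.lower() not in standard)
        if unknown:
            problems[chunk_id] = unknown
    return problems


# Hypothetical standard list and tagged chunks.
standard = ["permissions", "backup", "installation"]
chunks = {
    "ch2-sec1": ["installation", "permissions"],
    "ch5-sec3": ["back-up", "perms"],
}

for chunk_id, unknown in check_keywords(chunks, standard).items():
    print(f"{chunk_id}: not on the standard list: {', '.join(unknown)}")
```

An editor still has to decide whether "back-up" is a typo for
"backup" or a keyword that belongs on the list.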

If you're putting your docs on the web, using
keywords helps the search engines find them.
Even if you're not making them public, it helps
almost any HTML indexer you or your customer
might choose to use with them. If your docs
are stored in a database, keywords are an
obvious way to index them.
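
In HTML, keywords usually live in a meta tag in the page head.
As an illustration of how an indexer might pick them up, here
is a sketch using Python's standard html.parser module. The
sample page is invented, and real indexers and search engines
differ in how much weight, if any, they give this tag.

```python
from html.parser import HTMLParser


class KeywordExtractor(HTMLParser):
    """Collect the comma-separated terms from <meta name="keywords">."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "keywords":
            return
        content = attributes.get("content") or ""
        for term in content.split(","):
            term = term.strip().lower()
            if term:
                self.keywords.append(term)


def extract_keywords(html):
    """Return the list of keywords declared in an HTML page."""
    parser = KeywordExtractor()
    parser.feed(html)
    parser.close()
    return parser.keywords


# Invented sample page.
page = (
    "<html><head><title>Changing permissions</title>"
    '<meta name="keywords" content="permissions, chmod, Access Control">'
    "</head><body></body></html>"
)
print(extract_keywords(page))
```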

For some types of documentation, something
like an abstract may be useful. Perhaps one
per chapter or procedure? Full text search
on a set of abstracts is far more efficient
than on the whole text body. This is true for
eyeball search as well as computer search.

For the user checking results of a computer
search, it is far easier to scan an abstract
to see if a doc section solves his problem
than to be dropped into the middle of a
document (several times if the search gave
multiple hits on one doc) and have to
navigate from there.

For finer-grained docs -- for example long
lists of API functions or jet engine parts
-- you may want a one-line summary for each
section, not a paragraph-long abstract.

For example, Unix man pages have a one line
summary for every command or function or
file format documented. Examples:

grep, fgrep, egrep - print lines matching a pattern
chmod - change file access permissions

The first page documents three related commands.

Users can search these summaries with the
command 'man -k' or 'apropos', depending
on their Unix version.

For example, on my machine, 'man -k permission'
gives, among other things:

What I was looking for; the user command:
chmod (1) - change file access permissions

Manual section 2, system calls a programmer could use
to change or check permissions:
access (2) - check user's permissions for a file
chmod (2) - change permissions of a file
fchmod [chmod] (2) - change permissions of a file
ioperm (2) - set port input/output permissions

Section 8, administrative commands:
pam_console (8) - control permissions for users at the system console
pam_console_apply (8) - set or revoke permissions for users at the system console

and section 5, file formats:
console.perms [console] (5) - permissions control file for users at the system console

This isn't an ideal interface, but it can be useful.
Just testing it for this message, I found something
I didn't know about, ioperm(2).
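
The idea carries over to any doc set that has one-line
summaries. A rough Python sketch follows. It does plain
substring matching over a list held in memory, which is only
an approximation: real 'man -k' implementations typically
search a prebuilt database and may accept patterns. The
summaries are the ones quoted above.

```python
def apropos(summaries, keyword):
    """Return formatted summary lines whose name or description
    contains the keyword, ignoring case."""
    needle = keyword.lower()
    hits = []
    for name, section, description in summaries:
        if needle in name.lower() or needle in description.lower():
            hits.append(f"{name} ({section}) - {description}")
    return hits


summaries = [
    ("grep, fgrep, egrep", "1", "print lines matching a pattern"),
    ("chmod", "1", "change file access permissions"),
    ("access", "2", "check user's permissions for a file"),
    ("ioperm", "2", "set port input/output permissions"),
]

for line in apropos(summaries, "permission"):
    print(line)
```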

Pretty good for a 20-year-old design. I wonder how
it might be improved with modern tools?

Older versions of Unix went further with these one-liners,
generating a permuted index from them. Some more recent
versions have (stupidly, in my view!) dropped this.

A permuted index reformats lines like those for
'permissions' above, so you get something like:

     change file access permissions             chmod(1)
  set port input/output permissions             ioperm(2)
           check user's permissions for a file  access(2)
                 change permissions of a file   chmod(2)
                 change permissions of a file   fchmod(2)

This is done for every word in the line (except 'the', 'of'
and other grammatical words), so several of the above
lines also turn up under 'file':

  .. permissions for a file                     access(2)
   .. permissions of a file                     chmod(2)
   .. permissions of a file                     fchmod(2)
                change file access permissions  chmod(1)
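
For anyone who wants to experiment, here is a minimal Python
sketch of the technique. It is not the Unix implementation;
the stop-word list, the function names and the column width
are my own assumptions. Each summary is emitted once per
significant word, and the entries are sorted by that word.

```python
# Words to skip. This list is an assumption; extend it as needed.
STOP_WORDS = {"a", "an", "the", "of", "for", "in", "on", "to", "and", "or"}


def permuted_rows(summaries):
    """Return (keyword, reference, text_before, text_from_keyword) tuples,
    one per significant word, sorted by keyword and then by reference."""
    rows = []
    for ref, description in summaries:
        words = description.split()
        for i, word in enumerate(words):
            key = word.lower().strip(".,;:()\"'")
            if not key or key in STOP_WORDS:
                continue
            rows.append((key, ref, " ".join(words[:i]), " ".join(words[i:])))
    rows.sort(key=lambda row: (row[0], row[1]))
    return rows


def format_index(rows, width=30):
    """Line the keywords up in one column, truncating long left context."""
    lines = []
    for _key, ref, before, after in rows:
        left = before[-width:].rjust(width)
        lines.append(f"{left} {after.ljust(width)} {ref}")
    return lines


summaries = [
    ("chmod(1)", "change file access permissions"),
    ("access(2)", "check user's permissions for a file"),
]

for line in format_index(permuted_rows(summaries)):
    print(line)
```

Reading down the keyword column gives an alphabetical list of
every significant word, each shown in its original context.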

I wrote programs at one point to generate a permuted
index from HTML headers in a doc set. Sample output:
http://www.freeswan.org/freeswan_trees/freeswan-1.5/doc/perm.html.html

This wasn't great, but it was useful to some readers.
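
Those programs are not reproduced here. As a rough idea of the
first step only, this Python sketch pulls heading text out of
HTML with the standard html.parser module; the class name and
sample markup are invented. The resulting strings could then
be fed to any permuted-index generator in place of man page
summaries.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text of h1..h6 elements, in document order."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []
        self._current = None  # text pieces of the heading being read

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._current = []

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._current is not None:
            # Join the pieces and collapse runs of whitespace.
            text = " ".join("".join(self._current).split())
            if text:
                self.headings.append(text)
            self._current = None


def collect_headings(html):
    """Return the heading texts found in an HTML string."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings


# Invented sample markup.
sample = (
    "<h1>Configuring  the gateway</h1><p>Some text.</p>"
    "<h2>Testing the <em>connection</em></h2>"
)
print(collect_headings(sample))
```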

