Re: How to do indexng for pdf files ?

Subject: Re: How to do indexng for pdf files ?
From: "William Turner" <whturner -at- earthlink -dot- net>
To: "TECHWR-L" <techwr-l -at- lists -dot- raycomm -dot- com>
Date: Tue, 1 Apr 2003 10:36:5 -0800


As I only subscribe to the TECHWR-L digest, I am a little slow in joining
these discussions. I noticed that Elizabeth Fullerton wrote yesterday
about having seen Acrobat Catalog used to create an index of PDFs.
However, her instruction "Make one PDF document a TOC, and then all the
files that are linked to that TOC can be catalogued" might be misleading,
so I will elaborate.

While a PDF TOC (table of contents in PDF format) is both a good way to
introduce a collection of information and a good way to link to it, a TOC
has nothing to do with using Acrobat Catalog (a facility included in every
release of the paid version of Acrobat/Acrobat Exchange).

To index a PDF or set of PDFs in Acrobat 5.0, select Tools > Catalog.
Then use the resulting dialog (which is confusing) to identify the
target PDFs and to set your preferences for the level of indexing detail
(e.g., case sensitive, "sounds like", exclude particular words). When you
click Build, Acrobat Catalog creates an index that you can query only
through the Advanced Search function of Acrobat or Acrobat Reader. The
downside of that method is that you can't browse the index; the upside
is that you can select multiple indexes to search at the same time. That
serves your purpose of creating a collection of thesis abstracts, because
you have the option of providing separate indexes of abstracts and of
theses. You can also choose where to make them available: both from the
same place (on CD or on a network drive) or from separate places. You
could set up each abstract to include a formal metadata record - I suggest
considering the Resource Description Framework (RDF) in its XML syntax -
so that users could structure their queries more precisely.
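
To make the metadata suggestion concrete, here is a minimal sketch in Python of what one RDF/XML record for a thesis abstract might look like. The choice of Dublin Core elements (title, creator, date, subject), the function name, the file path, and all sample values are my assumptions for illustration; they are not something Acrobat Catalog requires, and how you attach such a record to a PDF is a separate question.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and Dublin Core (element set 1.1).
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_abstract_record(about, title, creator, date, subject):
    """Return an RDF/XML string describing one abstract (hypothetical helper)."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about}
    )
    fields = (
        ("title", title),
        ("creator", creator),
        ("date", date),
        ("subject", subject),
    )
    for name, value in fields:
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Sample values are made up for illustration only.
record = build_abstract_record(
    about="abstracts/example-thesis.pdf",
    title="An Example Thesis Title",
    creator="A. Student",
    date="2003",
    subject="information retrieval",
)
print(record)
```

With consistent fields like these in every abstract, a user could narrow a query to, say, a subject or an author rather than searching the full text alone.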

The Advanced Search interface of Acrobat is almost as confusing.
Basically, you click the binoculars button to open a dialog for selecting
indexes to search and for entering your search criteria. The results
display in a box labeled Search Results, which provides a hyperlinked list
of documents related to your search criteria.
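
For anyone who wants a feel for what the indexing options and a multi-index search amount to, here is a toy sketch in Python. It is only a conceptual illustration, not how Acrobat Catalog works internally: the function names, the stop-word list, and the sample documents are all made up, and the "index" is just a dictionary mapping words to document names.

```python
import re

# Hypothetical list of words to exclude from the index.
STOP_WORDS = frozenset({"a", "an", "and", "of", "the", "to"})


def build_index(documents, case_sensitive=False, stop_words=STOP_WORDS):
    """Map each indexed word to the set of document names containing it."""
    index = {}
    for name, text in documents.items():
        for word in re.findall(r"[A-Za-z0-9]+", text):
            if word.lower() in stop_words:
                continue  # excluded word: never indexed
            key = word if case_sensitive else word.lower()
            index.setdefault(key, set()).add(name)
    return index


def search(indexes, term, case_sensitive=False):
    """Look up one term in several indexes and return a sorted list of hits.

    case_sensitive must match the setting the indexes were built with.
    """
    key = term if case_sensitive else term.lower()
    hits = set()
    for index in indexes:
        hits |= index.get(key, set())
    return sorted(hits)


# Two separate indexes, e.g. one for abstracts and one for theses.
abstracts = build_index({"abstract-01.pdf": "Indexing of the PDF collections"})
theses = build_index({
    "thesis-01.pdf": "A study of indexing and retrieval",
    "thesis-02.pdf": "Typography and page layout",
})

print(search([abstracts, theses], "Indexing"))
# prints ['abstract-01.pdf', 'thesis-01.pdf']
```

The point of keeping the two indexes separate is the same as on the CD: the user decides at search time which indexes to include.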

As Elizabeth described, I have created PDF TOCs linked to customer
documentation and used the TOC as the intro file on a doc CD. The CD
also carried an Acrobat Catalog index of its documents, along with
instructions on using the index. I have also used Acrobat Catalog to
index TECHWR-L digests for my offline use. And, using Acrobat's Web
Capture tool, I have converted websites into PDFs and then indexed them
(although I don't do that much now that Google has become so good).

--Will Turner
whturner -at- rocketmail -dot- com
