RE: Searching Text Files

Subject: RE: Searching Text Files
From: "Inbar, Paul" <paul -dot- inbar -at- intel -dot- com>
To: "Ed" <glassnet -at- gmail -dot- com>, "techwr-l List" <techwr-l -at- lists -dot- techwr-l -dot- com>
Date: Tue, 23 Sep 2008 14:10:55 +0300

Hi Ed,

Several things to pay attention to/watch out for:

Does the text between the tags consist exclusively of alphanumerical
characters?

Are the open and close tags always on the same line? (I believe grep is
line-based, so you have to be sure that each item you want to match
always occurs within a single line. In Perl you can do pattern matching
across lines.)

Can there be more than one set of tags on each line?

Regarding getting rid of all the text except what is between the tags,
in Perl you can "grab" the match and then just print that.
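
If Perl isn't handy, roughly the same "grab the match and print it" idea
can be sketched in the shell. This swaps Perl for grep -o plus sed, and it
assumes a grep that supports -o (GNU and BSD grep do; it is not in POSIX).
The sample file and its contents are made up for illustration.

```shell
# Hypothetical sample input; the second line carries two tag pairs.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'The <acroterm>XML</acroterm> format is common.' \
  'Use <acroterm>API</acroterm> or <acroterm>SDK</acroterm> here.' \
  > "$tmpdir/sample.xml"

# grep -o prints each match on its own line, so several tag pairs on
# one input line still come out separately; sed then strips the tags.
extracted=$(grep -o '<acroterm>[A-Za-z0-9]*</acroterm>' "$tmpdir/sample.xml" |
  sed -e 's|<acroterm>||' -e 's|</acroterm>||')

printf '%s\n' "$extracted"
rm -rf "$tmpdir"
```

Like plain grep, this only finds tag pairs that open and close on the same
line.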

Regarding sorting and getting rid of duplicates, don't forget the
utilities "sort" and "uniq".
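
A small sketch of that last step, using a made-up list of terms. Note that
uniq only collapses adjacent duplicate lines, which is why sort has to run
first.

```shell
# Hypothetical list of extracted terms, one per line, with duplicates.
terms=$(printf '%s\n' SDK API XML API SDK)

# Sort first so duplicates become adjacent, then let uniq drop them.
unique_terms=$(printf '%s\n' "$terms" | sort | uniq)

printf '%s\n' "$unique_terms"
```

"sort -u" should give the same result in a single command.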

Good luck!
Paul

-----Original Message-----
From: techwr-l-bounces+paul -dot- inbar=intel -dot- com -at- lists -dot- techwr-l -dot- com
[mailto:techwr-l-bounces+paul -dot- inbar=intel -dot- com -at- lists -dot- techwr-l -dot- com] On
Behalf Of Ed
Sent: Tuesday, September 23, 2008 1:32 PM
To: techwr-l List
Subject: Re: Searching Text Files

The combination of find and grep looks most promising. I started with
Dossy's snippet and modified to:

find . -name '*.xml' \
  -exec grep '<acroterm>[A-Za-z0-9]*</acroterm>' {} \; > test1.txt

That produces a text file with all lines that contain acroterm tags.
The next step will require deleting all of the text on each line
EXCEPT what is between those two tags.

Sorting the acronyms and deleting duplicates is desirable too.

BTW, Powergrep is useful, but I used Astrogrep to test the regex on my
Windows box.

Thanks.
ED
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ComponentOne Doc-To-Help gives you everything you need to author and
publish quality Help, Web, and print content. Perfect for technical
authors, developers, and policy writers. Download a FREE trial.
http://www.componentone.com/DocToHelp/

True single source, conditional content, PDF export, modular help.
Help & Manual is the most powerful authoring tool for technical
documentation. Boost your productivity! http://www.helpandmanual.com

---
You are currently subscribed to TECHWR-L as paul -dot- inbar -at- intel -dot- com -dot-

To unsubscribe send a blank email to
techwr-l-unsubscribe -at- lists -dot- techwr-l -dot- com
or visit
http://lists.techwr-l.com/mailman/options/techwr-l/paul.inbar%40intel.com


To subscribe, send a blank email to techwr-l-join -at- lists -dot- techwr-l -dot- com

Send administrative questions to admin -at- techwr-l -dot- com -dot- Visit
http://www.techwr-l.com/ for more resources and info.
