Re: How to compile a list of unique words in a file?

Author: Jeremy C. Reed
Date:  
To: PLUG
Subject: Re: How to compile a list of unique words in a file?
On Wed, 11 Aug 2004, Josef Lowder wrote:

> On a previous *nix system, I had a little shell script named 'unique'
> (contents shown below the line) that would compile a list of all the
> unique words in a named text file. But when I tried to use it on my
> current Linux system today, I discovered that I do not have 'deroff'
> which is required for 'unique' to work.
>
> Is there some other way to compile such a list of unique words;
> or is 'deroff' something that could be downloaded from somewhere?
> I googled for it but couldn't find anything that I could download
> among the 2,000+ hits I got.


deroff is used to remove ROFF codes (nroff/troff, eqn, pic and tbl
constructs) from files. For example, it can remove man page nroff codes.

I don't think you need it for looking at normal text files.

If you do want it, BSD-licensed code for it is at
http://www.openbsd.org/cgi-bin/cvsweb/src/usr.bin/deroff/

> Also, even more to the point ... is there some kind of Linux utility
> that will compile an index of words and their corresponding page
> references within a given text file?


Your example below doesn't seem to include "corresponding page
references". Do you want the page numbers for the words? Off the top
of my head I don't know of any such utility, but I am sure some are
available for use with ROFF, LaTeX or DocBook files for generating
keyword indexes.

> : unique -- to find and list all the unique words in a text file
> # syntax: unique filename (or pipe to a new file)
>
> deroff -w $1 | sort -uf > word.list


Maybe try:

fmt -w 1 your-file | tr -d '[:blank:]' | sort -uf | less
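
Another option that skips fmt entirely is to let tr do the word
splitting. This is only a rough sketch: the function name unique_words
is made up for illustration, and it assumes a "word" is any run of
letters, so "don't" comes out as "don" and "t" and digits are dropped.

```shell
# unique_words: hypothetical helper name, not part of the original script.
# Assumes a word is any run of alphabetic characters.
unique_words() {
    # Replace every run of non-letters with a single newline (one word
    # per line), drop any empty line, then sort ignoring case and keep
    # only one copy of each word.
    tr -cs '[:alpha:]' '\n' < "$1" | sed '/^$/d' | sort -uf
}
```

Something like "unique_words your-file > word.list" should then give
a list similar to what the old script produced.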



Jeremy C. Reed

                 BSD News, BSD tutorials, BSD links
                http://www.bsdnewsletter.com/


