Mailing List Archive

Mailing List: techdiver

Date: Thu, 3 Oct 1996 19:47:10 -0400
From: rnf@sp*.tb*.co* (Rick Fincher)
To: techdiver@terra.net
Subject: Filtering messages

Hi All,

Several people have posted lately to point out that George's messages often
contain pearls of wisdom gleaned from hard-won experience, but just as often
contain diatribes filled with scatological references and little or no
information that furthers their knowledge of diving.

If you are like me, you have limited time to read email and would like an
automated way of separating the wheat from the chaff, so that you don't spend
a lot of time manually sorting messages by reading through them all.

A simple kill file will prevent all of George's mail from getting through, but 
that also eliminates all the useful info.

I pondered the problem for a while, analyzed some of George's messages, and
came up with an algorithm to do the job automatically.

The general idea is to scan each message for key phrases. These phrases fall
into four basic categories:

Denigration
Homophobic
Hot Button
General

The Hot Button phrases get the most points because they are most likely to get 
George's blood pressure up and set off a tirade.

Create a file on your system for each category and put the known phrases into 
the proper file. So far I have:

DENIGRATION:
bone smuggler
stroke
suck my dick
SMD
shut the fuck up
stfu
dumb fuck

HOMOPHOBIC:
anus lapper
transformer

HOT BUTTON:
butt mount light
Genesis tanks
square battery case
prismatic battery case
Dacor Scooter
magnetic switch
breathe the short hose

GENERAL:
sushi
Lucys
knee pads
pet the pony
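
For anyone who would rather not use shell tools, loading the category files
might look something like the Python sketch below. The file names and the
load_phrases helper are my own assumptions for illustration (one phrase per
line, blank lines ignored); nothing about them is fixed by the scheme.

```python
import tempfile
from pathlib import Path

# Hypothetical file names, one per category; pick whatever suits your system.
CATEGORY_FILES = {
    "denigration": "denigration.txt",
    "homophobic": "homophobic.txt",
    "hot_button": "hot_button.txt",
    "general": "general.txt",
}

def load_phrases(directory):
    """Return {category: [phrases]}, lowercased, skipping blank lines."""
    phrases = {}
    for category, filename in CATEGORY_FILES.items():
        path = Path(directory) / filename
        if not path.exists():
            # A missing file just means no phrases for that category yet.
            phrases[category] = []
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        phrases[category] = [ln.strip().lower() for ln in lines if ln.strip()]
    return phrases

# Small demo using two of the files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "hot_button.txt").write_text(
        "Genesis tanks\nmagnetic switch\n\n", encoding="utf-8")
    Path(tmp, "general.txt").write_text("sushi\nknee pads\n", encoding="utf-8")
    demo = load_phrases(tmp)
```

Lowercasing at load time keeps the later matching case-insensitive without
any extra work.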

Your filter program simply scans the mail message for the phrases in each
category and assigns a point value to each one it finds. If the points racked
up by a message go over a preset value, the message is put in a hold file for
later review. Sentences with more than one phrase in them get extra points.
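
The scoring step described above could be sketched roughly as follows. The
point values, the bonus, and the threshold are assumptions picked for the
example (the only firm rule is that Hot Button scores highest), and the
sentence splitting is deliberately crude.

```python
import re

# Assumed point values; only the ordering (Hot Button highest) is intended.
POINTS = {"hot_button": 5, "denigration": 3, "homophobic": 3, "general": 1}
BONUS_PER_EXTRA_PHRASE = 2   # assumed bonus when a sentence has several phrases
HOLD_THRESHOLD = 8           # assumed cutoff for the hold file

def score_message(text, phrases):
    """Sum points per phrase occurrence, plus a bonus for multi-phrase sentences."""
    # Collapse line breaks so a phrase wrapped across two lines still matches.
    normalized = " ".join(text.split()).lower()
    total = 0
    # Crude sentence split: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", normalized):
        hits = 0
        for category, plist in phrases.items():
            for phrase in plist:
                count = sentence.count(phrase)
                hits += count
                total += count * POINTS[category]
        if hits > 1:
            total += (hits - 1) * BONUS_PER_EXTRA_PHRASE
    return total

def should_hold(text, phrases):
    """True when the score goes over the preset value."""
    return score_message(text, phrases) > HOLD_THRESHOLD

sample_phrases = {
    "hot_button": ["genesis tanks", "magnetic switch"],
    "general": ["sushi"],
}
msg = "My Genesis tanks came with a magnetic switch. We had sushi after the dive."
print(score_message(msg, sample_phrases), should_hold(msg, sample_phrases))
```

With these assumed values the first sentence scores 5 + 5 plus a bonus of 2,
and the second scores 1, so the sample message lands in the hold file.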

As George comes up with new phrases or as you tire of reading the same old 
flames every time some new guy comes on the list and posts a message like:

"I'm having some problems butt mounting my light and was wondering if..."

add the appropriate phrase to the proper file and you won't be bothered with it
any more.

At first I broke each phrase down into nouns and adjectives, but after further
thought I decided that it didn't really matter whether you were an "anus
lapping bone smuggler" or a "bone smuggling anus lapper". If the root form of
each phrase is stored in the files your pattern matching software uses, you
can catch either form. For example, "anus lapp" will match either the noun
phrase "anus lapper" or the adjective phrase "anus lapping".
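
As a rough illustration of root-form matching, here is one way it might be
done with a regular expression. The root_pattern helper is hypothetical, and
the example uses a root taken from the Hot Button list. It also tolerates a
line break between the words, which matters for wrapped email.

```python
import re

def root_pattern(root):
    """Compile a case-insensitive pattern for a root plus any word ending."""
    words = root.split()
    # Allow any whitespace (including a line break) between the words, and
    # any trailing letters after the last word ("mount" -> "mounting").
    body = r"\s+".join(re.escape(w) for w in words)
    return re.compile(r"\b" + body + r"\w*", re.IGNORECASE)

pattern = root_pattern("butt mount")

for text in ("problems butt mounting my light",
             "a confirmed butt mounter",
             "Butt\nmount light"):
    match = pattern.search(text)
    print(repr(match.group(0)) if match else None)
```

The leading word boundary keeps the root from matching in the middle of some
unrelated longer word.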

One problem I had was with phrases having double meanings. Some examples are:

butt mounter
weenie hoover
short hose

A "butt mounter" could go in the Hot Button file as one who butt mounts lights, 
or it could go in the Homophobic file as one who mounts butts.

Similarly a "weenie hoover" could be simple Denigration as one who sucks a lot 
of air on weenie dives, or in the Homophobic file as one who sucks weenies.

Also a "short hose" could be someone who either breathes the short hose or 
someone with a short penis.

It is difficult for an automated program to determine which meaning these 
phrases have, but in these three cases it didn't really matter because either 
definition was equally bad. Just be sure to put each phrase in only one
category so it won't be double-counted.
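
A quick sanity check for phrases that have crept into more than one category
file might look like this. The find_duplicates helper and the sample data are
made up for the example; it assumes the phrases have already been read into a
dictionary keyed by category.

```python
def find_duplicates(phrases):
    """Return {phrase: [categories]} for phrases listed in more than one category."""
    seen = {}
    for category, plist in phrases.items():
        for phrase in plist:
            key = phrase.strip().lower()   # compare case-insensitively
            categories = seen.setdefault(key, [])
            if category not in categories:
                categories.append(category)
    return {p: cats for p, cats in seen.items() if len(cats) > 1}

example = {
    "hot_button": ["short hose", "magnetic switch"],
    "denigration": ["Short Hose"],
    "general": ["sushi"],
}
dupes = find_duplicates(example)
print(dupes)
```

Anything this reports should be deleted from all but one file before the
filter is run.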

Since I'm in a UNIX environment, I used a simple shell script with awk and sed
to do the pattern matching, but you could use your favorite programming
language or database to do the job, or even macros in a word processor using
its "find" command.

I'm able to read through techdiver mail much more quickly now, and if it
appears I have missed a message of substance I can retrieve it from the hold
file for review.
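
For completeness, the hold-file step could be as simple as appending each
message to one of two files depending on its score. The file names, the
file_message helper, and the scores in the demo are all assumptions; a real
setup would more likely hook into whatever mail delivery agent you already
use.

```python
import tempfile
from pathlib import Path

def file_message(text, score, threshold, inbox_path, hold_path):
    """Append the message to the hold file if its score is over the threshold."""
    target = hold_path if score > threshold else inbox_path
    with open(target, "a", encoding="utf-8") as fh:
        # A blank line separates messages so the file stays readable.
        fh.write(text.rstrip("\n") + "\n\n")
    return target

# Demo in a temporary directory with made-up scores.
with tempfile.TemporaryDirectory() as tmp:
    inbox = Path(tmp, "inbox.txt")
    hold = Path(tmp, "hold.txt")
    first = file_message("Long rant here", 13, 8, inbox, hold)
    second = file_message("Gas planning question", 0, 8, inbox, hold)
    held_text = hold.read_text(encoding="utf-8")
    inbox_text = inbox.read_text(encoding="utf-8")
    first_name, second_name = first.name, second.name
```

Because held messages are only appended and never deleted, anything
misclassified can still be recovered later.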

By tuning the point values you set for each phrase, you can filter out about
90% of the messages that annoy you while rarely missing a good one.

If I missed any good phrases please post them to the list.

Please direct flames and hate mail to /dev/null.

Hope this helps.

Rick
