Stop words are a nuisance. A Misc posting. Revelation (OpenInsight 16-Bit Specific)
At 11 DEC 2002 05:41:53AM Oystein Reigem wrote:
Is it possible to have the forum databases without stop words? Alternatively - is there a workaround for the problems stop words cause?
Here's an example of the kind of problem stop words cause:
I know I need a Knowledge Base article that contains the text
"ENG0805 Error on Windows 2000 Professional Workstations Running OpenInsight"
So I enter this whole quoted string as a search criterion. But since "on" is a stop word the search fails.
It fails without telling me it's because "on" is a stop word.
It doesn't even tell me there's something wrong with my query.
So I might conclude the document isn't there. Not good.
As it is now, one must avoid using stop words in the search criteria, or else one gets the problem with missed hits exemplified above. Another kind of problem occurs when the stop word is crucial to making the criteria precise enough. E.g., "log on" is much more precise than just "log".
(Btw - I now think the failure I reported while beta testing the new Search was caused by stop words.)
I would seriously suggest stop words are removed and the databases re-indexed.
If that's not possible I suggest that possible workarounds are investigated.
- Like issuing a general or specific warning ("Search criteria contains stop word(s)", "'on' is a stop word")
- Like checking the Domino docs to see if a search on phrases containing stop words can be changed into a wildcard search, making the stop word of the criteria match any word (lower precision is usually better than missed hits).
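To illustrate the two workarounds above, here is a rough sketch in Python. It is only an illustration under assumptions: the stop word list is made up (the real list is not published), the function names are hypothetical, and the "*" placeholder stands in for whatever wildcard syntax the search engine actually supports, which would have to be checked in the Domino docs.

```python
import re

# Hypothetical stop word list; the forum's real list is not known.
STOP_WORDS = {"on", "the", "a", "of", "and"}

# A "word" here is a run of letters, digits, or apostrophes.
_WORD = re.compile(r"[A-Za-z0-9']+")


def find_stop_words(query):
    """Return the stop words in the query, in order, without duplicates."""
    found = []
    for word in _WORD.findall(query):
        token = word.lower()
        if token in STOP_WORDS and token not in found:
            found.append(token)
    return found


def warn_about_stop_words(query):
    """Return a warning string, or None if the query has no stop words."""
    found = find_stop_words(query)
    if not found:
        return None
    quoted = ", ".join(f"'{w}'" for w in found)
    return f"Search criteria contain stop word(s): {quoted}"


def replace_stop_words(query, placeholder="*"):
    """Swap each stop word for a placeholder meant to match any word.

    The "*" default is an assumption, not confirmed engine syntax.
    """
    def swap(match):
        word = match.group(0)
        return placeholder if word.lower() in STOP_WORDS else word

    return _WORD.sub(swap, query)


query = "ENG0805 Error on Windows 2000 Professional Workstations Running OpenInsight"
print(warn_about_stop_words(query))
print(replace_stop_words(query))
```

With the assumed list, the first call reports 'on' as a stop word, and the second puts the placeholder where "on" was while leaving the rest of the phrase untouched.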
At the very least, publish a list of stop words so we can avoid them in our searches!
Btw - why isn't there a Misc category in this section? I prefer to post this message in the common section so everybody can read it, not just Works members.
- Oystein -