Natural Language Search

Powerset (http://www.powerset.com) looks interesting. At present the company is in “semi-stealth” mode, gathering investment as well as developing its natural language search engine.

Powerset’s Barney Pell has some interesting stuff to say on the topic.

http://www.barneypell.com/archives/2006/10/powerset_and_na.html

It’s well worth a read, with some interesting ideas about the way search engines work nowadays.

Search engines are keyword based, and at heart they are really just running boolean searches against their index. They take your search term and strip out what are known as stopwords, leaving just the keywords. Stopwords are words such as a, about, from, of, for and the like. Because they are such common words, they would only complicate the results of a boolean search.
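To make that a bit more concrete, here is a rough Python sketch of the idea: strip the stopwords, build a tiny index, and run a boolean AND search against it. The stopword list, the sample documents and the function names are all made up for illustration; real search engines are far more sophisticated than this.

```python
# Toy stopword list, assumed for illustration only.
STOPWORDS = {"a", "about", "by", "for", "from", "of", "the"}


def keywords(text):
    """Lowercase the text, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def build_index(docs):
    """Map each keyword to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in keywords(text):
            index.setdefault(word, set()).add(doc_id)
    return index


def boolean_and_search(index, query):
    """Return the ids of documents that contain every keyword in the query."""
    terms = keywords(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical documents, invented for this example.
DOCS = {
    1: "a history of search engines",
    2: "search tips for beginners",
    3: "the history of the web",
}
INDEX = build_index(DOCS)
```

Calling `boolean_and_search(INDEX, "history of search")` drops "of" and then looks for documents containing both "history" and "search". A query made up entirely of stopwords has no keywords left, so this sketch simply returns nothing for it.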

Pell and gang demonstrate that in some searches these words are actually useful. For example, take these three search terms:

  • Books for children
  • Books about children
  • Books by children

When for, about and by are all stripped out we are left with only Books children, and the search engine cannot distinguish between the three quite different purposes of the queries.
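The collapse of those three queries is easy to show in a few lines of Python. This is only a sketch: the stopword list and the `strip_stopwords` helper are my own assumptions, not anything taken from Powerset or a real search engine.

```python
# Toy stopword list, assumed for illustration only.
STOPWORDS = {"a", "about", "by", "for", "from", "of"}


def strip_stopwords(query):
    """Lowercase the query and remove any stopwords, keeping word order."""
    return " ".join(w for w in query.lower().split() if w not in STOPWORDS)


QUERIES = ["Books for children", "Books about children", "Books by children"]

# Map each original query to what is left after stopword removal.
STRIPPED = {q: strip_stopwords(q) for q in QUERIES}

# Collect the distinct keyword strings that remain.
DISTINCT = set(STRIPPED.values())
```

After stripping, `DISTINCT` holds a single keyword string, so a purely keyword-based engine has no way left to tell the three intentions apart.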

Pell says that we are all searching with an impoverished pidgin English at present. I for one would welcome a more natural approach at times. I’m sure that, like me, many of you have come across a particular search that never seems to get the results you’re after, or that at best takes a long time to find the right string of keywords and advanced search options. Imagine what searching is like for the less technically minded out there who don’t speak keywordese. NL searching, if it works and is marketed well to that larger group of people, could be very successful.

If, though, it doesn’t have a toolbar-esque plugin when it launches, I will find it very difficult to remember to use it. When I want something, my mouse cursor always goes straight for my Google toolbar.
