Saturday, May 18, 2013

Proximity Search

First ... definition.


In text processing, a proximity search looks for documents where two or more separately matching term occurrences are within a specified distance, where distance is the number of intermediate words or characters. In addition to proximity, some implementations may also impose a constraint on the word order, in that the order in the searched text must be identical to the order of the search query. Proximity searching goes beyond the simple matching of words by adding the constraint of proximity and is generally regarded as a form of advanced search.
For example, a plain search for "red brick bed and breakfast" returns over 2.3 million hits ... and I'm no closer to finding what I really want.  By limiting the proximity, phrases like this can be matched while avoiding documents where the words are scattered across a page or spread over unrelated articles in an anthology.
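
To make the definition concrete, here is a minimal sketch of a proximity check in Python. It is only an illustration of the idea, not how any real search engine does it: the names tokenize and within_proximity are made up for this post, distance is counted in intermediate words, and the ordered flag stands in for the optional word-order constraint.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def within_proximity(text, first, second, max_gap, ordered=False):
    """Return True if `first` and `second` both occur in `text` with at most
    `max_gap` intermediate words between them.

    If `ordered` is True, `first` must also appear before `second`.
    (Hypothetical helper for illustration only.)
    """
    words = tokenize(text)
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]

    for i in first_positions:
        for j in second_positions:
            if i == j:
                continue  # same occurrence; only possible if both terms are equal
            gap = abs(j - i) - 1  # number of words strictly between the two terms
            if gap <= max_gap and (not ordered or i < j):
                return True
    return False


if __name__ == "__main__":
    print(within_proximity("Carole Ann Levitt", "carole", "levitt", 2))
    print(within_proximity("Levitt, Carole", "carole", "levitt", 2, ordered=True))
```

The nested loop is fine for a single page of text; a real index would store word positions per term instead of rescanning every document.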


Back in the day, AltaVista was THE search engine for the uber-geek ... and the main feature it had was a "proximity search" operator called ... NEAR ... and man does that help a lot.

Google has a very much unadvertised search operator ... AROUND(n) ... and it's a proximity detector (and I'm oscillating as I write this because it will make my search easily TEN TIMES more accurate).  AROUND has several features I need to document.  First, the (n) is how many words apart the two terms are allowed to be.  Also, if you put QUOTES around the whole search, you're telling Google that the first term has to come before the second term.

From "www.netforlawyers.com", the Google proximity connector AROUND(n) must be placed in upper case as illustrated in this explanation. By replacing the “n” with a number, you determine how many words you want your keywords to be from each other. For example, when we searched for:
     carole levitt
we got 844,000 hits.
When we search for:
     carole AROUND(2) levitt,
we retrieved 281,000 results where carole was within 2 words of levitt. While better, the results still included hits like these:
  • Carole A. Levitt
  • Carole Ann Levitt
  • Carole Levitt
  • Nancy Jo Levitt
  • Carol Suzanne Levitt (notice the name carol was retrieved)
  • Alain Levitt, Carol Lim
  • Joseph Gordon-Levitt inexplicably performing Carole King's classic “You Make ...
To conduct an even more targeted search, try enclosing your proximity search within quotation marks. Our search for
     “carole AROUND(2) levitt”
retrieved 22,600 results. This search only retrieved results where the exact name carole preceded the exact name levitt (with up to two words in between carole and levitt). It disregarded any documents that included various spellings of carole and levitt while the earlier search, without quotation marks, retrieved results with spelling variations for carole and levitt.
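
If you type these searches a lot, a small helper can assemble them. This is just a sketch under my own assumptions: around_query and search_url are names I made up, the https://www.google.com/search?q=... address is the usual search URL form rather than anything from the article above, and the code uses plain straight quotes. It only builds the text of the search; what Google does with it is as described above.

```python
from urllib.parse import urlencode


def around_query(first, second, n, exact_order=False):
    """Build a proximity search string such as: carole AROUND(2) levitt

    AROUND is written in upper case, as the article says it must be.
    With exact_order=True the whole search is wrapped in quotation marks.
    (Hypothetical helper for illustration only.)
    """
    query = f"{first} AROUND({n}) {second}"
    if exact_order:
        query = f'"{query}"'
    return query


def search_url(query):
    """Turn a search string into a URL (assumed URL form, see the note above)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    loose = around_query("carole", "levitt", 2)
    strict = around_query("carole", "levitt", 2, exact_order=True)
    print(loose)
    print(strict)
    print(search_url(loose))
```

Keeping the query builder separate from the URL builder means the same search string can be pasted straight into the search box too.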



