Features – Search Engines Comparison 2001

Diana Botluk is a lawyering skills instructor at the Catholic University of America School of Law in Washington, D.C., and is the author of The Legal List: Research on the Internet. She teaches legal research at CAPCON, Catholic University Law School, and the University of Maryland.


At first glance, using a general search engine to locate information on the web seems easy. But getting a search engine to work with precision is another story. General search engines come packed with features that are often underutilized, but can be helpful in increasing search precision. The features differ from engine to engine, and skilled researchers will adjust their search strategy to take advantage of these differences depending on the type of results sought. This article will explain the differences in some of the available features, then examine a few major search engines in light of these features.

Searching Features

Alternative/Inclusive Default

When you type two words into a search engine box without any connectors, how does the engine put them together? Will it find only those pages where both words appear, or will it find pages where either word appears? Search engines with an inclusive default treat two separately typed words as if there were an AND between the words, while search engines with an alternative default treat the same two words as if there were an OR between the words. Thus, the results for the same search typed into two different search engines can be enormously different because one is inclusive, and the other alternative.

Inclusive Default Search Engines

Google, HotBot, Lycos

Alternative Default Search Engines

AltaVista, Excite

Many search engines allow a researcher to designate alternative or inclusive through the use of the connectors OR and AND. Inclusion can also be designated using a plus sign as a word modifier:

apple OR blueberry

apple AND blueberry

+apple +blueberry
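To see how much the default matters, here is a minimal sketch in Python. It assumes a toy collection of three made-up pages and a hypothetical search function; no real engine works this simply, but the difference between the two defaults comes through.

```python
# Toy page collection; the page names and text are made up for illustration.
PAGES = {
    "page1": "apple pie recipe",
    "page2": "blueberry muffin recipe",
    "page3": "apple and blueberry tart",
}

def search(query, default="AND"):
    """Return names of pages matching the query words under the given default."""
    words = query.lower().split()
    # Inclusive default: every word must appear. Alternative default: any word may.
    combine = all if default == "AND" else any
    return sorted(
        name for name, text in PAGES.items()
        if combine(word in text.lower().split() for word in words)
    )

inclusive = search("apple blueberry", default="AND")    # pages with both words
alternative = search("apple blueberry", default="OR")   # pages with either word
```

With the same two words typed in, the inclusive default finds only page3, while the alternative default finds all three pages.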

Keyword/Concept Default

Some search engines use automatic concept searching as a default. Concept searching occurs when the engine searches not only for the exact character string, but also for word forms, and even synonyms and other words that statistically appear with the typed word. Many advanced online researchers are accustomed to keyword searching, where only the exact string of characters typed in is searched. Thus, an advanced researcher who unwittingly uses a search engine with a concept searching default can become frustrated.

Keyword Search Default Search Engines

AltaVista, Google, HotBot, Lycos

Concept Search Default Search Engines

AltaVista (for some searches), Excite

Exclusion

Most search engines allow exclusion of search results that contain certain terms. Many engines support this through a minus sign or the word NOT placed in front of the term to be excluded. This feature should be used sparingly to avoid eliminating relevant results that merely mention the excluded term in passing. Note that a minus sign modifies a single word, while NOT is a connector between words:

pie -apple

pie NOT apple
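The risk of over-excluding can be shown with a small Python sketch. The three pages and the search_excluding function are hypothetical, invented only to illustrate the point.

```python
# Made-up pages for illustration only.
PAGES = {
    "a": "classic cherry pie",
    "b": "apple pie with cinnamon",
    "c": "cherry pie, a lighter alternative to apple desserts",
}

def search_excluding(include, exclude):
    """Return pages that contain `include` but not `exclude` as whole words."""
    results = []
    for name, text in PAGES.items():
        words = text.lower().replace(",", " ").split()
        if include in words and exclude not in words:
            results.append(name)
    return results

hits = search_excluding("pie", "apple")
```

Page "c" is about cherry pie and would probably interest the researcher, but it is dropped along with page "b" because it mentions apple once in passing.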

Truncation

When using keyword, or exact match, searching, it can be helpful to command the search engine to locate pages where there are various forms of the word being sought. Typing the root of a word and adding a truncation symbol on the end can accomplish this. Most search engines recognize an asterisk as a truncation symbol. For example, if I wanted to find pages with various forms of the word independence, I would type independen* and the results would include pages that contain independence, independent, and independently.
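One way to picture truncation is as a pattern match on the root of the word. The sketch below uses Python's standard re module; the function name truncation_to_regex is my own, and the approach is an assumption about how such a feature could work rather than a description of any particular engine.

```python
import re

def truncation_to_regex(term):
    """Turn a term with a trailing * into a whole-word regular expression."""
    if term.endswith("*"):
        # Root followed by any number of further word characters.
        pattern = r"\b" + re.escape(term[:-1]) + r"\w*"
    else:
        # No truncation symbol: match the exact word only.
        pattern = r"\b" + re.escape(term) + r"\b"
    return re.compile(pattern, re.IGNORECASE)

text = "An independent nation declared its independence, acting independently."
matches = truncation_to_regex("independen*").findall(text)
exact = truncation_to_regex("independence").findall(text)
```

Here the truncated term picks up independent, independence, and independently, while the untruncated term finds only independence.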

Search Restrictors

Search restrictors in web search engines are similar to search fields in Westlaw. They allow a search for terms or values contained only in certain portions of a page, rather than anywhere in the entire page. A simple example is a search restricted to a type of domain, like .com or .edu. If a domain restriction is used, the search engine seeks results only where the url matches the designated domain type. Search restrictions are accomplished in different ways on different search engines, usually showing up in an engine’s advanced searching option. Serious researchers have long applauded HotBot’s search form, which makes restricted searching easy.

Title restrictions are often available. Use these with caution, perhaps as a first step to see what pops up. A title restriction reflects the title of the web page, designated by the web author. It may not necessarily correspond to the title of the document appearing on the page. For example, I might be looking for a copy of the Declaration of Independence. That document may appear on a web page entitled Historic Documents by the web author. If I restrict my search for “declaration of independence” to the title portion of pages, I will miss this page because it is actually called Historic Documents.
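The effect of domain and title restrictions can be sketched in Python. The two index entries, their example.edu and example.com addresses, and the restricted_search function are all hypothetical; real engines expose restrictors through their own forms and syntax.

```python
from urllib.parse import urlparse

# Hypothetical index entries: (url, page title, page text).
PAGES = [
    ("http://www.example.edu/docs.html", "Historic Documents",
     "Full text of the Declaration of Independence"),
    ("http://www.example.com/decl.html", "Declaration of Independence",
     "Buy a poster of the Declaration of Independence"),
]

def restricted_search(phrase, field="text", domain=None):
    """Search one field of each page, optionally limited to a domain type."""
    phrase = phrase.lower()
    hits = []
    for url, title, text in PAGES:
        host = urlparse(url).hostname or ""
        if domain and not host.endswith(domain):
            continue  # domain restrictor: skip pages outside the domain type
        searched = title if field == "title" else text
        if phrase in searched.lower():
            hits.append(url)
    return hits

anywhere = restricted_search("declaration of independence")
title_only = restricted_search("declaration of independence", field="title")
edu_only = restricted_search("declaration of independence", domain=".edu")
```

The title restriction misses the Historic Documents page even though it holds the full text, which is exactly the pitfall described above.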

Date Searching

Searches can often be restricted by date. Additionally, dates often appear on the list of search results. However, like page titles, page dates can be somewhat misleading. The dates that are searched or reflected in results lists are the dates of the web page, and not necessarily the date of the document on the page. A search with a date restriction of July 4, 1776, will yield no results since no web pages were created or changed on that date. Thus, if I am searching for the Declaration of Independence, it won’t help me to try and place a date restriction in my search query. However, date restrictions can be useful to locate newly created or recently updated web pages, weeding out older results.
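A short Python sketch makes the same point. The page records and the modified_between function are hypothetical, and the dates are invented; the only assumption is that the engine compares the date of the web page, not the date of the document it contains.

```python
from datetime import date

# Hypothetical page records with the date each web page was last changed.
PAGES = [
    {"title": "Historic Documents", "modified": date(1999, 3, 15)},
    {"title": "Declaration of Independence (new transcription)",
     "modified": date(2001, 1, 8)},
]

def modified_between(pages, start, end):
    """Return titles of pages last changed within the date range, inclusive."""
    return [p["title"] for p in pages if start <= p["modified"] <= end]

on_1776 = modified_between(PAGES, date(1776, 7, 4), date(1776, 7, 4))
recent = modified_between(PAGES, date(2001, 1, 1), date(2001, 12, 31))
```

The 1776 restriction returns nothing, while a restriction to 2001 surfaces the recently updated page.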

Phrase Searching

Most search engines recognize quotation marks around two or more terms as the designation of a phrase. Additionally, this can sometimes be accomplished by placing the Boolean connector ADJ between the terms. Thus, “apple pie” or apple ADJ pie will search for the phrase apple pie, and not search the two terms separately.
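The difference between a phrase search and a search for the same terms separately can be sketched in Python. The helper names below are my own, and the word-by-word matching is a simplifying assumption.

```python
def tokens(text):
    """Split text into lower-case words."""
    return text.lower().split()

def contains_phrase(text, phrase):
    """True only if the phrase's words appear together, in order."""
    words, target = tokens(text), tokens(phrase)
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

def contains_all_terms(text, phrase):
    """True if every word appears somewhere on the page, in any order."""
    words = tokens(text)
    return all(term in words for term in tokens(phrase))

page = "this pie uses one apple and two pears"
separate_terms = contains_all_terms(page, "apple pie")   # True
as_phrase = contains_phrase(page, "apple pie")           # False
```

The page contains both words, so a search on the separate terms finds it, but the phrase search correctly passes it over.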

Nesting

Many search engines support the use of parentheses to nest various parts of a search query. For example, a search for apple or blueberry pie can be accomplished by nesting:

(apple OR blueberry) ADJ pie

It can also be accomplished by searching two alternative phrases:

“apple pie” OR “blueberry pie”
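The equivalence of the two forms can be shown by expanding the nested alternatives mechanically. This Python sketch uses a hypothetical expand_nested function and plain typed quotation marks.

```python
def expand_nested(alternatives, following_word):
    """Expand (a OR b) ADJ word into a list of quoted phrases."""
    return [f'"{alt} {following_word}"' for alt in alternatives]

# (apple OR blueberry) ADJ pie, written out as alternative phrases
query = " OR ".join(expand_nested(["apple", "blueberry"], "pie"))
```

The resulting query string is "apple pie" OR "blueberry pie", which is useful on engines that support phrases but not nesting.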

Search Levels

It is often useful to perform a multi-level search, first casting a wide net, then narrowing by searching only within that set of results. This feature is offered by AltaVista, Google, HotBot and Lycos.
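In outline, a multi-level search simply feeds the first set of results back in as the collection to be searched. The Python sketch below assumes three made-up pages and a hypothetical search_within function.

```python
# Made-up pages for illustration only.
PAGES = {
    "page1": "declaration of independence full text",
    "page2": "texas declaration of independence",
    "page3": "universal declaration of human rights",
}

def search_within(pages, term):
    """Return the subset of pages whose text contains the term as a word."""
    return {name: text for name, text in pages.items() if term in text.split()}

first_level = search_within(PAGES, "declaration")          # wide net
second_level = search_within(first_level, "independence")  # narrowed set
```

The first pass returns all three pages; the second pass, run only against those results, narrows the set to page1 and page2.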

Results Features

When comparing search engines, search language is only half the story. Search results are also important. Search engines use various mathematical formulas to match terms from the search query to web pages containing those terms. These formulas weigh several factors to present lists of results ranked by relevancy, or at least by relevancy according to the formulas used. Some of the factors that go into the determination of relevancy are how close together the terms appear, how many times they appear on the page, how close to the top of the page they are, and how uncommon they are.
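To make these factors concrete, here is a toy scoring function in Python. It is only a sketch: the relevance function, its weights, and the document counts are all my own assumptions, and no search engine publishes a formula this simple.

```python
import math

def relevance(page_words, query_terms, doc_freq, total_docs):
    """Toy score combining frequency, position, rarity, and proximity."""
    score = 0.0
    first_positions = []
    for term in query_terms:
        positions = [i for i, w in enumerate(page_words) if w == term]
        if not positions:
            continue
        frequency = len(positions)                  # how many times it appears
        earliness = 1.0 / (1 + positions[0])        # nearer the top scores higher
        rarity = math.log(total_docs / doc_freq.get(term, 1))  # uncommon terms weigh more
        score += frequency * rarity + earliness
        first_positions.append(positions[0])
    if len(first_positions) > 1:
        spread = max(first_positions) - min(first_positions)
        score += 1.0 / spread if spread else 1.0    # terms close together score higher
    return score

query = ["declaration", "independence"]
doc_freq = {"declaration": 10, "independence": 20}  # hypothetical counts
page_a = "declaration of independence full text".split()
page_b = ("a history page that mentions the declaration "
          "and later american independence").split()
score_a = relevance(page_a, query, doc_freq, total_docs=1000)
score_b = relevance(page_b, query, doc_freq, total_docs=1000)
```

Page A scores higher than page B because its terms sit closer together and nearer the top, even though both pages contain each term once.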

Beyond pure relevancy rankings, however, many options are available to achieve a variety of results. Search engines present results quite differently, often without clearly explaining how the results are calculated or displayed. A serious researcher will seek to understand these differences and use them to her advantage.

Directory Results

Several years ago, before sophisticated portal sites were developed, there were two major ways to search for information on the web: directories and search engines. A directory is a collection of links to web sites which is classified into subject categories and subcategories.

As directories and search engines developed into overall portals, directories incorporated search engines and search engines incorporated directories. Portals have attempted to make these two entities appear seamless; however, they are two distinct finding tools. Understanding this concept allows the researcher to take more control over her searching.

Consider, for example, the classic directory, Yahoo! In a search for the Declaration of Independence, I can click through subject categories to locate it, or I can type “declaration of independence” in the search box. When searched, Yahoo! first searches its classified directory for subcategories entitled Declaration of Independence. If none are present, it then searches the directory for listed web sites entitled Declaration of Independence. If there are none, Yahoo! then uses search engine Google to search for web sites which contain the phrase Declaration of Independence. Yahoo! presents the first set of results it can, even if that happens to be the third step, web page results from Google. I do not have to prompt Yahoo! to move through to the next step if the first step found nothing; it happens automatically. This is why different searches on Yahoo! may produce results pages that look quite distinct.
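The three-step fallback described above can be sketched in Python. The function names and the stand-in search functions are hypothetical; they only mimic the order of steps, not how Yahoo! or Google actually work internally.

```python
def cascade_search(query, steps):
    """Try each (label, search function) in order; return the first non-empty results."""
    for label, search_fn in steps:
        results = search_fn(query)
        if results:
            return label, results
    return None, []

# Hypothetical stand-ins for the three steps.
def search_categories(query):
    return []  # no directory category has this title

def search_listed_sites(query):
    return []  # no listed site has this title

def search_web_pages(query):
    return ["http://www.example.org/declaration.html"]

steps = [
    ("directory categories", search_categories),
    ("listed sites", search_listed_sites),
    ("web pages", search_web_pages),
]
source, results = cascade_search("declaration of independence", steps)
```

Because the first two steps come back empty, the results shown come from the third step, and the searcher never has to ask for the fallback.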

Besides Yahoo!, there are two other major subject directories that have linked themselves with major search engines. The Open Directory Project provides directory results to Google, HotBot and Lycos, while LookSmart provides directory results to AltaVista and Excite.

Most Popular Results

As researchers began to realize that mathematical relevancy ranking didn’t always equal researchers’ intuitive relevancy ranking, tools were developed to put a more human factor back into relevancy determinations. Search engines can now measure what the most popular sites are, given certain search terms, and list the popular sites as results options. This is the driving force behind Direct Hit, which is used at HotBot and Lycos. Google and AltaVista include popularity as a factor in their formulas to determine relevancy rankings.

Customized Results

Most search engines allow the look of the results page to be changed, especially with regard to the number of hits per page. Additionally, they may offer the option of listing only titles or sorting by date or site, rather than relevancy.

Clustered/Compressed Results

Some searches produce many individual page hits from the same overall web site, making it seem like the results all come from the same place. When a search engine uses results compression, or clustering, it shows only one page per web site, while offering an option to view the other results from that site. This feature can be found at AltaVista, Excite, Google and HotBot.
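Results compression can be pictured as grouping hits by web site. The Python sketch below is an assumption about the general idea, with made-up addresses and a hypothetical cluster_by_site function.

```python
from urllib.parse import urlparse

def cluster_by_site(urls):
    """Keep the first hit per host; collect the rest as 'more from this site'."""
    top, more = [], {}
    for url in urls:
        host = urlparse(url).hostname
        if host in more:
            more[host].append(url)   # later hit from a site already shown
        else:
            more[host] = []
            top.append(url)          # first hit from this site
    return top, more

hits = [
    "http://a.example.com/1",
    "http://a.example.com/2",
    "http://b.example.org/x",
    "http://a.example.com/3",
]
top, more = cluster_by_site(hits)
```

The compressed list shows one page from each site, and the remaining pages from a.example.com stay available on request.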

Suggested Searches

Suggestions for further searching based on the initial search are provided by many search engines. These suggestions can be simple, such as synonyms or alternative search terms. They can be more sophisticated, such as suggestions for searching in different, specialized databases. Ask Jeeves is built entirely around suggested searches. If I type a question into Ask Jeeves’ search box, it returns a list of suggested specialized databases that might contain the answer to that question.

For example, I asked Jeeves “Where can I find the Declaration of Independence?” Jeeves returned several suggested sources for the text of the Declaration of Independence, as well as historical background on it.

Suggested searches can also be found at AltaVista, Excite, HotBot and Lycos.

Similar Searches

If I locate a web page that is highly relevant to my research issue, I might be interested in finding more pages that are very similar. Some search engines will perform a search for other similar pages at the click of a button. I simply choose a page from my results list and ask the engine to perform a second search to find similar pages. This feature can be found at Google (Similar Pages) and AltaVista (Related Pages).

Translated Results

A few years ago, AltaVista began offering a tool to translate a given results page from one language to another. The translations aren’t the greatest, but they’re better than nothing when confronted with results in an unfamiliar language. Google and Lycos also offer translation.

AltaVista
http://altavista.com

Default Searching (alternative/inclusive): alternative in Basic Search; phrase in Advanced Search
Default Searching (keyword/concept): keyword, but other concepts are also automatically searched in some situations

Search Language

Inclusion: + (plus sign) in Basic Search; AND in Basic or Advanced Search
Alternative: OR in Basic or Advanced Search
Exclusion: – (minus sign) in Basic Search; AND NOT in Basic or Advanced Search
Phrases: “” (quotation marks); in Basic Search, two terms that usually appear as a phrase are treated as a phrase even without quotes
Proximity: NEAR locates terms within ten words of each other
Case Sensitivity: lower case is insensitive; Capitalization forces case sensitivity
Truncation/Wildcard: * (asterisk) can be used in the middle of a word as well as at the end
Nesting: parentheses
Restrictors: host: url: link: domain: text: title: applet: object: anchor: image:
On Search Assistant Form: text, title, link, date, region, domain, host
Searching by Levels: yes
Other Search Features: special searches for images, video, and MP3/audio

Results

Automatic Directory Results: no (Help screens say yes, but I couldn’t find one instance where they appeared as search results); a separate directory can be browsed from the main page.
Popular: not a separate list, but popularity is built into AltaVista’s relevancy formula
Clustered: called site compression, is automatic in Basic Search and can be turned on in Advanced Search
Suggestions: yes
Similar: yes, called Related Pages
Translated: yes
Other Features: While Basic Search presents results ranked by relevancy, Advanced Search results will appear in random order unless the sort by box is used. Sort by allows users to place greater weight on certain terms.

Excite
http://www.excite.com

Default Searching (alternative/inclusive): alternative
Default Searching (keyword/concept): concept; use of Booleans forces keyword searching

Search Language

Inclusion: + (plus sign) or AND
Alternative: OR
Exclusion: – (minus sign) or NOT
Phrases: “” (quotation marks)
Proximity: no
Case Sensitivity: no
Truncation/Wildcard: no
Nesting: parentheses
Restrictors: language and country/domain on Advanced Search form
Searching by Levels: no
Other Search Features: very popular search topics offer relevant quick results in the left margin

Results

Automatic Directory Results: from LookSmart; results sites that appear in the directory will list category and subcategories on the results list; also, click on Web Directory from the results page
Popular: no
Clustered: yes, choose View by URL
Suggestions: yes, use Zoom In feature
Similar: no
Translated: no
Other Features:

Fast Search
http://www.alltheweb.com

Default Searching (alternative/inclusive): inclusive
Default Searching (keyword/concept): keyword

Search Language

Inclusion: + (plus sign); choose all the words from the pull down box
Alternative: parentheses around alternative words; choose any of the words from the pull down box
Exclusion: – (minus sign)
Phrases: “” (quotation marks); choose the exact phrase from the pull down box
Proximity: no
Case Sensitivity: no
Truncation/Wildcard: no
Nesting: no
Restrictors: on Advanced Search: language, text, title, url, domain
Searching by Levels: no
Other Search Features: easy refinement of original search from any results page

Results

Automatic Directory Results: no
Popular: no
Clustered: no
Suggestions: no
Similar: no
Translated: no
Other Features: basic or advanced search form, depending on which was used, remains at the bottom of each search results page and recalls previous search

Google
http://www.google.com

Default Searching (alternative/inclusive): inclusive
Default Searching (keyword/concept): keyword

Search Language

Inclusion: automatic; use + (plus sign) to include stopwords
Alternative: OR
Exclusion: – (minus sign)
Phrases: “” (quotation marks)
Proximity: no
Case Sensitivity: no
Truncation/Wildcard: no
Nesting: no
Restrictors: cache: link: related: info: spell: stocks: site: allintitle: intitle: allinurl: inurl:; on Advanced Search form: language, title, url, domain, link
Searching by Levels: yes
Other Search Features: several specialty search engines, including one for government pages

Results

Automatic Directory Results: from Open Directory, relevant subject categories and subcategories appear at the top of results
Popular: not a separate list, but built into Google’s formula
Clustered: yes
Suggestions: for individual terms from the search query; click on a term from the box on the results page to see definitions and search suggestions for that term.
Similar: yes; can be chosen from the results list or accomplished directly from the Advanced Search form without performing an initial search
Translated: yes; pages published in Italian, French, Spanish, German and Portuguese can be translated into English
Other Features: offers the option of looking at the index’s cached page (what was actually searched) rather than the live page on the Internet; results list shows highlighted search terms in context

HotBot
http://hotbot.lycos.com

Default Searching (alternative/inclusive): inclusive
Default Searching (keyword/concept): keyword

Search Language

Inclusion: automatic; AND with Boolean phrase option; all the words from the pull down menu; + (plus sign)
Alternative: OR with Boolean phrase option; any of the words from the pull down menu
Exclusion: NOT with Boolean phrase option; must not contain from the pull down menu; – (minus sign)
Phrases: “” (quotation marks); exact phrase from the pull down menu
Proximity: no
Case Sensitivity: lower case not sensitive; Capitalization forces sensitivity
Truncation/Wildcard: * (asterisk) matches 0 or more characters; ? (question mark) matches one character only; they can be placed anywhere in the term
Nesting: yes, with Boolean phrase option
Restrictors: date, language, domain, depth, feature; search form also allows searches for different types of files
Searching by Levels: yes
Other Search Features: search form makes advanced searching easy to use
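
The two wildcards in the Truncation/Wildcard entry above behave much like the shell-style patterns in Python's standard fnmatch module, where * matches zero or more characters and ? matches exactly one. The sketch below assumes HotBot's matching works as that entry describes; the wildcard_match name is my own.

```python
from fnmatch import fnmatchcase

def wildcard_match(word, pattern):
    """Match a word against a pattern using * and ? wildcards."""
    # Note: fnmatch also treats [ ] specially, which is not part of this comparison.
    return fnmatchcase(word, pattern)

star_zero = wildcard_match("color", "colo*r")       # * matches zero characters
star_one = wildcard_match("colour", "colo*r")       # * matches one character
question_one = wildcard_match("colour", "colo?r")   # ? matches exactly one
question_zero = wildcard_match("color", "colo?r")   # ? cannot match zero
```

So colo*r finds both color and colour, while colo?r finds only colour.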

Results

Automatic Directory Results: yes, from Open Directory
Popular: yes
Clustered: yes
Suggestions: yes
Similar: yes
Translated: no
Other Features: will automatically run the same search in Lycos at the click of a button

Lycos
http://www.lycos.com

Default Searching (alternative/inclusive): inclusive
Default Searching (keyword/concept): keyword

Search Language

Inclusion: + (plus sign); all words on Advanced Search form
Alternative: any words on Advanced Search form
Exclusion: – (minus sign)
Phrases: “” (quotation marks); exact phrase on Advanced Search form
Proximity: no
Case Sensitivity: no
Truncation/Wildcard: no
Nesting: no
Restrictors: title, url, host, domain, language on Advanced Search form
Searching by Levels: yes
Other Search Features: has special content based searches for multimedia, recipes and more

Results

Automatic Directory Results: yes, from Open Directory
Popular: yes
Clustered: no
Suggestions: yes
Similar: no
Translated: yes
Other Features: will automatically run the same search in HotBot at the click of a button