ResearchWire – Exposing the Invisible Web

Diana Botluk is a reference librarian at the Judge Kathryn J. DuFour Law Library at the Catholic University of America in Washington, D.C., and is the author of The Legal List: Research on the Internet. She teaches legal research at CAPCON, Catholic University Law School, and the University of Maryland.


Every Web researcher knows that general search engines are an integral part of locating information on the Web. They can be both a genuine asset and a source of real frustration. As Genie Tyburski reported last month, search engines index at best 16% of all Web pages. This brings to mind the picture of a faithful yet frustrated researcher throwing her hands in the air and exclaiming, “WHY ISN’T THIS THING WORKING?”

In her ResearchWire column last month, Genie Tyburski explained several reasons that might answer exactly that question, including password-restricted free data, password-restricted commercial data, and dynamically delivered data. This month we’ll look a little more closely at information stored in databases and how to access it.

The concept is simple, really, and one that is almost second nature to researchers using such services as Dialog, Lexis or Westlaw. Millions of documents exist online in thousands of databases, but before a search for a needed document can begin, the researcher must choose the appropriate database to search through.

The Web, at first glance, may appear to be different because we see general finding tools like Yahoo! or AltaVista that are enormous in scope. But as broad as these finding tools are, they cannot tap documents stored in databases, because those documents are delivered only in response to a search of the database itself. Thus, pinpointing this information is a two-step process. First, use a broad finding tool to locate the database. Then, search for individual documents within that database.

An example of this procedure illustrates the point. A researcher must find a copy of House Report 106-224 about the National Marine Sanctuaries Enhancement Act of 1999. A search using general search engines for the phrase “national marine sanctuaries enhancement act” combined with “106-224” retrieves no results. Remove the report number from the search statement and a few items are found, but none of them lead directly to the report itself.

However, the report does exist online and can be easily accessed by starting at either THOMAS or GPO Access, and finding the appropriate database containing House reports from the 106th Congress. Online researchers must use their skills to browse through finding aids and directories to discover where the right database might reside. This is good news for information professionals! We already know how to do this, and it’s just a matter of transferring our skills to a new venue.
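The two-step process and the House report example above can be sketched in a few lines of Python. This is only an illustration, not how any of the services named here actually work: the `Database` class, the `find_databases` function, the subject labels, and the "Crossword solver" entry are all hypothetical, and the directory is a tiny in-memory stand-in rather than a live Web lookup.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Database:
    """A toy stand-in for a searchable Web database (not a real service API)."""
    name: str
    subjects: set[str]
    documents: dict[str, str] = field(default_factory=dict)  # document id -> title

    def search(self, query: str) -> list[str]:
        """Step 2: search inside this one database; return matching document ids."""
        q = query.lower()
        return [doc_id for doc_id, title in self.documents.items()
                if q in doc_id.lower() or q in title.lower()]


def find_databases(directory: list[Database], subject: str) -> list[Database]:
    """Step 1: use a broad finding aid to pick databases by subject."""
    return [db for db in directory if subject.lower() in db.subjects]


# Hypothetical directory contents, invented for illustration only.
directory = [
    Database(
        name="THOMAS",
        subjects={"legislation", "committee reports"},
        documents={
            "H. Rept. 106-224": "National Marine Sanctuaries Enhancement Act of 1999",
        },
    ),
    Database(name="Crossword solver", subjects={"crosswords"}),
]

# Step 1: locate the right database. Step 2: search within it.
candidates = find_databases(directory, "committee reports")
hits = {db.name: db.search("106-224") for db in candidates}
print(hits)  # {'THOMAS': ['H. Rept. 106-224']}
```

The point of the sketch is the order of operations: the report number is never sent to the broad finding aid, only to the database that the finding aid identified.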

Of course, there is always help out there if you know where to look for it. First of all, specialized directories like FindLaw or Catalaw easily point researchers to databases where they can locate law-related information. But there are other finding tools on the Web that guide the way to databases on any subject.

IntelliSeek provides us with InvisibleWeb.com, a directory of over 10,000 searchable databases on the World Wide Web. It took less than thirty seconds to locate THOMAS as a resource for 106th Congress committee reports by browsing through the legislative portion of the government category. Lycos has also incorporated IntelliSeek’s Invisible Web Catalog into its directory.

Another directory of databases is direct search. This directory appears to limit itself to databases containing more serious or scholarly information, omitting those that exist purely for entertainment value. That is not bad news for those seeking law-related information.

WebData also provides a directory of thousands of online databases, classified into a wide variety of categories. The directory entries contain thorough descriptions of the contents of each database.

The meta search engine the BigHub.com also provides a categorized list of specialty search engines that can be found throughout the World Wide Web. Its wide variety of topics ranges from science to crosswords. A special feature of this directory is that each database can be searched directly from the BigHub’s pages.

The bottom line is that searching the Web requires the researcher to think creatively in her quest for information. Searching through a database known to contain the desired type of information is always preferable to floundering with an overly broad search engine. So remember the two-step process and find a database that’s right for your next search.
