The Government Domain: GPO Access and THOMAS for Legislative Research

By Peggy Garvin

Peggy Garvin of Garvin Information Consulting is author of The United States Government Internet Manual (Bernan Press) and contributing author for The Congressional Deskbook (TheCapitol.Net).

Published June 23, 2005

GPO Access and THOMAS are essential congressional research systems sponsored by the legislative branch of the U.S. government. Both are available online for free. GPO Access and THOMAS each take a different approach to legislative information and most researchers need to use both. The question is not “which is better” but rather “which is better for the task at hand?”


Comparative Overview

To understand the differences between GPO Access and THOMAS, it helps to know a little about the origin of each. GPO Access was authorized in 1993 by Public Law 103-40, which called for the Government Printing Office (GPO) to “provide a system of online access to the Congressional Record, the Federal Register, and, as determined by the Superintendent of Documents, other appropriate publications distributed by the Superintendent of Documents” (44 USC sec. 4101(a)(2)). The Legislative Branch section of GPO Access maintains this document-centric approach. GPO’s focus, and its strength, is providing online versions of official documents. It offers PDF versions that retain the helpful formatting of the print originals, and it has developed useful ways of browsing and paging through the documents online. (GPO Access carries much more than legislative materials, of course, but this column will focus on the site’s Legislative Branch section.)

THOMAS was launched by the Library of Congress in January 1995 – the start of the 104th Congress – at the request of Congress. At the time, the House of Representatives, the Senate, the Library of Congress, and the Library’s Congressional Research Service (CRS) already had a history of sharing data for mainframe legislative information systems. Although not part of the initial launch, the core of THOMAS soon became the CRS “Bill Summary and Status” (BS&S) database, which was migrated from the Library of Congress mainframe. BS&S features the CRS summaries of legislation and daily updates on the status of legislation. It integrates this information with selected full-text files from GPO and with other data, such as roll call votes, from the House and Senate web sites.

THOMAS’s strength is the integration of data and documents from many legislative sources, including GPO Access. The legislative status steps for a bill are linked to the relevant primary documents. The screen shot shown here displays linked status steps for “major congressional actions” on H.R. 1268 of the 109th Congress. Links go to the Clerk of the House site for a roll call vote and to GPO Access for the text of the public law (see footnote 1), as well as to other congressional web resources. THOMAS also offers more detailed status displays than the one shown here. They are labeled “all congressional actions” and “all congressional actions with amendments,” and these also link to relevant web resources.

Content

The documents offered by THOMAS and GPO Access overlap to a certain degree, but each system includes unique content.

For example, GPO Access has copies of published congressional hearings, while THOMAS does not. And because THOMAS drew on an institutional history of maintaining mainframe legislative information systems, some of its files go relatively far back for web content.

The BS&S database has summaries and status information for legislation back to 1973, for the 93rd Congress.

Key Documents on GPO and THOMAS

Bills and Resolutions, all versions, full text
THOMAS – 1989 (101st Congress) to present
GPO Access – 1993 (103rd Congress) to present

Committee Reports
THOMAS – 1995 (104th Congress) to present
GPO Access – 1995 (104th Congress) to present

Congressional Record (Daily Edition)
THOMAS – 1989 (101st Congress) to present
GPO Access – 1994 (103rd Congress, 2nd session) to present
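
The coverage list above can be turned into a small lookup for deciding which system to try for a given year. The following is only a sketch: the start years are copied from the list, while the function names, the dictionary layout, and the simplification that a Congress begins on January 1 of an odd-numbered year (it actually convenes a few days later) are assumptions made for illustration.

```python
FIRST_CONGRESS_YEAR = 1789  # the 1st Congress convened in 1789

# Earliest year of online coverage, copied from the list above.
# Keys and structure are illustrative, not an official schema.
COVERAGE_START = {
    "bills": {"THOMAS": 1989, "GPO Access": 1993},
    "committee_reports": {"THOMAS": 1995, "GPO Access": 1995},
    "congressional_record": {"THOMAS": 1989, "GPO Access": 1994},
}


def congress_start_year(congress: int) -> int:
    """Return the year in which the given numbered Congress began."""
    if congress < 1:
        raise ValueError("Congress numbers start at 1")
    return FIRST_CONGRESS_YEAR + 2 * (congress - 1)


def congress_for_year(year: int) -> int:
    """Return the Congress sitting in a given year (ignores the first days of January)."""
    if year < FIRST_CONGRESS_YEAR:
        raise ValueError("No Congress before 1789")
    return (year - FIRST_CONGRESS_YEAR) // 2 + 1


def systems_covering(doc_type: str, year: int) -> list[str]:
    """List the systems whose online coverage of doc_type reaches back to year."""
    starts = COVERAGE_START[doc_type]
    return [system for system, start in starts.items() if start <= year]


if __name__ == "__main__":
    print(congress_start_year(93))                       # BS&S coverage begins here
    print(congress_for_year(1994))
    print(systems_covering("congressional_record", 1993))
```

For a 1993 Congressional Record item, for instance, the lookup points to THOMAS alone, since GPO Access coverage of the daily edition starts in 1994.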

The Law Librarians’ Society of Washington, DC (LLSDC) maintains excellent inventories of congressional information online. These LLSDC lists cover THOMAS and GPO Access along with content from other free sources and from commercial subscription services.

One of the most frequently asked questions about THOMAS and GPO Access content is “which will be quickest to post the information I need?” Each system is updated throughout the day, and my general advice is to check both. Since GPO produces the official documents such as the Congressional Record, GPO Access tends to have these online first, but THOMAS picks them up quickly. The ultimate origin of the information is Congress, of course. When content is delayed, the cause is often Congress. When the House or Senate remains in session late into the night, the Congressional Record for those proceedings may appear online a bit late the following day. And officially published congressional hearings, not a printing priority for Congress, may appear one or more years after the event.

At this point, neither THOMAS nor GPO Access offers a current awareness email or RSS system to alert researchers when new legislative content is posted (see footnote 2). Researchers looking for current awareness services and alternative sources should explore the other systems listed on the LLSDC documents cited above.

Searching

As many average citizens turned amateur searchers can attest, online keyword searching on any system is challenging. Searching GPO Access and THOMAS can be challenging due to the specialized nature of the content, but also because both employ search engines that users are not likely to encounter elsewhere. Both GPO Access and THOMAS are over ten years old, and both – at the time of this writing – are using variations of the same search software they launched with.

GPO’s primary search engine is the early 1990s phenomenon WAIS (Wide Area Information Servers), which is near extinction anywhere on the web outside of GPO Access. THOMAS uses the InQuery information retrieval package, which was popular with several government agency web pioneers in the 1990s. (InQuery was developed by the Center for Intelligent Information Retrieval at the University of Massachusetts, Amherst, and later commercialized.) I do not have any news to share about THOMAS, but GPO is actively planning new systems; these are detailed in their Concept of Operations for the Future Digital System (PDF).

For the time being, I have a few simple tips:

  • First and foremost, read the search help and consult the sample searches that both sites provide for each of their databases. While GPO Access and THOMAS may not have cutting-edge search software, they do provide extensive help designed specifically to assist with their content. One look at the search tips will save you mountains of time and frustration.
  • When you can, on both GPO Access and THOMAS, use other fields or limits to decrease your dependence on specific search terms. For example, limit a bill search on THOMAS by sponsor and stage in the legislative process, if you know it.
  • Both systems cap initial search results at a fixed number: 40 on GPO Access and 50 on THOMAS. If your searches always come up with exactly 40 or 50 results, you need to either raise the limit – it can be changed easily on either system – or narrow your search.
  • On GPO Access, remember that you must use double quotes or a Boolean AND to get results that include all of your search words. A space in GPO Access is interpreted as a Boolean OR.
  • Note that THOMAS provides some flexibility in word or topic searching. The word/phrase search box on most of its databases has the option to search for exact matches or for plurals and other variations. The BS&S file has an option, albeit not well integrated, to find and search on preferred terms from its Legislative Indexing Vocabulary.
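
Two of the tips above lend themselves to a short sketch: building a GPO Access query that requires every search word, and spotting a result count that has probably hit the display cap. The helper names are hypothetical, and the only query syntax assumed is what the tips state (double quotes for a phrase, an uppercase Boolean AND between terms, and default caps of 40 and 50). The search engines may accept other operators not shown here.

```python
GPO_ACCESS_DEFAULT_CAP = 40  # default result cap cited in the tips above
THOMAS_DEFAULT_CAP = 50      # default result cap cited in the tips above


def gpo_phrase(phrase: str) -> str:
    """Wrap a phrase in double quotes so the words are matched together."""
    return '"' + " ".join(phrase.split()) + '"'


def gpo_all_words(terms: list[str]) -> str:
    """Join terms with AND, since a bare space is read as OR on GPO Access.

    Multi-word terms are quoted so that their internal spaces are not
    treated as OR.
    """
    cleaned = []
    for term in terms:
        words = term.split()
        if not words:
            continue  # skip empty or whitespace-only terms
        cleaned.append(gpo_phrase(term) if len(words) > 1 else words[0])
    return " AND ".join(cleaned)


def probably_truncated(result_count: int, cap: int) -> bool:
    """Return True when the result count suggests the cap cut the list short."""
    return result_count >= cap


if __name__ == "__main__":
    print(gpo_all_words(["emergency", "supplemental appropriations"]))
    print(probably_truncated(40, GPO_ACCESS_DEFAULT_CAP))
```

A count that exactly equals the cap is the signal to raise the limit or narrow the search, as the third tip suggests.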

Browsing

GPO Access and THOMAS take different approaches to browsing, whether it be browsing all documents in a set or browsing within one document.

Browse options can also vary from database to database on either system. GPO makes browsing easier for the Congressional Record. If you want to read the Record rather than search it, choose GPO over THOMAS.

GPO also has an easy-to-use interface for retrieving a page in the Record by page number – handy since that is how items in the Record are cited. On the other hand, GPO has no browse list for committee reports, while THOMAS does.

Using any legislative information system involves the challenge of dealing with some very large documents. On THOMAS, large bills and reports are chunked into bite-size pieces and an HTML table of contents lets you jump from bit to bit. When you want to see it all at once, choose the “Printer Friendly Display” option. GPO serves up its documents in their entirety, in either plain text or PDF. Savvy LLRX readers know that you can jump to specific words within a large document by displaying it in its entirety and using the Edit/Find menu option (or Control-F for PCs; Command-F for Macs). But it never ceases to amaze me how many of our citizen searchers are getting along without this knowledge.

Which to Choose

Legislative researchers should be familiar with both THOMAS and GPO Access, at the very least. In addition to their complementary features and content, these two sites have enough overlap so that one can serve as a backup when the other is experiencing technical difficulties. But which to choose first?

For straightforward document retrieval, I prefer GPO Access. GPO is the source of the official documents, offers PDF versions, and provides quick tips for pulling up just the document I want.

When I need to do more research, I start with THOMAS. The integration of data and documents and the value added by the summaries and status steps provide the clues I need when I am not absolutely sure of what I am looking for or what I might find.

GovTrack Update

The March 2005 column looked at GovTrack.us, a privately run legislative information service available online free of charge. Like THOMAS, GovTrack integrates data from various web sources. Unlike THOMAS, GovTrack has email and RSS alerting services.

Since the March column, GovTrack has expanded its scanned sources to include:
  • Congressional Budget Office Cost Estimates
  • Office of Management and Budget Statements of Administration Policy
  • House Whip Notices
  • Amendments to bills, and
  • All roll call votes (previously just covered votes on legislation).

GovTrack has also added a Congressional Record browse feature similar to THOMAS’s, but with integrated links to information on the bill when a specific bill is mentioned.

Footnotes

1 THOMAS has the full text of all versions of all bills as they move through Congress, including the enrolled version, the final version of a bill before it goes to the President. THOMAS does not host its own database of public laws, which are technically documents of the executive branch, but does link to the GPO copies.

2 GPO plans to launch an RSS feed later this year to announce when congressional documents are posted online. LLSDC currently offers such a service on a weekly basis (see GPO Congressional Publication Releases).
