Features – Remember the Internet Archive for Historical Research

Steven M. Cohen is Assistant Librarian for Rivkin Radler, LLP in Uniondale, New York. He is also the creator/webmaster of Library Stuff, a library and information science weblog. In addition, he is a contributing editor of the Internet Spotlight column for Public Libraries Magazine.

The Research Challenge

I walked into my office early one morning and found a request waiting on my desk. An attorney was looking to obtain an old section of the Indiana Court Rules – Rules of Appellate Procedure that had been amended and renumbered at some point within the past few years. Both of the fee-based databases normally used to extract such information contained only the current rules. The web site maintained by the state of Indiana (http://www.in.gov/judiciary/courtrules) was under construction and had only copies of orders making recent rule changes, not the full text of the current rules (nor the old rules, for that matter).

A copy of the older rules could easily have been obtained by contacting a local law library in Indiana, or by posting a message to the law-lib listserv, but that might have taken too much time, and possibly money. In the middle of pounding out the plea to Indiana librarians, thoughts of the Invisible Web popped into my brain, and I was already typing in the URL for the Internet Archive (IA) before a smile surfaced. I pasted the court rules URL into the search box, and the request yielded three dates, August 9, 2000 being the earliest. I clicked on the link and was taken back in time, to the web site for the Indiana Court Rules as it had appeared on that August date. The top of the page mentioned that these were the rules effective January 1, 1999. I clicked “Rules of Appellate Procedure”, printed the PDF file, and brought the 29 pages to the requesting attorney, who was astonished and delighted at my triumph.

The Internet Archive (IA) is a virtual time machine. A non-profit company, the IA is working to “prevent the Internet – a new medium with major historical significance – and other “born-digital” materials from disappearing into the past.” To date, the archive’s collection consists of 10 billion web pages, 16 million Usenet postings, 360 archival movies, and 5,000 pages from Arpanet (from the U.S. Department of Defense). Not only is the IA a wonderful way to preserve the Internet, but it is also most helpful in answering reference questions, and it has been my assistant (or should that be the other way around?) in many a legal research project.

Finding Historical Statutes

Finding historical statutes is a common query received by librarians, and Lexis and Westlaw have done a fine job in providing access to them, many back to the early 1990s, for a fee. The current statutes are always available on the respective states’ web sites, and many libraries have saved the yellowed pocket parts after the statutes have been amended. If these avenues prove to be fruitless or too costly, a trip to the IA may be in order.

For example, the Texas statutes can be found at the following URL: http://www.capitol.state.tx.us/statutes/statutes.html. They are current on the site as of 1999. Plug that URL into the Internet Archive and the following URL is provided: http://web.archive.org/web/*/http://www.capitol.state.tx.us/statutes/statutes.html. This site was archived once in 1997, twice in 1998, and four times in 1999. By clicking on the 1997 date, we are taken to the Table of Contents of the statutes as they appeared on that date. From there, clicking on any entry will bring the user to that part of the code. Unfortunately, Texas has redirected the old statutes to the newer ones, so, by clicking on an archived statute, the user may be brought to a newer version.
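
The listing address quoted above follows a simple pattern: the archive's prefix, an asterisk standing for "all dates", and then the original URL. As a minimal sketch in Python (the function name and constant are illustrative only, not part of any IA tool), the same address can be built for any page:

```python
WAYBACK_PREFIX = "http://web.archive.org/web/"


def wayback_listing_url(original_url: str) -> str:
    """Return the address of the page listing every archived date for original_url."""
    # The "*" stands in for "all capture dates", matching the pattern quoted in the text.
    return WAYBACK_PREFIX + "*/" + original_url.strip()


print(wayback_listing_url("http://www.capitol.state.tx.us/statutes/statutes.html"))
```

Pasting the result into a browser should have the same effect as typing the original URL into the IA search box.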

Many of the Ohio statutes have been archived back to 1998 (http://web.archive.org/web/*/http://orc.avv.com/). Although some of the statutes back that far have not been indexed by the IA, I was able to locate quite a few (for example, Chapter 2115 – http://web.archive.org/web/20000407135435/orc.avv.com/title-21/sec-2115/home.htm, which is from April of 2000). Finding statutes using the IA is a hit-or-miss situation, but it definitely could be used before going to the fee-based products.
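
The date of an archived page can be read straight from its address. The long number in the Ohio URL above appears to be a 14-digit timestamp (year, month, day, hour, minute, second). The sketch below assumes that layout; the helper name is hypothetical.

```python
import re
from datetime import datetime


def capture_time(archived_url: str) -> datetime:
    """Extract the capture date and time from an archived-page address."""
    # Assumption: the digits after "/web/" are laid out as YYYYMMDDhhmmss.
    match = re.search(r"/web/(\d{14})/", archived_url)
    if match is None:
        raise ValueError("no 14-digit capture timestamp found in URL")
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


url = "http://web.archive.org/web/20000407135435/orc.avv.com/title-21/sec-2115/home.htm"
print(capture_time(url).date().isoformat())  # 2000-04-07
```

This agrees with the April 2000 date given for Chapter 2115, and it is handy when an attorney needs to know exactly which version of a statute a printout reflects.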

Internet Archive Tip #1 – If at first you don’t succeed, try a shorter URL – The exact URL that you put into the IA may not have been archived, but the main index page or another sub-directory may have been. It is also possible that one of the directories may have been archived earlier. Remember, the links are live and you have to click through to find the page you want.
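
The "shorter URL" tip can be made mechanical. This is a rough sketch rather than a definitive tool, and the function name is hypothetical: it drops one trailing piece of the path at a time until only the site's main page is left, producing the list of addresses worth trying in the IA, in order.

```python
from typing import List
from urllib.parse import urlsplit, urlunsplit


def shorter_urls(url: str) -> List[str]:
    """List progressively shorter versions of url, ending at the site root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    candidates = []
    # Drop one trailing path segment at a time, finishing with the bare "/".
    for keep in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:keep])
        if keep:
            path += "/"
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return candidates


for candidate in shorter_urls("http://www.capitol.state.tx.us/statutes/statutes.html"):
    print(candidate)
```

For the Texas statutes page this prints the /statutes/ directory first and then the main page of the site. Each one can be tried in turn, clicking through the archived links from whichever one the IA has captured.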

Court Decisions

Many of the state and federal court sites may have recent decisions posted even before they reach Lexis or Westlaw (if they are to be posted at all). As space becomes an issue on their servers, courts may take these cases offline, and they are not retrievable unless they are reported, or appear in a law journal. The IA allows for the retrieval of these cases. Many of the courts (especially the United States Court of Appeals via Findlaw) have opinions archived back to 1995, but there are some that only publish recent cases then take them off the server.

For example, the United States District Court, Central District of Illinois Orders and Opinions site (http://www.ilcd.uscourts.gov/orders&opinions.htm) only releases recent opinions. Using the IA, one can view full text (in PDF) opinions back to 1999. Also, on the Idaho Supreme Court Opinions Page (http://www2.state.id.us/judicial/sccivil.htm), one can retrieve opinions back to July of 2001. Using the IA, one can get them back to 1999. Upon initial review, many of the courts provide search engines for users to search for cases. I have found that the IA will only work in engines that are currently on servers, not those which have been taken down.

Internet Archive Tip #2 – When at all possible, Browse, Browse, Browse – In order to get the full effects of the IA, find a good starting point (usually the shortest URL possible), and go from there. Search engines, if they even exist on the site, will either turn up an error (the company may have changed the directory in which the engine searches) or will search the current data indexed by the engine.

Law Review and Legislative Information

Law review articles are widely available on Lexis and/or Westlaw, but many law schools place either full text or abstract copies on their web sites. Of course, if a certain article is needed, a trip to the journal web site would be the most cost-effective avenue. Unfortunately, the IA was not helpful in locating many full text articles that were not already available on the open Web. This suggests that the archives on most law review web sites are well stocked with past full text articles and abstracts. However, there are exceptions to this rule. The Cardozo Law Review web site only has back issues to November 2001. Utilizing the IA, you can obtain access to this law review back to October 1999.

Legislative information, although sometimes difficult to locate on the state web sites, has usually been archived far enough back that the use of the IA would not be helpful. Looking at the New Jersey Legislature site, one can search bills back to 1996, peruse audit reports back to 1996, and even look at legislative digests from 1996 on. This recurring date struck a chord, and I reviewed other state legislature web sites as well. The Iowa legislature has a link on its web site that will bring the user to all of the archives available (http://www.legis.state.ia.us/Archives.html), and most of the information can be traced back to the magical year of 1996 as well. Plus, past senate and house members can be accessed using this same page. Given this, it is easy to see that state governments have a handle on placing older data on their respective legislative web sites. There are instances, however, where the IA will come in handy when researching legislative information. I always keep it in the back of my mind.

Internet Archive Tip #3 – Beware of Robots.txt – While browsing a web site via the IA, users may come across the following message: “We’re sorry, access to [your URL] has been blocked by the site owner via robots.txt.” Like search engines, the software that archives each web site can be blocked from crawling the site if the owner places a robots.txt file on the server. The site owner may also have contacted the IA and asked not to have their site crawled. The New York State Web site is one example. Every directory on this domain has been blocked.
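
To see how such a block works, here is a small sketch using Python's standard robots.txt parser. The file contents and the example.com address are made up for illustration, and the crawler name "ia_archiver" is an assumption about the name the archive's crawler answered to; a real site's robots.txt may look quite different.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that shuts one named crawler out of every directory.
SAMPLE_ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules forbid user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


page = "http://www.example.com/any/page.html"  # placeholder address
print(is_blocked(SAMPLE_ROBOTS_TXT, "ia_archiver", page))   # True
print(is_blocked(SAMPLE_ROBOTS_TXT, "SomeOtherBot", page))  # False
```

A single "Disallow: /" line is enough to block every directory for the named crawler, while crawlers not named in the file are unaffected.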

Other state web sites have not blocked the IA from indexing their pages. For example, the California state web site has been archived back to 1996. While perusing the archived pages, I located speeches by then-Governor Wilson, which had all been removed after the election of Gray Davis. The Missouri and Minnesota state web sites have also been archived back to 1996. State web sites, as we well know, can be an important tool in legal research.

Internet Archive Tip #4 – Google Cache? – The appeal of the cache link in almost every Google search result is the ability to see a “file not found” page as it appeared the last time it was indexed, before it went offline. The IA takes the cache further by allowing end users to see the site as it appeared on different dates before the currently cached page.


There are so many different ways that the IA can help the legal researcher. News articles that may not have been indexed by the main site can be easily located. Press releases by companies that may have folded can also be found using the IA. After finding that needle in a haystack in Indiana mentioned at the top of this article, I now keep a post-it note on my computer monitor that reads, “Remember the Archive!!” Those are three words to live by.
