Why the DPLA should focus on being a stellar ACADEMIC library: Check out these statistics

Related: Why the DPLA is not a public library despite its many virtues and the P word in its name.

Just for fun, I searched the Digital Public Library of America to see how many listings showed up when I keyed in possibilities ranging from civil war to chemistry.

I tried the basic search mode. Here’s what I found out of more than seven million items:

Civil war (with surrounding quote marks): 41,273 results from the DPLA. They include books, photos, videos and sound. Amazon’s statistic: 224,176 mostly book-related results for civil war within the “Books” category. Limited to Kindle titles: 10,778.

Shakespeare: 14,770 results, also for both books and nonbooks (the case with all DPLA stats here). Amazon total: 104,041. Kindle: 5,333.

Biology: 14,335. Amazon: 272,155. Kindle: 13,622.

Chemistry: 14,047. Amazon: 311,360. Kindle: 13,292.

Mathematics: 10,541. Amazon: 666,490. Kindle: 57,686.

Physics: 9,125. Amazon: 280,096. Kindle: 13,287.

Theodore Roosevelt (no surrounding quotes): 4,570. Amazon: 16,318. Kindle: 376.

Cosmology: 489. Amazon: 10,528. Kindle: 1,722.

Toni Morrison (no surrounding quotes): 74 (no novels). Amazon: 2,817. Kindle: 146.

Did you pick up some patterns here, based on these quick comparisons? First, yes, the DPLA’s count of more than seven million items seems impressive, and, in fact, it is even though just a fraction are actual books. But if you run basic searches on specific subjects, the numbers for some essential topics just aren’t that great—especially compared to Amazon’s paper books.

Granted, many paper titles listed on Amazon are obsolete, and more than a few surely are “orphans” without traceable copyright holders to authorize digitization. But the bottom line is the same even if we adjust for this. The DPLA still lags badly if we include paper books. This isn’t to say, “Hey, DPLA, you should have matched Amazon and Google by now.” Unfair, of course. I’m just pointing out the size of the task in the future if the DPLA wants to become an academic digital library system worthy of a superpower. The good news is that there are ways to deal with the financial issues if our business and political communities show sufficient resolve and the DPLA reinvents itself, ideally working with a public digital library system for the U.S.

Second, my casual impression is that the DPLA could be stronger in science and math in some important ways if the above numbers are representative. Shouldn’t physics count more in the DPLA’s world than it does now?
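For anyone who wants to check the pattern, here is a rough sketch that turns the counts above into percentages: DPLA results as a share of Amazon's "Books" results for each search term. The numbers are copied from the list earlier in this post; the function names are just mine for illustration. Keep in mind that the DPLA counts include photos, videos and sound, so if anything these percentages flatter the DPLA's book holdings.

```python
# Counts copied from the search results listed earlier in this post:
# (DPLA results, Amazon "Books" results, Amazon Kindle results).
counts = {
    "civil war": (41273, 224176, 10778),
    "Shakespeare": (14770, 104041, 5333),
    "biology": (14335, 272155, 13622),
    "chemistry": (14047, 311360, 13292),
    "mathematics": (10541, 666490, 57686),
    "physics": (9125, 280096, 13287),
    "Theodore Roosevelt": (4570, 16318, 376),
    "cosmology": (489, 10528, 1722),
    "Toni Morrison": (74, 2817, 146),
}


def dpla_share(dpla, amazon):
    """Return DPLA results as a percentage of Amazon's Books results."""
    return 100.0 * dpla / amazon


def ranked_shares(table):
    """Return (term, percentage) pairs, smallest DPLA share first."""
    rows = [(term, dpla_share(d, a)) for term, (d, a, _kindle) in table.items()]
    return sorted(rows, key=lambda row: row[1])


for term, pct in ranked_shares(counts):
    print(f"{term:20s} {pct:5.1f}%")
```

By this crude measure, mathematics comes out lowest and Theodore Roosevelt highest, with the science terms bunched near the bottom and the history and literature terms near the top. Toni Morrison is the exception on the humanities side, which fits the copyright problem discussed later.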

Third, keep in mind that many and perhaps even most of the DPLA’s listings are for nonbooks. Not necessarily bad. I very much appreciate the wonderful visual and audio goodies online for creators inside and outside academia, as well as invaluable source material such as old letters. Still, that brings down the book totals. Of the 74 results for Toni Morrison, for example, 62 are images and only 12 are texts of any length. And as noted earlier, you won’t find Morrison novels among the DPLA listings. Of the more than seven million items in the DPLA catalog, only 1.6 million are actual books, mostly public domain titles many decades old.

Now, I already can imagine the DPLA’s reply, and very possibly you can, too: Yep, we’re just getting started. Haven’t we done an incredible job with the limited resources we have, as well as the restrictions of copyright law, and don’t we have some awesome open access breakthroughs on the way?

The DPLA in fact comes with many positives. But it still will need collection and business strategies to take it to the top and become a truly comprehensive academic digital library if that’s the goal. DPLAers keep insisting that the organization is a public library even though the academic and archival content in the catalog is just a subset of what a true general public library collection would offer. Where is John Grisham? James Patterson? Gillian Flynn? Nora Roberts? Bestselling nonfiction writers like Steve Jobs biographer Walter Isaacson?

Simply put, the DPLA is an academic library portal or a search engine rather than a public library. Even by those standards, however, the DPLA has a long way to go if it wants to transform education and research to the max at small institutions without the wealth of a Harvard or Princeton.

One challenge will be the limits of the open access model that the DPLA favors. Even within the academic category, most of Amazon’s books are still copyrighted the usual way, reflective of the industry as a whole. Just how far, then, will open access get the DPLA by both academic and public library standards? Wikipedia and Creative Commons are terrific, but by themselves those open access models and others won’t give the public what it needs and wants. Will Toni Morrison eagerly jump in with her fiction? I myself will be excited about anything the DPLA can do to work out special OA arrangements with publishers and writers. But many and perhaps most will not go along, and there are other factors, too, including the D.C. political scene. The future could be worse, not better.

Commendably, Dan Cohen, the DPLA’s gifted executive director, is fighting a proposal from copyright hawks to tighten the laws and severely restrict unauthorized linking—the essence of the Web. I believe that Dan and the other good guys will win. But isn’t it revealing that the U.S. Copyright Office feels compelled to take this radical plan seriously in the first place? The DPLA needs to carry on the copyright battles but recognize that good intentions and idealism can’t substitute for cash to grow collections in the current legal environment, which is likely to be with us for a long time. A national digital library endowment could help make many more works available online even if they were copyrighted. Not all would need to be digitized by the DPLA. Countless titles would already exist in electronic form from publishers or other sources.

Likewise helpful would be the creation of a separate public digital system whose librarians wanted a healthy amount of available resources to go for popular-level content, including encumbered bestsellers. Remember, most public libraries are community institutions dedicated to recreation rather than just education, self-improvement and other lofty goals. Ironically, however, reading for fun can boost academic achievement. Let’s not commit Readicide!

The endowment could raise money for both systems. Furthermore, as noted by Jim Duncan of the Colorado Library Consortium, it could help them stay on their respective missions and sort out their priorities. The two systems could share technical services and infrastructure and strive to host as many books and other items as possible on the systems’ servers (interbook links between serious works need to be truly permanent in the era of networked books).

A good first step might be for the Harvard Law School’s Berkman Center to offer to host both an embryonic endowment and an embryonic public library system. The endowment should be more representative of the library world than the current DPLA, which hasn’t paid enough attention to the general collection needs of small libraries. Genuine public librarians should establish the embryonic public system, with Berkman and other Harvard people facilitating rather than directing. The public system and the DPLA, reinvented as the Digital Academic Library of America or something similar, could both eventually become government agencies with clear-cut missions. That’s my hope.

For this to happen, though, the DPLA needs to drop the confusing P word and focus on being a full-strength academic system.

Doesn’t Harvard, the original home of the DPLA, boast one of the university world’s largest collections of paper books, maybe even the biggest? As a digital academic library, the DPLA should be no less ambitious within its own realm, not just with the raw numbers but also with the actual quality and usefulness of the items in its catalog.

I don’t merely want the DPLA to be a world class academic library by open access standards or portal standards. I want it to be world class, period, with far more resources than it has right now.

Detail: Some of the Amazon figures varied when I accessed the catalog at different times, but not enough to do away with the big gaps between the company’s offerings and the DPLA’s in crucial areas.

Note, 6:17 p.m., May 6: This is a “first edition,” and I may be making changes later on.

Editor’s note – this article was re-published with the author’s permission from his blog, Library City
