Oh Lord, please don’t let Google Book Search be misunderstood

Has there ever been an online initiative more misunderstood than Google Book Search? If there has, it doesn’t readily come to mind. The purpose of Google Books has been muddled by its adversaries (and, sometimes, allies) ever since Google first started shoving books into scanners. Even now, a year after the Supreme Court finally closed the book on the long-running Authors Guild lawsuit against it, the confusion persists on both sides.

Let’s start with this piece on Medium’s Backchannel, by Scott Rosenberg. The invalid assumption is right there in the headline: “How Google Book Search Got Lost.” There’s a headscratcher for you. Did Google Book Search get “lost”? Maybe it’s not featured on the front page of google.com, but if I go to http://books.google.com, why there it is.

Rosenberg seems to have been led into believing that the whole purpose of Google Book Search had been to serve as a sort of “celestial library” (or maybe “celestial bookstore”) where people could access any book they wanted online—or at least any “orphan work,” which he mischaracterizes as out-of-print books. (They’re actually out-of-print books whose rights holders cannot easily be found.) Rosenberg is disappointed that all it’s amounted to in the end is a text search box. He concludes that, over the course of the long-running legal battle, Google Books “lost its drive and ambition.”

From the Authors Guild side, James Gleick has written a rebuttal in which he also expresses regret that the proposed Google Books settlement didn’t come to pass, and discusses the need for some sort of better solution to permit access to orphan works. However, Gleick also points out much the same thing that I did in one of my earlier posts about Google Books—all that stuff with serving as an ebook store for orphan works and holding the proceeds in escrow was not Google’s idea.

As Gleick himself admits, Google Books was never intended to make any non-public-domain books fully available, until the Authors Guild proposed its settlement that Judge Chin shot down for overreaching. It was meant from the outset as an indexing tool for a search engine, and that’s all.

Gleick adds:

The point of the ill-fated settlement with Google was to give those books a new life—creating a platform in which readers or libraries could pay a small amount for these older copyrighted books and authors could receive a bit of compensation. The Authors Guild hasn’t given up on making that possible. We’re working with some libraries on ways to do it, and we hope to have more to say about that soon.

Making orphan works more readily available is a laudable goal, even if the Google Books settlement was the wrong way to go about it. If the Authors Guild can work out a way to do that which will pass legal muster, well, more power to them.

But then Gleick comes out with this whopper, which seems to be at least as historically revisionist as all the people who think Google Books was always intended to make in-copyright books fully available:

We authors, for our part, didn’t object to Google’s creating of a search index. In itself, search had obvious benefits for everyone, readers and writers alike. We objected to Google’s seizing without permission the full texts of copyrighted books for profit-making purposes not limited to indexing and never, in fact, fully disclosed. These books are enormously valuable to anyone working on algorithmic translation and machine learning.

Really? In all the time I have been following the Google Books story for TeleRead, I have never seen the Authors Guild come right out and say that scanning the works to build a search index is fine but profiting in other ways isn’t. It always came down to something to the effect of “we said in our copyright notices Thou Shalt Not Scan At All, but Google went ahead and scanned anyway—and they’re making money off a search engine for authors’ hard work without first getting permission from us. (Or sharing any of the take.)”

Indeed, in a piece from last year that Gleick links to in his own article, Authors Guild council member Richard Russo depicts the Guild’s problem as exactly that: Google scanning books without permission to create a search index at all. Russo certainly doesn’t say, as Gleick does, that it would be fine and dandy for Google to create a search index without permission if it weren’t for nebulous “never […] fully disclosed” other “profit-making purposes.” (Though one of the comments below the article does point out that Google didn’t say anything about whether it would be using the data for other purposes, such as machine learning.)

Coming back to Rosenberg for a moment, he notes:

As the Authors Guild’s Gleick points out, Google started Books with a “better ask forgiveness than permission” attitude that’s common today in the world of startups. In a sense, the company behaved like the Uber of intellectual property — a kind of read-sharing service — while expecting to be seen the way it saw itself, as a beneficent pantheon of wizards serving the entire human species. It was naive, and the stubborn opposition it aroused came as a shock.

I’m not a lawyer, of course, but as I understand it, there never was any need for Google to ask permission if it believed it was making a fair use of the copyrighted material. (And, as subsequent court decisions bore out, it was.) The whole point of “fair use” is that it’s something you have the right to do without asking permission. Rights holders are always free to disagree and take legal action (as, indeed, the Authors Guild did), but American copyright has a strong tradition of permitting expansive fair uses, even by for-profit entities. That’s not presumptuous or “naive”; it’s just the way copyright law intersects with the rights of those who would use the material.

(And that’s leaving aside the fact that asking permission of every single publisher for every single book would be a Sisyphean task in any case. But then, that’s why we have fair use rights to begin with; under fair use law, Google didn’t need to ask permission to scan any more than a student has to get permission from every publisher to photocopy pages out of any book he needs as research material.)

But fair use is a contentious matter in the digital world, and Google Books isn’t the only example of that. Another recent example pits YouTube content creators against rights holders in the matter of YouTube videos that make fair use of copyrighted material. It seems that making it easy to copy material digitally opens up multiple cans of worms when it comes to deciding just how much copying should be allowed.

When you come right down to it, after all the legal wrangling and bluster is out of the way, Google Books is still chugging right along doing exactly what Google said its goals were all the way back in 2004: making printed books as easy to search as the web. And the effect that has on scholarly research should not be understated. It’s so much easier now to find exactly the information you need, rather than having to riffle through a card catalog and hope that a book will have useful information in it. If Google can make a few bucks off doing that, well then, more power to Google. Many useful innovations have come about due to someone’s desire to make a profit.

What I’d really like to know is why the Authors Guild has apparently never seen fit to do anything about the Internet Archive’s Open Library, which violates copyrights in exactly the way Google Books doesn’t. I wonder if they’ll ever take any notice of that particular full-text scan project?

In any event, Google Books never got “lost” in the way Scott Rosenberg seems to think. It always knew exactly where it was—and it’s still there.

Editor’s note – this article was republished with the permission of the author, who published it first on his site, TeleRead. Part two of his article is also republished on LLRX – Google Books is not Alexandria redux.
