Rita Vine is a professional librarian and co-founder of Workingfaster.com, which helps professionals break through the clutter of the Internet and access information that matters. She teaches web searching to clients across North America, and serves on the selection team of the Search Portfolio, an enterprise product of the 100 top starting points for searching the free web. A “lite” version of the Search Portfolio, with links to about 10% of its resources, is available at http://www.searchportfolio.com/searchlite.html.
News of Google’s initial public offering has turned the search engine business into headline news. Reports of the pending IPO have sent both big-name competitors and second-tier search properties into a tizzy of activity and initiatives. Many of these smaller players hope against hope that Google will lose its edge once its newly rich employees pay more attention to their Hummers than to the business. And through it all, rumors of a huge but mysterious play by Microsoft to outdo Google in the search/research wars linger in the background.
For serious web searchers, more interested in search engines as information-seeking tools than as investment vehicles, the current buzz disguises an uncomfortable fact: despite the search engines’ persistent claims that their indexes are better than ever, search is getting worse.
Why is searching getting harder and less productive? And what might reasonably happen over the next several months to search engines as we know them? And finally, what does this mean for searchers?
Search engine indexes are way too big.
The sheer size of search engine databases is making retrieval overwhelming. Even with highly specific search words, it’s not uncommon to retrieve thousands, even hundreds of thousands, of potentially relevant results. For almost every search, users simply have to stop at some point, when the time available for assessing and evaluating results runs out.
Keyword guessing is getting harder.
In a giant database of full-text pages, practically any search produces an enormous result set. Users face, over and over, the task of guessing which words are likely to appear on relevant pages. Even then, the number of pages returned may be so large that choosing the most relevant site is almost impossible. So searchers add words to narrow the results, which makes it more likely that they will miss relevant pages that happened to use different words than the ones they chose.
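The trade-off described above can be made concrete with a minimal sketch. The page texts and query terms below are invented for illustration; the point is simply that each added keyword shrinks the AND-matched set and silently drops relevant pages that used synonyms.

```python
# Illustrative sketch of keyword AND-narrowing (invented example pages).
pages = {
    "A": "cheap flights to london from toronto",
    "B": "discount airfare london departures toronto",  # relevant, but uses synonyms
    "C": "cheap london hotels and flights",
    "D": "cheap flights insurance quotes",
}

def match(query_terms, pages):
    """Return names of pages containing ALL query terms (simple AND search)."""
    return {name for name, text in pages.items()
            if all(term in text.split() for term in query_terms)}

broad = match(["cheap", "flights"], pages)             # A, C, D: too many to sift
narrow = match(["cheap", "flights", "toronto"], pages) # only A
# Page B is relevant to a Toronto searcher but is never retrieved,
# because it says "airfare" and "departures" instead of "flights".
```

Adding "toronto" cuts the result set from three pages to one, but page B, which a human would judge relevant, is unreachable by either query.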
At the same time, search engine optimizers are getting better at predicting which keywords searchers will use. They know that searchers rely on what several authors and trainers have called “magic search words” – like links, resources, or geographic qualifiers – to limit search results. Businesses are now purchasing those very words, and optimizing their pages, with both on- and off-the-page techniques, to include them. The end result is a game of cat and mouse that searchers can never seem to win.
Search engines use outdated retrieval-and-sort mechanisms.
Search engines haven’t done a good job of keeping up with developments in retrieval research. In fact, the page-ranking algorithms of the major engines are almost startlingly simplistic. The major engines still favor a combination of link analysis, domain preference assignment (where selected domains or domain extensions – like .gov – are given added weight in the rank ordering scheme), and lingering elements of keyword placement, proximity, and frequency. Although a substantial body of research on linguistic analysis of search requests exists, no free commercial search engine uses it.
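The retrieve-and-sort recipe named above can be caricatured in a few lines. This is a toy sketch, not any engine’s actual algorithm: the pages, weights, and boost values are all invented, and the log-dampened inlink count merely stands in for real link analysis such as PageRank.

```python
# Toy ranking: keyword frequency + link analysis, scaled by domain preference.
# All weights, pages, and boost values are hypothetical.
import math

DOMAIN_BOOST = {".gov": 2.0, ".edu": 1.5}  # invented preference table

def score(page, query_terms):
    words = page["text"].split()
    # Keyword frequency: share of on-page words matching the query.
    freq = sum(words.count(t) for t in query_terms) / max(len(words), 1)
    # Link analysis stand-in: dampened count of inbound links.
    link = math.log1p(page["inlinks"])
    # Domain preference: flat boost for favored extensions.
    boost = next((b for ext, b in DOMAIN_BOOST.items()
                  if page["url"].endswith(ext)), 1.0)
    return (freq + link) * boost

pages = [
    {"url": "stats.example.gov", "inlinks": 50,
     "text": "census population statistics population tables"},
    {"url": "stats.example.com", "inlinks": 500,
     "text": "population statistics and analysis"},
]
ranked = sorted(pages, key=lambda p: score(p, ["population", "statistics"]),
                reverse=True)
```

Note what the domain boost does: the .gov page outranks a .com page with ten times as many inbound links, which is exactly the kind of blunt, subjective preference discussed below.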
Search engines haven’t been able to fight off the scourge of search engine spam and link spam. Search engine optimizers have largely succeeded at defeating page-ranking algorithms, particularly in key industries like entertainment, consumer technology, and travel. In fairness to the optimizers, the search engines haven’t really given them much resistance. The search engines’ simplistic responses to search engine spam have been limited mainly to domain preference assignment, a subjective and totally reactive method. The selection of domain preferences suffers additionally from a bias that favors American sites: for example, the favoring of .gov extensions will help U.S. government sites rise to the top of the search results. But what does that do for British or Canadian searchers?
Organizations are starting to pay for content that could be found for free on the web – if only the searcher knew where to look.
Of the online products and services offered through content aggregators, at least some – including full-text journal content, dictionaries, directories, and statistical compilations – are freely available on the web, if you know where to look. But organizations recognize that their employees’ valuable time is wasted searching aimlessly without success, and that easier-to-search subscription-based products, despite their cost, may offer a ready-made solution to this problem.
Search weariness may finally be setting in.
It’s taken a long time for searchers to get sick of searching, but it may finally be happening. Several colleagues who teach web searching and I have lately heard more complaints than usual from our students about the relevance of Google’s search results. Although measurements from sources such as the American Customer Satisfaction Index (http://www.theacsi.org/second_quarter.htm) show practically no movement in satisfaction with the major search tools, watchers of user behavior will want to check the index in the coming months to see if a shift occurs.
As search engines learned to their dismay in 1999 when Google became the search category killer, search engine users can be notoriously fickle — it doesn’t take long for a user to leave a search engine that fails to deliver relevance.
So, if search is getting worse, why aren’t search engines doing everything possible to improve the search experience for their users? The reason has little to do with search, and everything to do with revenue. Search engines can’t generate enough revenue from search alone.
In order for search engines to make real money and deliver real shareholder value consistently over many quarters, they must enlarge their focus, often through acquisition of other related companies, and become portals. It’s therefore no surprise that in the last few months, Google and Yahoo have been duking it out for dominance in super-sized email services, toolbars, user group sites, and whatever else can bring ads to eyeballs as often as possible. Search properties have to extend their brand identities beyond search in order to develop and enhance revenue streams.
In “Return to the sad days of more than a search engine,” (http://searchenginewatch.com/searchday/article.php/3354071) Danny Sullivan of Search Engine Watch noted the paradox in this approach. In the late 1990s, a similar reduction of focus on search in favor of other revenue generating web based activities (shopping, email, news, auctions) led to the demise of several portal properties. Sullivan points out that it was that very inattentiveness to search that created the gap that enabled upstart Google to flourish.
Google is starting to look a lot like a portal, although all the pieces aren’t yet together on the front Google page in the way that they are in Yahoo. Although Yahoo has recently made a lot of public noise about its renewed focus on search, it has never really abandoned its portal model. Portals have always lured web users to their sites with search, but then diverted their eyeballs with other treats.
But who can blame them? Shareholders of search properties have an insatiable need for ever-increasing revenue, and the portal model has been the only rational way to increase revenues quickly and efficiently. Searchers who browbeat search engines for failing to live up to their promise as search tools fail to fully understand the business models of search engines. They aren’t in the search business; they are in the advertising business, and critics need only look at practically every other global media company to understand that the ad-revenue model works.
What does the future hold for search engines and searchers?
There is little doubt that both Google and Yahoo will continue to grow their portals. At present, generating as much ad revenue as possible is the only viable business model for a free service that depends on huge amounts of traffic for success. The only other pure-search player of significance is Gigablast, whose principals care about improving search query interpretation. However, with a search index barely one-tenth the size of Google’s, and a retrieve-and-sort methodology that isn’t noticeably more robust than Google’s or Yahoo’s, Gigablast doesn’t have enough going for it to trigger a mass migration away from Google. The other major player, Microsoft, is expected to enter the search arena in the coming months with a major new product. What that will look like is anyone’s guess, but given Microsoft’s deals with LexisNexis, Gale, and other paid-content suppliers, it is reasonable to assume that Microsoft expects to earn revenue from its search site – not exclusively from advertising, but also by selling (probably cheap) paid content on demand.
Has the time come for mass market paid web content?
Large businesses, academic institutions, and public libraries have spent the last few years building their own in-house information portals filled with paid content – usually bundles of indexes and full text purchased from third-party providers. Leaving aside adult-oriented web services, which profitably sell gambling and porn subscriptions to individual consumers, there hasn’t been much uptake of paid text content by individual consumers. But after many years of sellers trying and consumers rarely buying, the conditions might finally be ripe for pay-per-view content to emerge as a viable alternative for search-weary consumers.
1. Content prices are dropping. Individuals can buy an annual subscription to Highbeam Research’s eLibrary, consisting of full-text documents from 2,600 sources, for under $100 per year. News junkies can subscribe to KeepMedia’s full-text service, covering 150 popular magazine and news titles, for under $50 per year. At some point, prices like these become attractive to busy consumers.
2. Users are becoming much more comfortable with online financial transactions. E-commerce backends are becoming better, faster and more reliable, enabling a more dependable user shopping experience. If consumers are comfortable buying books online, how much more will it take to sell them a journal article?
3. Users are growing accustomed to paying for content through pay-per-view movies and value-added cable services. The idea of pay-per-view is gaining acceptance.
4. Users have come to expect the Internet to deliver just-in-time information instantly, on a 24/7 basis, and they leave even less time than they used to for finding it. As consumers become ever more accustomed to paying for time-saving convenience in other aspects of their lives, it’s reasonable to think that they may also pay for the convenience of information delivered instantly, with no muss or fuss.
Are all these changes good for searchers?
Ask any librarian: search engines have opened a door to the world’s information that simply didn’t exist a decade ago. And search engines have enabled access to much of that high quality information – like government publications, statistics, health information, position papers, homework helpers, curriculum support materials, templates, forms, and numerous other resources that have helped people improve their lives.
The “price” for finding that information has been the occasional ad, and most people accept that in order to provide a valuable free service, some economic benefit has to accrue to the provider. Up to now, these ads have stayed mostly in the background of commercial tools, but that is shifting, with search engine ad presence expected to increase substantially every year for the next several years. Although the Internet will still be a source of much high-quality, enriching information, it is also about to cement its position as yet another mass medium – one in which advertising and sponsorship extend far beyond the periphery and infuse the actual content.