The Google Alternative
by Patrick Clapp
My homepage has been set to Google for years. The page loads quickly; other than the logo it contains no imagery; and it is free from distracting ads. I have been studying library science for eight months, and the cries of foul about Google were heard from day one. It is an acceptable source for data, a poor source for information, and a wretched source for intelligence. That said, it is a great place to start on almost any project. Not sure of a spelling? Want to get a quick splash of companies associated with the target of your inquiry? A few strengths found in Google contribute directly to it being my homepage. If the page fails to load, I have probably lost net access locally; if I have forgotten Mother’s Day, the logo artist will remind me. More importantly, Google is a great place if you already know where you want to go.
A professor of mine described a Google search as getting into a car with opaque windows, allowing the vehicle to drive you for a while before hopping out and taking a look around. You have arrived somewhere, and it might be where you want to be, but you have no idea how you got there. There are more transparent methods of finding on the web. Finding what, though? The life cycle of what most people call (inaccurately) information is made of three major parts: data, information, and intelligence. Data abounds; it is everywhere, and we are bombarded constantly with it. Data is raw, unprocessed, and unanchored. Anyone can find data; however, no one can use data until it has been transformed into information. Information is data which has been organized. Someone has to take the price of movie tickets in Boston and the price of movie tickets in New York City and perform some task of organization before developing a packet of information such as: NYC movie tickets are, on average, $1 more expensive than those in Boston. The ticket prices are the data; the result of comparing the data is the information. Information, when analyzed, becomes intelligence. "The intelligence process generates insightful recommendations regarding future events for decision makers rather than generating reports to justify past decisions." (Millennium Intelligence, Jerry Miller, 2000)
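The movie-ticket example can be sketched in a few lines of Python. This is only an illustration: the prices below are invented for the sketch, and the function name average_difference is hypothetical. The lists are the data; the single comparison that comes out of them is the information.

```python
from statistics import mean

# Hypothetical ticket prices in dollars (invented for illustration, not real data).
boston_prices = [11.00, 12.50, 12.00, 12.50]
nyc_prices = [12.50, 13.00, 13.50, 13.00]


def average_difference(prices_a, prices_b):
    """Return how much higher the average of prices_a is than the average of prices_b."""
    return mean(prices_a) - mean(prices_b)


# Organizing the raw data produces one usable statement: the information.
difference = average_difference(nyc_prices, boston_prices)
print(f"NYC tickets cost ${difference:.2f} more, on average, than Boston tickets.")
```

Deciding what to do with that statement (where to open a theater, for instance) would be the analysis that turns information into intelligence, and no snippet does that for you.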
But how does this relate to a Googling alternative? Understanding the journey and the fundamental differences in what you are collecting versus what you would like to collect is a start. There are three alternatives to the ad-heavy shotgun approach found through Google: Clustering, Meta-searching, and Invisible Resources.
Clustering is a method of near-neighbor organization that groups search results by category, enabling end users to focus on the results that interest them and set aside those that do not. A white paper on the need for clustering attributes the problems of modern-day searching not to information overload but to information overlook: we tune out so much data that we miss what we did not want to miss.
Vivisimo is an excellent clustering search engine with a clean interface and usable frames. Navigation between categories is straightforward, and pages can be loaded within the frame of the page for users who do not wish to travel far from their original search window. The bottom of the page contains transparent reports on the sources queried, and categories are collapsible. The site also contains a series of white papers on the subject of clustered searching for anyone interested. Clustering takes the first massive step toward the conversion of web data into information: organization. The next few steps are then: vet, organize, repeat.
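To make the idea of grouping results by category concrete, here is a toy sketch in Python. It is not how Vivisimo works; a real clustering engine derives its groups from the results themselves, whereas this sketch assumes a hand-written keyword list. The result titles, the category names, and the function cluster are all hypothetical.

```python
from collections import defaultdict

# Hypothetical result titles for the query "jaguar" (invented for illustration).
results = [
    "Jaguar car dealership listings",
    "Jaguar habitat in the Amazon rainforest",
    "Used Jaguar car prices",
    "Rainforest conservation and the jaguar",
]

# Assumed category keywords; a real engine would discover groupings on its own.
categories = {
    "Automobiles": {"car", "dealership"},
    "Wildlife": {"habitat", "rainforest"},
}


def cluster(titles, categories):
    """Group titles under the first category whose keywords they share, else 'Other'."""
    groups = defaultdict(list)
    for title in titles:
        words = set(title.lower().split())
        label = next(
            (name for name, keywords in categories.items() if words & keywords),
            "Other",
        )
        groups[label].append(title)
    return dict(groups)


clusters = cluster(results, categories)
for label, titles in clusters.items():
    print(label)
    for title in titles:
        print("  " + title)
```

Even this crude grouping lets a reader interested in the animal skip the car listings in one glance, which is the point of clustering.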
Meta-searching, as found at sites such as Profusion, is also known as Federated Searching. A federated search scans multiple databases and search engines in one search, weeds out duplicates, and returns results based on relevancy. There are some misconceptions about how perfectly the process achieves these goals. Beyond the crowd that shouts beware, there is a transparency to sites such as Profusion that goes past the black box that is Google. A critical element in the evaluation of online sources is transparency to the end user. We live in a society that is pushing ever closer to full disclosure. The masses desire data; they would like to make informed decisions (intelligence), but to do so they need actionable information derived from reputable data sources. Furthermore, I like the features on Profusion. The first time you use it you are hit with a barrage of highlighting (which can be disabled), graphical progress meters, and a variety of options in the categories of alerts and sorting. Meta-searching is an alternative to Google because it attempts to lift the black box and usually does it without as many targeted ads.
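The three steps of a federated search (query several sources, weed out duplicates, return results by relevancy) can be sketched as follows. This is a minimal illustration, not Profusion's method: the two result lists, their URLs, and their relevance scores are invented, and the function federated_search is hypothetical. It assumes every source reports a comparable score between 0 and 1, which real engines do not guarantee.

```python
# Hypothetical results from two sources: (url, relevance score between 0 and 1).
engine_a = [("example.com/a", 0.9), ("example.com/b", 0.4)]
engine_b = [("example.com/b", 0.7), ("example.com/c", 0.5)]


def federated_search(*result_lists):
    """Merge result lists, keep each URL once with its best score, sort by relevance."""
    best = {}
    for results in result_lists:
        for url, score in results:
            # A duplicate URL keeps only the highest score any source gave it.
            if score > best.get(url, -1.0):
                best[url] = score
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


merged = federated_search(engine_a, engine_b)
for url, score in merged:
    print(f"{score:.1f}  {url}")
```

The sketch also shows where the misconceptions come from: the merged ranking is only as trustworthy as the scores each source hands back.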
One of the secrets of success for librarians everywhere is the understanding that engines such as Google, Vivisimo, and Profusion cannot penetrate the Invisible Web. Invisible resources sound far more interesting than they are in reality, but that makes them no less valuable as a data source. These are the nooks and crannies into which the spiders of the internet cannot crawl. The Invisible Web and The Librarians' Index to the Internet are centralized resources for many sources that the search spiders cannot find. These sites are vetted by professional librarians and are organized by category. They cover a wide variety of topics, are presented through a simple and clean interface, and have connections to a wealth of data. Most people do not realize that modern search engines are limited in what they can find. Hidden web resources contain a delicious trove of treasure plundered by librarians and other information professionals.
Many people outside of the information professions believe that everything can be found for free on the web. This is a horrid assumption. The majority of the good data – waiting patiently to be organized into good information – which may someday be analyzed into good intelligence – is stored behind subscriptions and pay access. There is a cornucopia of useful data out there, however, and Google may not be the easiest way to find it. Hopefully one of the alternatives I have mentioned here will shorten your journey. If you are still banging your head against a virtual wall (or a real one), call your local librarian. If they tell you to try a Google search, drop me a line; I have a backup plan.