Major Internet Retrieval Tools

Maintained by Heting Chu. Last update on September 2, 2009


1. Singular Internet Retrieval Tools
2. Meta Internet Retrieval Tools
3. Retrieval Tools for Specific Types of Information
4. Retrieval Tools for Images and Videos
5. Retrieval Tools for Blogs, News, or Subject-Specific Information
6. Retrieval Tools for Non-Web Materials

1. Singular Internet Retrieval Tools

1.1A. Focused on searching (The Big Four)

Google http://www.google.com

Created by Larry Page and Sergey Brin, the beta version of Google was released in February 1999. Its "I'm Feeling Lucky" button takes the searcher directly to the one site that its ranking algorithm judges most relevant to a query. The relevance of a site is measured by, among other things, examining how many other sites point to it and how important those sites are. It has been rated as the top search engine in recent years. At present, it is said to handle more than 60 percent of all Internet searches, and it introduces new services regularly. Other services and products Google offers include Google Book Search and Google Scholar. Google Labs showcases what Google is experimenting with.
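The link-counting idea described above can be illustrated with a minimal sketch. It is not Google's actual algorithm, which this entry describes only in outline; the site names, the damping value of 0.85, and the function name link_scores are all assumptions made for illustration. Each site passes a share of its own score to the sites it points to, so a link from an important site counts for more than a link from an obscure one.

```python
# Hypothetical link graph: each site maps to the sites it links to.
links = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}


def link_scores(links, damping=0.85, iterations=50):
    """Score sites by incoming links, weighting each link by its source's score."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    scores = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every site starts each round with a small baseline score.
        new = {p: (1.0 - damping) / n for p in pages}
        for source in pages:
            targets = links.get(source, [])
            if targets:
                # A site splits its score evenly among the sites it links to.
                share = damping * scores[source] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A site with no outgoing links spreads its score over all sites.
                share = damping * scores[source] / n
                for p in pages:
                    new[p] += share
        scores = new
    return scores


scores = link_scores(links)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy graph, c.example is pointed to by three sites and comes out on top, while d.example, which no site points to, comes last.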

Yahoo! http://www.yahoo.com

Yahoo! was the best known and most popular directory service before it evolved into an Internet portal. Tens of thousands of Internet sites are listed within an easy-to-use, comprehensive subject hierarchy or taxonomy. The lists feature short descriptions of the sites. Yahoo! is one of the best places to start locating information on the Internet. Its search service has also developed over the years. Y!Q Beta for contextual search and Creative Commons Search Beta are two of the many search services Yahoo! provides. Yahoo! also participates in the Open Content Alliance project. On the other hand, Microsoft will license Yahoo!'s search technologies if the deal the two companies made in July 2009 is given regulatory approval.

Bing http://www.bing.com

Bing, unveiled in May 2009, is the third search service Microsoft has developed, following its two predecessors, Live Search and MSN Search. Bing is gaining popularity thanks to the enhanced capabilities of this implementation. It offers some unique features such as a preview of search results in a pop-up window. In addition, Bing will power Yahoo! Search if the deal the two companies made in July 2009 is given regulatory approval.

Ask http://www.ask.com

Ask made its debut on the Web in 1998 and was formerly known as Ask Jeeves. It is the first search tool that allows users to enter a question as is (e.g., Why is the sky blue?) as a search query, although its initial setup was based on questions and answers developed by human beings. It has experienced ups and downs since its establishment. It also hosts a search site designed specifically for kids at http://www.askforkids.com.

1.1B. Focused on searching (The Old Magnificent Seven)

AltaVista http://www.altavista.com

It was developed at Digital's Research Laboratories in California, and delivered to the Web on December 15, 1995. It received favorable reviews for years before other major search engines came out. It is the first site that introduced the language translation feature. It has been one of the major search tools of its kind. It switched to Yahoo!'s database in March 2004.

Excite http://www.excite.com

Excite was initially a multi-purpose site from a company named Architect. It used to support concept searching. Excite also suggested terms for search modification, which was called "Zoom In". It ceased functioning as an individual search engine in December 2001, and became a property of Mindspark Interactive Network, an IAC company.

HotBot - RIP

HotBot was owned by Lycos, but since February 2005 it had simply been an interface for tools such as Ask.com. As a newcomer from the University of California in the mid-1990s, it was a comprehensive search engine that did not sacrifice speed. It also provided a variety of options for limiting searches at that time.

Go Network (Infoseek) http://www.go.com

Infoseek offered both free and subscription-based services in the beginning. It was voted by PC Computing magazine's editors as the most valuable search tool in 1995. Purchased by the Go Network and subsequently other companies, it later became a search tool with few distinctive features. The searches at its site are now powered by Yahoo!.

Lycos http://www.lycos.com

Lycos used to be a huge search tool. It has undergone great changes in the past years. One weak spot of Lycos remains that it indexes only part of the sites it covers. Sponsored sites are listed before search results, which may drive its users away.

Open Text - RIP

Open Text was a major Web-based search tool. Later, it became a site for business information under a new name, Livelink Pinstripe, and then BusinessWeb. It has now disappeared entirely from the Internet as a retrieval tool.

WebCrawler http://webcrawler.com

WebCrawler used to be a singular search engine before becoming a meta-search tool based on Google, Yahoo!, Windows Live, Ask and other popular search engines. It has kept a low profile since the very beginning, yet it remains in the very competitive field of information retrieval on the Internet. In January 1999, it became part of the Excite@Home portal. Now, it is owned by InfoSpace as a meta search tool.

1.1C. Focused on searching (Search Boomers)

AlltheWeb (Formerly known as Fast) http://www.alltheweb.com

AlltheWeb was introduced to the public in May 1999, and its main features in the past included size, speed and search filters. Multimedia searching and other features were added later. It was physically located in Norway. However, it switched to Yahoo!'s database after 2003, and is becoming a Yahoo! company with little of its own identity left.

Exalead http://www.exalead.com/search

Geographically located in France, Exalead was created in 2000 with a host of features that are usually not available from other search sites. Those features include phonetic search and approximate spelling. It is gaining a reputation among Internet searchers. Its coverage of European resources appears to be better than that of other tools based in America.

Gigablast http://www.gigablast.com

After making its debut in the giant field of Internet searching, Gigablast gradually gained fame by introducing natural language searching (Ask a question) and suggesting related terms or concepts for further search with its Giga Bits. While the latter is still supported, the former has been removed. Overall, Gigablast seems to have stopped being a major player in Web searching.

mozDex http://www.mozdex.com

mozDex launched its beta version in early spring of 2004. It uses open source search technologies to operate an open and fair search engine. It also indicates that data from Dmoz, aka Open Directory, provided the resources for starting mozDex. The Explain link after each search result lists the weighting algorithm in addition to other information. However, it has not been working since August 2008 according to a message displayed at its site when one tries a search, "We are doing some updates to our search engine. Please come back later!"

1.2. Focused on browsing

About.com http://about.com

About.com includes over 470 highly targeted topics (e.g., "Africa Travel", "Search Engines"), each overseen by a professional guide. It normally displays the full name and photo of the guide at the homepage of the topic s/he is responsible for. It was acquired by The New York Times Company in 2005.

Galaxy http://www.galaxy.com

Formerly known as EINet Galaxy, it was one of the first browsing tools available and one of the largest. It provides a well-organized and easy-to-browse topical arrangement of a wide variety of Internet resources. Since 2006, it has been focusing on classifieds and directories using a search mechanism.

Open Directory http://dmoz.org

It is a volunteer-based project as well as a self-regulating republic where experts collect recommendations, excluding noise and misinformation, for the directory. Its goal is to produce the most comprehensive directory of the Web by relying on a vast army of volunteer editors. All the volunteer editors deserve our respect and gratitude for their time and efforts on the Open Directory project, especially in this commercial society.

qbsearch.com (Formerly QuickBrowse) http://www.qbsearch.com/

Its predecessor, QuickBrowse, was created by a freelance journalist, Marc Fest, as a way of literally "stitching together" multiple webpages for faster viewing. More than 16 retrieval tools were listed for selection but final results were not merged in any way. It has now become a directory service, focusing on telecommunications resources. qbsearch also lists sites for locating people and other types of resources.

1.3. Alternative tools

Scour (Previously called AfterVote) http://www.scour.com/

When acquired by Internext Media in 2008, AfterVote became Scour. Strictly speaking, Scour is a ranking facility for search results retrieved from major tools rather than a search engine. It lets users conduct a search using Google, Yahoo!, Bing, and OneRiot, and then vote/comment on what has been retrieved.

ChaCha http://search.chacha.com

Initially labeled as People Powered Search, ChaCha has re-introduced the search intermediary dimension into the search arena. Free human search guides are just a click away via live chat. Be aware that you have to visit search.chacha.com, the so-called "Classic ChaCha", to start using it, while www.chacha.com only displays information about ChaCha itself. At present, it is moving more towards mobile searching.

Hakia http://hakia.com

Still in Beta, Hakia claims to be a semantic, meaning-based search engine. The benefits Hakia enumerates on its website include meaningful categorization of search results (i.e., galleries) and differentiation of homonyms. It can also take questions and sentences as queries in addition to the common phrases and keywords.

Quintura http://www.quintura.com/

Quintura presents its search results two ways in two parallel windows. One is the traditional list and the other is a tag cloud - a display that shows the relationships among various terms involved when one places the cursor on a term. This feature is what makes Quintura an alternative retrieval tool. Its equivalent for kids, Quintura Kids, is built on the same concept.

Top 100 Alternative Search Engines

Charles Knight and his colleagues have been compiling a list of the top 100 alternative search engines at irregular intervals since January 2007. One such list was posted in June 2009. Although the term "alternative" is defined broadly and loosely on this list, it does open a new window for viewing the recent developments in Internet searching.

1.4. Knowledge/Semantic-based engines

WolframAlpha http://www.wolframalpha.com

Labeling itself as a computational knowledge engine, WolframAlpha was formally launched in May 2009. This site, unlike regular search engines, is built on a knowledge base "curated" at Wolfram Research, which was founded by Stephen Wolfram in 1987. The search results WolframAlpha presents include tables, figures, and the like computed from objective data (e.g., facts and published knowledge) in its knowledge base. Because of these features, some people in the field of Internet searching call it a "fact engine". It represents a new way of finding information, different from the regular search engines we are familiar with. Google, on the other hand, announced its product Google Squared shortly afterwards to "steal WolframAlpha’s thunder". A video was also released to describe what Google Squared is.

Yebol http://yebol.com/

Yebol entered the Internet search market in summer 2009, claiming to be the most advanced search engine on the Internet in that it provides on one single page Top Sites, Related Topics, News, Videos, Images, and more besides the usual search results. Two short videos are loaded at Yebol's homepage to explain what Yebol is and how it differs from such major search engines as Google, Yahoo! and Bing. Yebol also promises that more features will be implemented when it comes out of beta at the end of 2009.


2. Meta Internet Retrieval Tools

2.1. Collective listing, individual searches

iTools! http://www.iTools.com

iTools!, which stands for Internet Tools!, is a collection of Internet retrieval tools listed under several categories such as search tools, language tools, and research tools. It also includes a number of not-so-common tools (e.g., Link Popularity Check).

Zuula http://zuula.com/

Zuula entered the search world in 2005. It serves as a gateway to searches in areas such as Web, images, news, blogs and jobs. Several individual tools are chosen for each area while the searcher is given the option of changing preferences, including the order of selected search engines. Note that the searcher has to conduct a search in a given area in order to see what the chosen individual tools are.

2.2. Simultaneous searches, integrated results

Beaucoup http://www.beaucoup.com

Beaucoup is a site established by one person, Teri Madden. The site claims to cover more than 2,500 retrieval tools available on the Internet in its directory besides serving as a meta search engine. In the latter case, it can query 10 search sites simultaneously. Sponsored results, sometimes very lengthy, are listed before any other results.

Clusty http://clusty.com

Derived from Vivisimo in September 2004, Clusty inherited the clustering technology and became an independent meta search engine that can categorize results into clusters. Specifically, search results from different individual retrieval sites are automatically grouped into clusters in its side bar. This clustering feature clearly separated Clusty from other meta search engines at that time. Vivisimo focuses on providing fee-based search services for enterprises while Clusty is devoted to free Internet searching.

Dogpile http://www.dogpile.com

Unlike its past practices, Dogpile now sends queries only to the Big Four (i.e., Google, Yahoo!, Bing, and Ask). Users can specify search preferences with regard to filter, language, display of results, and the like. Dogpile can serve as a good starting point for Internet retrieval tasks. It is currently part of InfoSpace.

Ixquick http://www.ixquick.com

Compared with other meta tools, Ixquick is relatively new in Internet retrieval. However, it brought some features that other counterparts did not support then. For example, it supports multilingual search in 18 languages. Choices of individual tools are available only after a search is done. In addition, ranking information from individual tools is provided next to each result. Another new feature Ixquick highlights is that it no longer stores searchers' IP addresses, for which it labels itself as "the world's most private search engine".
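How a meta tool might integrate results from several individual tools, while keeping each tool's ranking next to every result, can be shown with a minimal sketch. It is not the method of Ixquick or of any tool listed here; the engine names, URLs, the function name merge_results, and the scoring rule (summing the reciprocal of each rank) are assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical ranked result lists; engine names and URLs are placeholders.
results_by_engine = {
    "engine_a": ["http://x.example", "http://y.example", "http://z.example"],
    "engine_b": ["http://y.example", "http://x.example"],
    "engine_c": ["http://y.example", "http://w.example"],
}


def merge_results(results_by_engine):
    """Merge ranked lists into one list, remembering each engine's rank per URL."""
    ranks = defaultdict(dict)
    for engine, urls in results_by_engine.items():
        for position, url in enumerate(urls, start=1):
            # Keep the best (first) position if an engine repeats a URL.
            ranks[url].setdefault(engine, position)

    def score(url):
        # A URL ranked highly by several engines gets a larger score.
        return sum(1.0 / rank for rank in ranks[url].values())

    # Sort by descending score; ties are broken alphabetically by URL.
    ordered = sorted(ranks, key=lambda url: (-score(url), url))
    return [(url, dict(ranks[url])) for url in ordered]


merged = merge_results(results_by_engine)
for url, per_engine in merged:
    print(url, per_engine)
```

Duplicate URLs collapse into one entry, and the per-engine ranks kept with each entry are the kind of information a meta tool can display next to a result.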

KartOO http://www.kartoo.com/en_index.htm

Calling itself a visual metasearch engine, it presents retrieved results graphically. That is the most distinctive feature of KartOO. The relationship between one site and others can be visualized by moving the pointer over a result visual. KartOO recently changed its interface to force the searcher, after entering a query, to view its promotion page. The same query has to be entered again at the promotion page in order to conduct a search, which does not seem to make any sense to a serious searcher.

Mamma http://www.mamma.com

Mamma, claiming to be "the mother of all search engines", was created in 1996 in Montreal, Canada. The various search options (e.g., individual search tool selection) it used to offer are now all gone. This would have a negative impact on its competitiveness in the meta searching market. It acquired Copernic, a search company based in Quebec City, Canada, in December 2005 and lists itself as a Copernic company.

MetaCrawler http://www.metacrawler.com

Originally developed by Erik Selberg and Oren Etzioni from the University of Washington, MetaCrawler supported unique meta search and other features. For example, it provided "MetaSpy" which displayed real-time queries of MetaCrawler users. However, it is now simply one more InfoSpace search tool, possessing the same features all other InfoSpace meta search engines (e.g., Dogpile and WebCrawler) have.

Search.com (Formerly SavvySearch) http://www.search.com

Before becoming Search.com, SavvySearch covered more than 200 search sites while searching only up to 5 at a time. In addition, it listed major search categories from which users could choose and narrow down a search. Search.com, now as part of CBS Interactive, only sends queries to Google, Ask, Bing, and Open Directory. Like many other retrieval tools, it sets up tabs (e.g., images, reference) for more specific searches.

2.3. For Comparison Purposes

Bing vs Google http://bing-vs-google.com/

After a query is entered in the search box, this site lists results retrieved from Bing and Google side by side. All the bells and whistles of each search engine are shown without any modification.

BlindSearch http://blindsearch.fejus.com

BlindSearch, as its name suggests, presents search results from Google, Yahoo! and Bing in three columns at first without revealing their identity. Once the user determines which column of results best fits his/her search query by clicking on the "vote for this search engine" button displayed on top of each column, BlindSearch then reveals the name of the search engine that produced each column of results. Obviously, BlindSearch incorporates the "social search" component. On the other hand, it does not provide anything else from each search engine except results.

Google vs Yahoo!

This site allows users to enter a query, and then presents a graph showing search results that are found in both Google and Yahoo!. Each pair of results is connected with a line, and their URLs can be seen when one places the pointer on either dot at the end of the connecting line.
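Finding the results two engines have in common, which is what such a comparison graph connects with lines, can be sketched as follows. The result lists and the function name shared_results are hypothetical, and this is only an illustration of the idea, not the site's own code.

```python
def shared_results(first, second):
    """Return (url, rank_in_first, rank_in_second) for URLs found in both lists."""
    rank_in_second = {url: rank for rank, url in enumerate(second, start=1)}
    return [
        (url, rank, rank_in_second[url])
        for rank, url in enumerate(first, start=1)
        if url in rank_in_second
    ]


# Hypothetical top results from two engines for the same query.
engine_one = ["http://a.example", "http://b.example", "http://c.example"]
engine_two = ["http://c.example", "http://d.example", "http://a.example"]

pairs = shared_results(engine_one, engine_two)
for url, rank_one, rank_two in pairs:
    print(url, rank_one, rank_two)
```

Each tuple corresponds to one connecting line in the graph: the same URL with its position in each engine's list.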


3. Retrieval Tools for Specific Types of Information

3.1. Email/mail addresses, phone numbers, social networks, & more

123people http://www.123people.com

123people aggregates many different kinds of information about people, and displays them in one setting. What 123people provides covers virtually everything it can locate on the Internet about the target person: from email to snail mail addresses, from photos of the target person to those of his/her "friends" on social networks, from phone numbers to Web links, from blogs to Twitter posts. The currency of what it locates is, however, determined by the sources it uses. On the other hand, most personal information nowadays is only available from fee-based services such as US Search.

PhoneNumber.com http://www.phonenumber.com/

PhoneNumber.com is a tool for online directory information. In addition to providing traditional white and yellow page listings, it also supports reverse phone number and address searches. The same kinds of services are also available for locating area, zip, and country codes. However, what was available for free in the past has now become fee-based at this site.

Superpages http://www.superpages.com/

Known as InfoSpace previously, Superpages is a latecomer compared with other major retrieval tools in this category. Yet it offered quality and comprehensive services for locating information about individuals and businesses. Like many of its counterparts, however, it at present automatically directs users to fee-based services if the searcher wishes to get more information beyond name and a couple of other basic elements. Superpages is now owned by Idearc Media.


3.2. Maps, driving directions & more

America's Byways http://www.byways.org

As part of the U.S. Department of Transportation, Federal Highway Administration, the National Scenic Byways Program created this site in an effort to help recognize, preserve and enhance selected roads throughout the United States based on one or more archeological, cultural, historic, natural, recreational and scenic qualities. Those roads are otherwise not commonly known to the public. Browsing by state using the interactive map seems a better retrieval approach if one does not know which byway to search for.

Google Earth http://earth.google.com/

According to Google, Google Earth combines satellite imagery, maps and the power of Google Search to put the world's geographic information at your fingertips. One needs to download and install the free Google Earth software before being able to use it. Recently, Google Earth has added galaxies in outer space into its system besides what it initially offered. Google Mars is one example of such implementation which takes users beyond the planet we live on.

Google Maps http://maps.google.com

Google Maps came out in 2005 with a typical map-related search mechanism. In addition, it offers draggable maps, satellite images of the location one is searching for, and a few other nice features. For example, you can try My Maps to create and share personalized, annotated maps of your world. Google Maps also provides street views of the location being searched, although this practice sometimes causes controversy. A good number of people have built their services on Google Maps. gCensus is one such service, which presents US census data for an area you have clicked on in Google Maps.

MapQuest http://www.mapquest.com

It has been offering quality services of maps, driving directions, road planners, and more since the mid-1990s. However, like its counterpart MapBlast, which has already disappeared from the Internet, MapQuest is clearly losing its competitiveness in this rapidly evolving arena. One new feature MapQuest now supports is Gas Prices, with which the searcher is able to find the cheapest gas at a target location.

MSN Maps & Directions http://mappoint.msn.com

Microsoft purchased MapBlast, a pioneer service of its kind, and turned it into MSN Maps & Directions. It let you type in a street address and city name, then presented you with driving directions and a detailed interactive map of the area in addition to a lot of other commercial information like shops and restaurants. You could pan or zoom the map to get a better look. It now seems to have been abandoned, while the map service offered by Bing at http://maps.bing.com takes its place. The Bing service has some nifty features such as viewing a given map in 2D or 3D.

Rand McNally - Maps, Driving Directions & More http://www.randmcnally.com

Based on its quality printed products and services, Rand McNally offers comparable map and driving direction services on the net free of charge.

Yahoo! Maps http://maps.yahoo.com

Yahoo! has recently redesigned its map service. Besides offering standard maps and driving directions, Yahoo! Maps also provides live traffic information such as road work and traffic conditions. In addition, it automatically displays the geographical location you are currently at when it is launched.


4. Retrieval Tools for Images and Videos

4.1. Description-based image retrieval

AltaVista Image Search http://www.altavista.com/image/default

AltaVista introduced this service not long after the company went public. It covers image information exclusively.

Flickr http://www.flickr.com

As a Yahoo! company, Flickr claims that it is the best online photo management and sharing application in the world. Flickr is rapidly growing thanks to its Web 2.0 capabilities (e.g., tagging).

Google Image Search http://images.google.com

Google introduced image searching to its site later than other major players in the domain, but claims to be the most comprehensive image search tool on the Web. Due to Google's reputation in text retrieval, its image search service is rated highly as well. In addition, quite a few options for refining searches are available on the Advanced Image Search interface.

Picsearch http://www.picsearch.com

Picsearch is a search engine devoted to images. The options listed at its Advanced Search (e.g., images or animations, color or black & white) enable the user to fine-tune search results.

4.2. Description-based video retrieval

Blinkx http://blinkx.com/

Blinkx is one of the major players in retrieving videos. One can either search videos by keyword or simply browse video categories or thumbnail images.

YouTube http://www.youtube.com/

YouTube has become the best known video retrieval tool after Google acquired it. It provides a rich source of videos of all kinds. Moreover, it sets up a mechanism that enables users to share and evaluate videos of interest.

4.3. Description & content-based image retrieval

Bing Images http://www.bing.com/images

Microsoft started supporting content-based image retrieval at Live.com, Bing's predecessor. The user needs to conduct a description-based image search before being able to use the content-based method. To be specific, the user places the pointer on any image located via the description-based approach and the "Show similar images" link appears. Clicking on that link leads to more images similar to the one just displayed. What underlies this mechanism is the content-based approach.
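The general idea of content-based retrieval, comparing images by what they contain rather than by their descriptions, can be illustrated with a minimal sketch based on color histograms. How Bing actually determines similarity is not stated here; the tiny pixel lists, the function names, and the choice of histogram intersection as the measure are all assumptions made for illustration.

```python
from collections import Counter


def color_histogram(pixels, levels=4):
    """Share of pixels in each coarse color bucket (pixels must be non-empty)."""
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bucket: count / total for bucket, count in counts.items()}


def similarity(hist_one, hist_two):
    """Histogram intersection: 1.0 for identical color mixes, 0.0 for disjoint ones."""
    return sum(min(share, hist_two.get(bucket, 0.0)) for bucket, share in hist_one.items())


# Hypothetical tiny "images", each given as a list of (R, G, B) pixels.
images = {
    "sunrise": [(245, 125, 20)] * 5 + [(60, 30, 50)] * 3,
    "forest": [(20, 130, 40)] * 8,
}
query = [(250, 120, 30)] * 6 + [(40, 20, 60)] * 2  # a sunset-like image

query_hist = color_histogram(query)
scored = {
    name: similarity(query_hist, color_histogram(pixels))
    for name, pixels in images.items()
}
most_similar = max(scored, key=scored.get)
print(most_similar, scored)
```

No description of the images is consulted at any point; the ranking comes entirely from pixel content, which is the distinction between the content-based and description-based approaches.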


5. Retrieval Tools for Blogs, News, or Subject-Specific Information

5.1. Blogs, Twitters, & more

Bloggernity http://www.bloggernity.com

Bloggernity is a blogger search directory that allows one to either search or browse blogs at the blog or category level. This site, on the other hand, appears very commercial.

OneRiot http://www.oneriot.com

Labeling itself as a realtime search engine, OneRiot provides access to what people share on Twitter, Digg and other social sharing services within seconds. This is the site to visit if one looks for realtime information. But keep in mind that no mechanism is in place for evaluating the quality of such information. Unlike the majority of other search engines, OneRiot uses a font size of about 18 points, which is easy on searchers' eyes.

Tweepz http://www.tweepz.com

Exclusively devoted to Twitter searching, Tweepz lets searchers locate what they want using Twitter related properties. For example, one can search Tweepz by "followers", "following" and "updates". At its "advanced search", Tweepz supports searching by a specific text in the bio, a specific name, and more. Tweepz is powered by Exalead, a leading search company initially founded in France.

5.2. News

SurfWax http://news.surfwax.com

Like the surf wax that helps surfers grip their surfboard, SurfWax attempts to help websurfers get the best grip on news information from the Web. It also claims to be able to gather news information from 4,000 sources. It used to display results while one was typing a query. This feature has since been replaced by query suggestions.

5.3. Subject-Specific

GEM http://www.thegateway.org

GEM (Gateway to Educational Materials), initially funded by the National Library of Education and the U.S. Department of Education, has evolved into more than a center for educational resources. It also assumes many other functions as specified at its homepage. The quality and quantity of GEM resources, nevertheless, remain as good as before.

Healia http://www.healia.com/healia/

Healia came into existence not long ago. What distinguished Healia from other search tools for health information used to be its ability to zero in on retrieved results by pre-defined tabs (e.g., prevention, symptoms, and treatment). However, Healia now only provides filters for gender, age, and heritage (i.e., race).

Intute http://www.intute.ac.uk/

Intute, a UK-based service, provides quality Web resources for education and research in many disciplines (e.g., biological sciences, engineering, and law). Visit its homepage for a complete list of disciplines Intute covers. It supports both browsing and searching.

PubMed http://www.ncbi.nlm.nih.gov/sites/entrez

It was developed by the National Center for Biotechnology Information (NCBI) at the National Library of Medicine (NLM) to provide access to citations from biomedical literature. The information retrieved from this site is peer reviewed and authoritative. Some of the full texts can be obtained free of charge.


6. Retrieval Tools for Non-Web Materials

Lists USENET Groups
