
Internet Search Tools Explained


1) Search Engine indexes are compiled automatically.

2) Subject Directory indexes are compiled by people.

3) "Meta" search tools sample the databases of several search engines.

4) Some engines/directories allow you to search for particular Internet resources.

5) All search tools index only a portion of what is available on the Internet.

6) The way your results are ordered will differ from search tool to search tool.

Related Links


1) Search Engine indexes are compiled automatically.

Search engines, such as those linked from the Internet Search Tools page, use a software program called a spider, crawler, or bot, which fetches web pages and follows the hypertext links on those pages to reach new pages. A second program creates an index from the information gathered by the spider, and a third lets you search that index.

Many search engines, such as HotBot, AltaVista and Excite, provide hierarchically arranged subject directories (see #2 below), in addition to an interface for searching their web indexes.

When you use a search engine to "search the Internet," you are actually searching an index that the spider has already compiled. You don't search the live Web; you search an index of Web pages. Using a search engine is not much different from searching any other database. (See #3 of Internet Search Strategies.)
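The three programs described above (spider, indexer, search interface) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the miniature "web" in PAGES, its URLs, and the function names crawl, build_index, and search are all hypothetical, and real engines are far more elaborate.

```python
from collections import defaultdict, deque

# Hypothetical miniature "web": page URL -> (page text, outgoing links).
PAGES = {
    "a.example/home": ("library search tools",
                       ["a.example/engines", "a.example/directories"]),
    "a.example/engines": ("search engines use a spider", ["a.example/home"]),
    "a.example/directories": ("subject directories are compiled by people", []),
    "a.example/orphan": ("no page links here", []),  # nothing links to this page
}

def crawl(start):
    """Spider: follow links breadth-first from a start page, collecting page text."""
    seen, queue, collected = {start}, deque([start]), {}
    while queue:
        url = queue.popleft()
        text, links = PAGES[url]
        collected[url] = text
        for link in links:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return collected

def build_index(collected):
    """Indexer: map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in collected.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Search interface: return the pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = build_index(crawl("a.example/home"))
print(sorted(search(index, "search")))  # pages reached by the spider that mention "search"
print(sorted(search(index, "links")))   # empty: the orphan page was never crawled
```

Note that the search function never touches PAGES. It consults only the index, so a page the spider never reached cannot appear in the results, however relevant it is.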

 

2) Subject Directory indexes are compiled by people.

Subject directories (catalogs/collections/guides/indexes/libraries), such as those linked from the Internet Search Tools page, are compiled by people. Editors or contributors find web pages and arrange them hierarchically by topic. Since people can't work as quickly as machines, the database of a directory is usually much smaller than the database of a true search engine, which is compiled automatically. (See #1 above.)

Some subject directories, such as Yahoo, fall back on a search engine's index after consulting their own directory index.

When you use a subject directory to "search the Internet," you are actually searching an index that people have already created. You don't search the live Web; you search an index of Web pages. Using a subject directory is not much different from searching any other database. (See #3 of Internet Search Strategies.)
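A hand-built, hierarchical directory can be pictured as nested topics with a short list of entries under each one. The sketch below is a minimal illustration only: the DIRECTORY contents, the example URLs, and the function name search_directory are hypothetical and do not describe how any particular directory is implemented.

```python
# Hypothetical hand-built directory: nested topics, with lists of
# (title, URL) entries at the lowest level.
DIRECTORY = {
    "Science": {
        "Biology": [("Cell Biology Primer", "http://bio.example/cells")],
        "Astronomy": [("Star Charts", "http://astro.example/charts")],
    },
    "Arts": {
        "Music": [("Jazz History", "http://music.example/jazz")],
    },
}

def search_directory(node, word, path=()):
    """Walk the hierarchy; return (topic path, title, URL) for each entry whose
    title or enclosing topic names contain the word."""
    matches = []
    if isinstance(node, dict):
        # Interior node: descend into each subtopic, extending the topic path.
        for topic, child in node.items():
            matches += search_directory(child, word, path + (topic,))
    else:
        # Leaf: a list of entries that an editor filed under this topic path.
        needle = word.lower()
        for title, url in node:
            in_title = needle in title.lower()
            in_topic = any(needle in topic.lower() for topic in path)
            if in_title or in_topic:
                matches.append((" > ".join(path), title, url))
    return matches

for hit in search_directory(DIRECTORY, "science"):
    print(hit)
```

Only entries that someone has filed in the hierarchy can be found, which is why a directory's coverage depends on how much its editors have had time to review.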

 

3) "Meta" search tools sample the databases of several search engines.

"Meta" search tools, such as those linked from the Internet Search Tools page, allow you to search the indexes of several search tools simultaneously. Those search tools can be engines or directories.

While in theory a meta engine should provide a very efficient and thorough search, there are some drawbacks. A meta search engine retrieves only a sample, or percentage, of the available results from each search tool's index. In addition, since each search tool has its own search syntax, the same query may produce different results in one engine than in another. Another issue to consider is how efficiently the meta engine processes the results. There is bound to be some overlap among the indexes consulted, so a meta engine that doesn't eliminate duplicate URLs or collate the results usefully may waste your time.
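The sampling, duplicate removal, and collation issues can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the engine names, the example URLs, the per_engine cutoff, and the ordering rule are hypothetical, not a description of how any real meta search tool works.

```python
def meta_search(result_lists, per_engine=3):
    """Take only the top `per_engine` hits from each tool, then merge them,
    collapsing duplicate URLs into a single entry."""
    merged = {}
    for engine, urls in result_lists.items():
        for rank, url in enumerate(urls[:per_engine], start=1):
            # Crude normalization (an assumption): ignore a trailing slash.
            key = url.rstrip("/")
            entry = merged.setdefault(
                key, {"url": url, "engines": [], "best_rank": rank}
            )
            entry["engines"].append(engine)
            entry["best_rank"] = min(entry["best_rank"], rank)
    # Collate: URLs reported by more engines first, then by best rank, then by URL.
    return sorted(
        merged.values(),
        key=lambda e: (-len(e["engines"]), e["best_rank"], e["url"]),
    )

# Hypothetical result lists, already ranked by each underlying tool.
RESULTS = {
    "engine_a": ["http://x.example/", "http://y.example/page",
                 "http://z.example/", "http://deep.example/"],
    "engine_b": ["http://y.example/page", "http://x.example",
                 "http://w.example/"],
}

hits = meta_search(RESULTS)
for hit in hits:
    print(hit["url"], hit["engines"])
```

In this sketch the fourth result from engine_a never reaches the merged list, because only the top three hits are sampled from each tool, and the two spellings of the x.example address are reported once rather than twice.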

Meta search tools are especially useful when you are researching topics for which not much material is (or seems to be) available.

 

4) Some engines/directories allow you to search for particular Internet resources.

The Internet includes several different kinds of resources, of which the World Wide Web is only one. (See Internet/World Wide Web Concepts.)

If you are interested in finding material from a particular type of Internet resource, Usenet (message boards or newsgroups) for instance, you may be able to limit your search to those resources using a search engine, or there may be a specialized search tool available. Check the Internet Search Tools page for some possibilities, and make sure you take some time to explore the capabilities of the search tools you use.

 

5) All search tools index only a portion of what is available on the Internet.

Even the search engines with the largest indexes (see #1 above) cover well under half of the resources available on the Web.

According to a 1999 article in Nature by Lawrence and Giles, the estimated number of publicly accessible Web pages in February 1999 was 800 million. A year later, Search Engine Showdown estimated that the search engine with the largest database, Fast Search, indexed a bit over 300 million pages. Even without accounting for a year's worth of Internet growth, that's only about 37.5%.
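The percentage above is simple division, shown here in Python using the two figures quoted in the text. The function name coverage_percent is just a label chosen for this illustration, and 300 million is used as a round figure for "a bit over 300 million."

```python
def coverage_percent(indexed_pages, total_pages):
    """Share of the estimated Web covered by one index, as a percentage."""
    return 100 * indexed_pages / total_pages

estimated_web_pages = 800_000_000  # Lawrence and Giles estimate, February 1999
largest_index_pages = 300_000_000  # round figure for the Fast Search index a year later

print(coverage_percent(largest_index_pages, estimated_web_pages))  # prints 37.5
```

Because the 800 million estimate is a year older than the index figure, the true coverage at the time was presumably lower than this calculation suggests.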

You will need to use several tools to conduct a reasonably thorough search. (See #5 of Internet Search Strategies.)

 

6) The way your results are ordered will differ from search tool to search tool.

Be aware that due to the size of the indexes of most search engines, the order in which your results are presented is important. You do not want to waste time wading through irrelevant results in order to find the "good hits."

The order in which a search engine presents results can be influenced by sites (usually commercial sites) that use metatags (information in the HTML that describes the page but is not displayed) and invisible text (text the same color as the background) to misrepresent their content or to increase their chances of being indexed and of coming up in particular searches. In at least one case, GoTo.com, the search engine itself sells placement positions to advertisers.
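To see why hidden text can distort rankings, consider a deliberately naive ranking rule that simply counts how often the search word appears on a page. The sketch below is a toy illustration only: the scoring rule, the sample pages, and the function names are assumptions, and no real search engine is claimed to rank this way.

```python
def naive_score(query, page, count_hidden=True):
    """Count occurrences of the query word in the page's visible text,
    and optionally in its hidden text (metatags, invisible text)."""
    text = page["visible"]
    if count_hidden:
        text += " " + page["hidden"]
    return text.lower().split().count(query.lower())

def rank(query, pages, count_hidden=True):
    """Order page URLs from highest to lowest naive score."""
    ordered = sorted(
        pages,
        key=lambda p: naive_score(query, p, count_hidden),
        reverse=True,
    )
    return [p["url"] for p in ordered]

# Hypothetical pages: one genuinely about hiking, one stuffed with hidden keywords.
SAMPLE_PAGES = [
    {"url": "guide.example",
     "visible": "a guide to hiking trails and hiking gear",
     "hidden": ""},
    {"url": "shop.example",
     "visible": "buy shoes online",
     "hidden": "hiking " * 10},
]

print(rank("hiking", SAMPLE_PAGES))                      # hidden text counted
print(rank("hiking", SAMPLE_PAGES, count_hidden=False))  # visible text only
```

When the hidden text is counted, the keyword-stuffed page outranks the page that is actually about the topic; when only visible text is counted, the order reverses.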

Do some homework. The ability of a search tool to provide accurate and relevant results is just as important as, if not more important than, the number of pages it indexes. The order in which the tools are listed on the Internet Search Tools page takes some of these issues into account.

 

If you have questions about evaluating Internet search tools, ask the Reference Librarian.

 


---------------------------------------------
Page maintained by: John Cosgrove
Lucy Scribner Library, Skidmore College
Last updated: October 11, 2002