
If you have not already read about how search engines work, you may want to do so now: it helps to understand how search engines work in general before we discuss the differences between how deep search engines and directories rank web pages. (By the way, if you missed our discussion of the difference between deep search engines and directories, read it now.)
How Deep Search Engines Rank Web Pages
All deep search engines use a robot, spider or crawler (different words, same function) to build and maintain their databases. Spiders automatically visit and index each web page after it is submitted to them. They also often follow any links on the submitted page to discover and index additional pages, thereby going "deep" into the web site and beyond (if any of the links go to other web sites). Hence the name "deep" search engines. One defining result of this deep approach is that these databases usually include separate entries for each web page within a site. Consequently, they are the largest and most detailed summaries of the WWW.
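The link-following behavior described above amounts to a breadth-first traversal. Here is a minimal sketch over a made-up in-memory "web" (all URLs and page text are invented for illustration; a real spider fetches live pages over HTTP and parses their HTML):

```python
from collections import deque

# A toy in-memory "web": URL -> (page text, list of linked URLs).
# Every URL and snippet of text here is invented.
WEB = {
    "site.com/index":    ("Welcome to our site", ["site.com/about", "site.com/products"]),
    "site.com/about":    ("About us",            ["site.com/index"]),
    "site.com/products": ("Our products",        ["site.com/widget", "other.com/home"]),
    "site.com/widget":   ("The widget",          []),
    "other.com/home":    ("Another site entirely", []),
}

def crawl(start_url, web):
    """Index the submitted page, then follow its links breadth-first,
    going 'deep' into the site (and beyond, if any links leave it)."""
    index = {}                   # the engine's database: one entry per PAGE
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in index or url not in web:
            continue             # already indexed, or not reachable
        text, links = web[url]
        index[url] = text
        queue.extend(links)      # discover additional pages to visit
    return index

database = crawl("site.com/index", WEB)
# Every page reachable from the submitted page gets its own entry,
# including the page on a different site.
```

Note that submitting a single page is enough for the whole site (and a linked page on another site) to end up in the database, each page as a separate entry.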
Spiders typically "read" the entire web page, or whatever portion its creators deemed most important (the top third of the page, for example). While the elements that are analyzed vary among individual deep search engines, some of the most commonly examined characteristics of a web page include its:
- Title
- Keyword Density
- Emphasized Text
- Description and Keywords META Tags
- URL (Uniform Resource Locator)
- Popularity (# of links to the page)
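Of these elements, keyword density is the one that benefits most from a concrete definition. One common formulation, occurrences of the keyword divided by the total number of words on the page, can be sketched as follows (the exact formula varies from engine to engine):

```python
import re

def keyword_density(page_text, keyword):
    """Keyword density as a fraction: how many of the page's words
    are the keyword. (One common definition; engines differ.)"""
    words = re.findall(r"[a-z0-9]+", page_text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

# Invented sample text: "widgets" appears 3 times out of 9 words.
text = "Cheap widgets. Our widgets are the best widgets online."
density = keyword_density(text, "widgets")  # 3/9, i.e. about 0.33
```

A density that is unnaturally high is exactly the kind of signal the spam detectors discussed below look for.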
From this automatic scan, an entry is created and stored in the search engine's database. Sometimes this entry includes all the text on the web page. Some spiders choose to ignore META tags; others exclude URLs from their database. We will look carefully at each of these web page elements when we learn about how to prepare web pages for deep search engines.
Deep search engines are far less selective in building their databases than directories are. Since the decision to include or exclude specific web pages is made by a robot, every submitted web page is automatically accepted, provided it does not employ any obviously underhanded tricks to artificially boost its relevancy (these shady techniques are collectively referred to as "spamming"). The people who program the robots make sure to incorporate spam detectors. Pages that use known spam techniques are either excluded from the index entirely, or penalized later by the ranking algorithm. The ongoing development of spam detectors to counter the latest spam techniques is much like the arms race between police radar guns and the radar detectors drivers buy to evade them.
Each deep search engine also features a unique ranking algorithm. Remember, ranking algorithms only spring into action after the database is created and once a search is initiated. They search the database as it exists at that moment for matches to the keywords entered by the surfer. Different algorithms assign different weights to the elements of each web page in their particular database. Most deep search engine algorithms place the greatest emphasis on titles and keyword densities.
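As a rough illustration of such a weighted algorithm, here is a toy scoring function. The elements and weights below are entirely invented (real engines keep their algorithms secret), but the structure, a weighted sum over page elements with the title and keyword density weighted heavily, reflects the idea described above:

```python
# Invented weights for illustration only; each real engine uses its own.
WEIGHTS = {"title": 3.0, "density": 2.0, "emphasized": 1.0, "popularity": 0.1}

def score(page, query):
    """Toy ranking: a weighted sum of a few page elements."""
    q = query.lower()
    s = 0.0
    if q in page["title"].lower():              # query appears in title
        s += WEIGHTS["title"]
    s += WEIGHTS["density"] * page["density"].get(q, 0.0)
    if q in (w.lower() for w in page["emphasized"]):
        s += WEIGHTS["emphasized"]
    s += WEIGHTS["popularity"] * page["links_in"]  # crude popularity
    return s

# Two invented database entries, already reduced to their indexed elements.
pages = [
    {"title": "Widget World",  "density": {"widget": 0.05},
     "emphasized": ["widget"], "links_in": 4},
    {"title": "Gadget Corner", "density": {"widget": 0.01},
     "emphasized": [],         "links_in": 10},
]
ranked = sorted(pages, key=lambda p: score(p, "widget"), reverse=True)
# The page with the query in its title and a higher keyword density
# outranks the more popular page, because of how the weights are set.
```

Changing the weights reorders the results, which is precisely why the same search phrase returns different rankings on different deep search engines.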
How Directories Rank Web Pages
Directories differ dramatically from deep search engines in how they create their databases. Directories:
- employ humans to visit each submitted web site and decide whether or not to include the candidate in their database;
- typically allow only one database entry per web site (usually the home page), instead of one entry for every web page;
- rely on a more detailed submission procedure to capture information for their databases;
- do not send robots out to scan web pages.
This human-driven approach introduces additional acceptance criteria, such as logical, attractive and informative design. Directory databases are much smaller and contain mostly high quality web sites (Yahoo, for example, has about 750,000 web sites in its database, compared with the 50 to 100 million web pages in deep search engine databases). Directories also offer the additional ability to find web sites by browsing categories instead of typing in search phrases.
On the minus side, obscure topics are less likely to be located in directories. Acceptance criteria are more subjective, and submission procedures for webmasters are more time-consuming. The time it takes for a submitted web site to appear in an online directory is also usually longer than for a deep search engine, since the review process is not automated.
From a webmaster's point of view, the most important fact to keep in mind about directory databases is that they contain only the descriptive information provided during the submission process (typically a title, description and category). They do not contain any of the elements of web pages that deep search engine robots index, such as META tags, keyword density, etc. For deep search engines, the design of your web page, not the submission procedure, determines your rankings. The exact opposite is true for directories: it is the submission procedure, and not the design of your page, that will determine your positions.
Of course, the design of your web page IS important for directories, too, because a human being will be visiting your site and deciding whether or not to include it in their database in the first place. However, the design considerations for directories are more aesthetic (i.e. logical and attractive presentation) than technical (META tags, keyword density, etc). Effective web page design must take into account both aesthetic and technical characteristics in order to be included and ranked well in directories and deep search engines.