03 - Search Engines Part 2


I've been misusing the term "search engine" in these articles and oversimplifying a very complex subject. In reality, there are many, many kinds of search engines and dozens of different ways to retrieve information from the Internet. Here's a very brief overview of some of the other possibilities. Follow up what interests you and ignore the rest. (If you want to know more about a particular engine, do a search!)

Search engines can be categorized as subject guides, true search engines, or metacrawlers.

Subject guides primarily search a database containing registered Web pages. Anyone who creates a Web site can register it with the subject guides, but if no one registers the site, it won't show up. Each subject guide offers slightly different information, and even the same Web page may show up in numerous ways because of the methods each guide uses to index sites. Some common subject guides are Yahoo, Snap, LookSmart, and Magellan. Subject guides are most useful for superficial searches, the kind where you just need a quick fact or two about a common subject.

True search engines look at millions of Web pages. Of course, the more sites searched, the longer the search takes, the more returns you get, and the more time you spend sorting through them. Many people consider Alta Vista to be the best all-round true search engine while others prefer Excite, Lycos, HotBot, Infoseek, or Northern Light. True search engines are most useful when you have a very specific subject that will only return a few Web sites for you to consider.

There are also search engines called metacrawlers or metasearchers. They run your query through several regular search engine sites for you. You only have to enter your question a single time. Unfortunately, metacrawlers tend to allow fewer search variables than other engines so they may return many more wrong and useless sites than a tighter search on a single engine. Some common metacrawlers are Dogpile, Mamma, Inference Find, SavvySearch, and Metacrawler.
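For readers curious about the mechanics, here is a minimal sketch of the metacrawler idea: one query is handed to several engines and the answers are merged into a single list. The two "engines" below are hypothetical stand-ins that return made-up example URLs; no real metacrawler's code or any real engine's interface is shown here.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for real engines (assumption for illustration):
# each one takes a query and returns a ranked list of URLs.
def engine_a(query: str) -> List[str]:
    return ["http://example.com/scipio", "http://example.org/rome"]

def engine_b(query: str) -> List[str]:
    return ["http://example.org/rome", "http://example.net/carthage"]

def metasearch(query: str, engines: Dict[str, Callable[[str], List[str]]]) -> List[str]:
    """Send one query to every engine and merge the results, dropping duplicates."""
    merged: List[str] = []
    seen = set()
    for search in engines.values():
        for url in search(query):
            if url not in seen:  # keep only the first appearance of each URL
                seen.add(url)
                merged.append(url)
    return merged

# The query is typed once and passed unchanged to every engine.
results = metasearch("Scipio Africanus", {"A": engine_a, "B": engine_b})
print(results)
```

Because the same query string has to work on every engine, a sketch like this has no room for the special search options of any one engine, which is the trade-off described above.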

Confused? It gets worse. There are also searchable news databases such as Yahoo! News and Infoseek News. There are specialized Usenet searchers such as Deja.com and RemarQ (Usenet is a worldwide bulletin board system used by special-interest discussion groups). There are engines that are subject-specific, dealing only with sports, music, stocks, etc. There are natural-language engines that are very easy to use, like Google, C4, and Ask Jeeves. (Ask Jeeves is the best-known natural-language searcher. You can type complete sentences as if you were writing to a human rather than a computer. This is only partly successful since what the engine actually does is zero in on key words in your question and search for them. There's only a slight difference between asking Jeeves, "Who was Scipio Africanus?" and entering the words "Scipio Africanus" in another search engine.)
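The "zero in on key words" step can be pictured with a short sketch. This is only an illustration of the general idea, not how Ask Jeeves or any other engine actually works; the list of words to ignore and the function name are assumptions made up for this example.

```python
import re

# Assumed list of common question words to ignore (illustrative, not exhaustive).
STOP_WORDS = {
    "who", "what", "when", "where", "why", "how",
    "was", "were", "is", "are", "did", "do", "does",
    "the", "a", "an", "of",
}

def extract_keywords(question: str) -> list:
    """Split a question into words and keep only those not in STOP_WORDS."""
    words = re.findall(r"[A-Za-z0-9']+", question)  # drops punctuation such as "?"
    return [w for w in words if w.lower() not in STOP_WORDS]

keywords = extract_keywords("Who was Scipio Africanus?")
print(keywords)  # ['Scipio', 'Africanus']
```

The full-sentence question boils down to the same two words you would have typed into any other engine.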

There are sites that require you to fill out an online registration form before you can use their database. Normally, this has no negative consequences, but there are a few sites that will sell your information to advertisers. Registration at those sites produces a flood of junk mail, both electronic and snail. Good sites publish the details of what they plan to do with your registration information. Read the terms before signing up.

Some sites let you look at the files in their online databases for free, but charge a fee if you want to print or download those files. Other sites charge a fee for any access at all to their databases. These tend to be very specialized subjects, such as law or medicine. If the subject is one for which you need large amounts of very detailed, hard-to-find information, the fee databases may be worth it to you.

Now the good news: you don't have to know all this before you can perform routine Internet searches. The World Wide Web changes every day, literally, and it's a full-time job to keep up with all the changes. Don't even try. Sure, it's nice, in general, to understand what's available, to see the kinds of things that are out there if you really want them, but there's a point quickly reached where learning minutiae about the Internet actually gets in the way of your writing. Internet research knowledge is cumulative. Start with the basics, take baby steps, and learn only what you need to get the job done today. Remember that you're a writer, not a professional researcher.


First published March 2000
Copyright 2000
Fred Askew