26 - Refining Internet Searches, Part 1


Even if you’re comfortable using Internet search engines, there may be ways you can become more efficient. One aid to fast and precise Internet searching is to know how computers think. Entering a word into a search engine commands your computer to look for that particular word in one of two types of sources: proprietary databases or the World Wide Web.

Proprietary databases are maintained by the search engine companies. They contain information about Web sites but not the sites themselves. Web page creators register their sites with the various search engine databases if they want to encourage traffic to their sites. Some search engine companies use Web bots, a type of software, to collect Web site data without requiring the owner to register.

Databases are divided hierarchically, like the table of contents in a book, and your computer must "drill down" from the top (general) to the bottom (specific) in its search. Let’s say you want some information about Earl Gray tea. The Yahoo database stores tea sites in the following manner: Society and Culture/Food and Drink/Drinks and Drinking/Tea. Look at the number of Web references contained in each level. Society and Culture has over 100,000 pages, Food and Drink has about 5,500, Drinks and Drinking has 1,100, but Tea has only 30. The lower down you go in the hierarchy, the fewer pages you have to mess with, but even 30 possible pages are still too many to sort through quickly. The question is, how can you get your search engine to provide you with about ten good choices?
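For readers who like to see an idea spelled out, here is a rough sketch in Python of what "drilling down" means. It uses the category names and approximate page counts quoted above; the nested layout and the names DIRECTORY and drill_down are my own assumptions for illustration, not a description of how Yahoo actually stores its data.

```python
# Hypothetical model of a directory-style database. Category names and
# approximate page counts come from the text above; the nested structure
# and the names DIRECTORY and drill_down are assumptions for illustration.
DIRECTORY = {
    "name": "Society and Culture", "pages": 100000,
    "children": [{
        "name": "Food and Drink", "pages": 5500,
        "children": [{
            "name": "Drinks and Drinking", "pages": 1100,
            "children": [{"name": "Tea", "pages": 30, "children": []}],
        }],
    }],
}


def drill_down(node, path):
    """Follow a list of category names from the top level down.

    Returns (name, pages) for every level visited, most general first.
    Raises KeyError if a category in the path does not exist.
    """
    if node["name"] != path[0]:
        raise KeyError(path[0])
    visited = [(node["name"], node["pages"])]
    for name in path[1:]:
        matches = [child for child in node["children"] if child["name"] == name]
        if not matches:
            raise KeyError(name)
        node = matches[0]
        visited.append((node["name"], node["pages"]))
    return visited


path = "Society and Culture/Food and Drink/Drinks and Drinking/Tea".split("/")
for name, pages in drill_down(DIRECTORY, path):
    print(f"{name}: about {pages} pages")
```

Each step down the path throws away everything outside the chosen category, which is why the page count shrinks so quickly from one level to the next.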

If you enter the word "tea" into the Yahoo search engine, it will call up about 2,000 Web pages that mention the word. Most of these have little or nothing to do with Earl Gray tea as a subject. What attracts the engine is the number of times the search word appears on a page. So let's be very precise and use the words "Earl Gray". That ought to take us straight to a very specific tea site, right? Nope. That choice brings up 185,000 possible Web pages. Why? Because there are many people in the world named Earl Gray, and, unfortunately, most of them seem to have Web sites.
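To see why counting words gives such scattershot results, here is a small Python sketch that ranks pages purely by how often the search word appears. The page snippets are made up for illustration, and the names PAGES, count_word and rank_by_frequency are hypothetical; real search engines surely weigh more than a simple count, so treat this as a simplification of the idea.

```python
import re

# Made-up page snippets for illustration only; these are not real search results.
PAGES = {
    "Tea shop": "Tea, tea and more tea: green tea, black tea, herbal tea.",
    "Earl Gray blends": "Earl Gray is a black tea flavored with bergamot.",
    "Golf tips": "Tee off early, then relax with a cup of tea.",
}


def count_word(text, word):
    """Count whole-word, case-insensitive occurrences of word in text."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text, re.IGNORECASE))


def rank_by_frequency(pages, word):
    """Return (title, count) pairs for pages containing word, highest count first."""
    scored = [(title, count_word(text, word)) for title, text in pages.items()]
    scored = [item for item in scored if item[1] > 0]
    return sorted(scored, key=lambda item: item[1], reverse=True)


for title, count in rank_by_frequency(PAGES, "tea"):
    print(f"{title}: {count}")
```

In this toy example the page that repeats "tea" most often comes out on top, while the one page actually about Earl Gray ties with a golf page that mentions tea in passing.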

What's the solution? If you've ever watched the TV series "Star Trek: The Next Generation," you've seen the character Captain Jean-Luc Picard talk to the Enterprise's computer when he wants a cup of tea. He chooses his words carefully: "Tea. Earl Gray. Hot." The science fiction computer works from the general to the specific, much like a real one. It begins at a high level in its database hierarchy and drills its way down to a specific item using the yes-no binary system. Does Captain Picard want coffee? No. Does he want wine? No. Does he want tea? Yes. Next question, the computer thinks to itself. Which type of tea? It then works its way through a list of all the possible teas in existence. Does Picard want Darjeeling tea? No. Does he want Irish Breakfast tea? No. And so on. It would be a long and tedious process for a human, but the computer performs it in a fraction of a second and gives the Captain his cup of hot Earl Gray tea.

Many search engines these days can handle common phrases such as "I'm looking for information about Earl Gray tea" by focusing on the nouns in the phrase, but if you're having problems finding what you want, try speaking to the computer in its own language. Use the set "tea Earl Gray" and see what your computer finds for you.

Each defining word you add helps the computer be more accurate in its search. To see a history of the Alamo without bringing up the many business Web sites that use the name, type the words "history Alamo." Most engines will read that as "history AND Alamo". Word order is important to a computer. In this case, the computer brain will skip the Business category and go directly to the History option because that’s listed first. Options such as "the Alamo Bowl, the history of" or "Los Alamos, the history of" could still confuse the computer, so adding "Texas" refines the search even more. The words "history Texas Alamo" bring up good sites about the Alamo mission and the battle of the same name.
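Here is a rough Python sketch of the AND idea: a page counts as a match only if it contains every word you typed. The page titles are made up for illustration, and the names TITLES and and_search are hypothetical. The sketch models only the AND part; it ignores word order and categories, which a real engine may also take into account.

```python
# Made-up page titles for illustration only; these are not real search results.
TITLES = [
    "Alamo Plumbing Supply",
    "History of the Alamo Bowl",
    "Los Alamos history",
    "Texas history: the Alamo mission and battle",
    "Alamo Street Diner menu",
]


def and_search(titles, query):
    """Return titles containing every word in query (case-insensitive, whole words)."""
    wanted = query.lower().split()
    results = []
    for title in titles:
        # Treat the colon as a separator so "history:" matches "history".
        words = set(title.lower().replace(":", " ").split())
        if all(word in words for word in wanted):
            results.append(title)
    return results


for query in ("Alamo", "history Alamo", "history Texas Alamo"):
    print(query, "->", len(and_search(TITLES, query)), "match(es)")
```

With these sample titles, each added word trims the list: four matches for "Alamo", two for "history Alamo", and one for "history Texas Alamo".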

If you're stuck, try more than one search engine. I keep four different ones bookmarked and I’ll try them all if I can’t easily find what I’m looking for. Somewhere on their home pages you should find a link to the advanced features instructions. Become familiar with them because each engine works in a slightly different way.


First published February 2002
Copyright 2002
Fred Askew