Searching the Internet
Huh, I had forgotten that this page exists...but somehow 3000+ visitors
have sneaked in behind my back :o} Now it's updated.
Talking about
Search Engines
These robots or spiders fetch every document whose URL they get to know
and index its contents in a huge database. Normally you can
type one or more keywords in a form, separated by spaces, and after confirming
you get a list of documents that contain these keywords. Your choice of
keywords is called a query.
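To make the idea of "indexing" a bit more concrete, here is a minimal sketch of such a database in Python. It is only an illustration: the three pages and their URLs are made up, and real search engines are far more elaborate. Like the plain keyword form described above, `search` returns every page that contains at least one of the keywords.

```python
from collections import defaultdict

def build_index(documents):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in documents.items():
        for word in text.lower().split():
            # Strip punctuation that sticks to the word, e.g. "recipe:".
            index[word.strip('.,;:!?"()')].add(url)
    return index

def search(index, query):
    """Return the URLs of all pages containing at least one keyword."""
    hits = set()
    for keyword in query.lower().split():
        hits |= index.get(keyword, set())
    return hits

# Made-up pages, for illustration only.
documents = {
    "http://example.com/recipe": "Beef Stroganoff recipe: add salt and sour cream.",
    "http://example.com/igor": "Homepage of Igor Stroganoff.",
    "http://example.com/tofu": "Healthy tofu hamburger with catsup.",
}
index = build_index(documents)
print(sorted(search(index, "stroganoff salt")))
```

The index is built once, so each query only has to look up its keywords instead of reading every page again.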
The Art of Searching
The art of searching is to use keywords that lead to just the information
you desire, and not to several tens of thousands of unrelated documents.
Suppose for a moment you are searching for a recipe for Beef Stroganoff.
(Just for an example. I know your mother's Beef Stroganoff is the best
anyway.)
We're assuming you use the search engine Alta Vista in the example.
Probably your first thought would be to enter "Beef Stroganoff recipe"
and hit "Search". You would find 578,377 matches, and learn all about the
Argentinian meat industry, get to know Igor Stroganoff (janitor of the
lavatories in the St. Petersburg Ermitáge), and find a fascinating
recipe for steamed mealworms in white wine - until you finally disconnect
and prepare the delicious tofu hamburger Stroganoff with catsup
that you came across early in your five-hour session.
The first thing to try is to enter
stroganoff
in the search engine. Why not "Beef Stroganoff" ? Well, some
people call it Beef, but it's really called Boeuf and sometimes
Filet, etc. etc ... So you see that Beef would unnecessarily
narrow your options. Plus people who call it Beef are likely to use catsup
also...
As a result you will probably get 3367 entries: restaurant ads, the
homepage of Igor Stroganoff (janitor of the lavatories in the St. Petersburg
Ermitáge), and recipes for "Tofu Hamburger Stroganoff".
So the next step, instead of starting to read Igor's homepage, is to
think of something that will always turn up in a recipe, but never in a
homepage or in a restaurant ad, and include this in your query.
A good choice would be salt. Now, if in Alta Vista you
were looking for "stroganoff salt" you would get every page with
stroganoff in it, plus every page with salt. To tell Alta
Vista that both words must be on the page, search for
+stroganoff +salt
This will still include the tofu nerds. So to ensure you get only delicious
Beef Stroganoff recipes, make this
+stroganoff +salt -healthy
;o)
The "-" dismisses every page that has the word "healthy" on it...
This is a very important option. It is very likely, for instance, that
there exists a secret_plan.txt for a cloaking device for steam-driven
spaceships that several Russian scientists currently work on. Among
these are Dr. Igor Stroganoff (former janitor of the lavatories
in the St. Petersburg Ermitáge), and his wife Danuta Salt-Stroganowa.
Since they are confined to a secret gulag, none of them is healthy.
The interest in the scientific community has been immense and therefore
the plans are mirrored on hundreds of internet servers. Unfortunately these
are the first 400 matches you get for your recipe search. To get rid of
them, you can add -secret_plan to your query.
OK, this will not happen to you, but there are, e.g., some bean-counting
scientists who assemble dumb word lists that really are mirrored everywhere.
Another important option is searching for an exact phrase. Suppose
you want to find out about the cloaking device and only know that a Dr.
Stroganoff is involved. Now you will learn to hate Beef Stroganoff.
To search only for the Dr. (former janitor of the lavatories in the St.
Petersburg Ermitáge), enclose the name in quotation marks:
"Igor Stroganoff"
This tells Alta Vista to look for the exact phrase, including capitals.
These are the first things you should find out about any search engine:
- how to make a keyword mandatory ("+")
- how to exclude a keyword ("-")
- how to search for an exact phrase.
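The three options above can be sketched in a few lines of Python. This is not Alta Vista's actual code, just a toy matcher written under these assumptions: a "+" term must be on the page, a "-" term must not be, a quoted phrase must appear exactly (including capitals), and plain keywords match regardless of case. The function names and the sample pages are made up for the example.

```python
import re

# A token is an optional +/- sign followed by a "quoted phrase" or a word.
TOKEN = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')

def contains(text, term, is_phrase):
    """Phrases must match exactly; single words match ignoring case."""
    if is_phrase:
        return term in text
    return term.lower() in re.findall(r"\w+", text.lower())

def matches(query, text):
    """Decide whether a page's text satisfies a +/-/"phrase" query."""
    has_required = False
    optional_hits = []
    for sign, phrase, word in TOKEN.findall(query):
        found = contains(text, phrase or word, is_phrase=bool(phrase))
        if sign == "+":
            if not found:
                return False      # a mandatory term is missing
            has_required = True
        elif sign == "-":
            if found:
                return False      # an excluded term is present
        else:
            optional_hits.append(found)
    if has_required or not optional_hits:
        return True
    # Only plain keywords were given: any one of them is enough.
    return any(optional_hits)

# Made-up pages, for illustration only.
recipe = "Beef Stroganoff: brown the meat, add salt and sour cream."
tofu = "Healthy tofu hamburger Stroganoff with salt and catsup."
igor = "Homepage of Igor Stroganoff, janitor."

for page in (recipe, tofu, igor):
    print(matches("+stroganoff +salt -healthy", page), page)
```

With these sample pages, only the first one passes `+stroganoff +salt -healthy`, while `"Igor Stroganoff"` in quotation marks matches only the homepage.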
Most search engines support "Boolean" queries. That means you can use AND, OR
and NOT (or "!"). The ultimate professional-level, high-quality, definitive
query for the recipe could then look like this:
stroganoff AND salt AND NOT(Igor OR Danuta OR (cloaking
AND device) OR healthy)
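If the nesting of that query looks confusing, it may help to read it as an ordinary logical expression. The Python sketch below spells out the same condition for a single page; it only illustrates the logic and says nothing about how any search engine evaluates Boolean queries internally. The function names and the sample pages are made up.

```python
import re

def words(text):
    """Return the set of lowercase words on a page."""
    return set(re.findall(r"\w+", text.lower()))

def wanted(text):
    """stroganoff AND salt AND NOT(Igor OR Danuta OR
    (cloaking AND device) OR healthy), written out in Python."""
    w = words(text)
    unwanted = (
        "igor" in w
        or "danuta" in w
        or ("cloaking" in w and "device" in w)
        or "healthy" in w
    )
    return "stroganoff" in w and "salt" in w and not unwanted

# Made-up pages, for illustration only.
recipe = "Beef Stroganoff: brown the meat, add salt and sour cream."
plan = "Cloaking device plans by Dr. Igor Stroganoff and Danuta Salt-Stroganowa."

print(wanted(recipe))
print(wanted(plan))
```

The parentheses matter: `cloaking AND device` only rules out pages that contain both words, so a recipe that merely mentions a kitchen device would still get through.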
A final, very important tip: Don't try to find anything
in your mother tongue unless it happens to be English...
The Search Engines
- HotBot
  This is what I use regularly. Has the most up-to-date documents in
  its database. There is a feature to search for specific data (images, videos...)
  or to limit the search to specific domains (Asia, Europe...) just by point
  and click. To use a Boolean query you have to explicitly choose that option
  in the drop-down list.
- Alta Vista
  The best all-purpose search engine. Very up-to-date and comprehensive.
  It uses the easy +/- syntax for inclusion / exclusion discussed above.
  Recently a very useful feature, "Refine search", has been added. Enter
  a naive search (Beef Stroganoff recipe), click "Refine" and it
  will actually make a list of all the subjects this leads to (like tofu,
  cloaking device, etc.). You can simply check phrases that you want
  to exclude from or include in your search.
- Inference Find
  This is a meta search engine. It queries several search engines
  and combines the results with some reasoning behind it. This will find
  documents a single search engine may overlook, but it will also find unrelated
  (useless) documents.
- Lycos
  A dinosaur. Has the most documents of any age in its database.
- Yahoo!
  Best if you have no keywords but like to browse categories. Note that
  this is not information research, but surfing :)
- DejaNews
  This is a usenet archive. It archives the usenet discussion
  groups (newsgroups). You should search this if you don't seek general information,
  but answers. It is very likely that someone has asked your question in
  a newsgroup already. By default it searches only in new articles. Switch
  to "Past articles" if you are not researching something very recent, because
  there are many more articles.
Tip: to find the answers, and not only the desperate postings of guys
who share your problem but never got an answer, include the "word" Re:
in your query. This turns up in the header of any regular answer.
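The reason this tip works is that replies conventionally carry a subject line starting with "Re:". As a rough illustration, the Python sketch below filters a list of subject lines the same way. The subjects are invented, and this is not how DejaNews itself processes a query.

```python
def is_reply(subject):
    """True if a subject line looks like an answer ("Re: ...")."""
    return subject.strip().lower().startswith("re:")

# Invented subject lines, for illustration only.
subjects = [
    "Stroganoff sauce curdles, help!",
    "Re: Stroganoff sauce curdles, help!",
    "RE: Stroganoff sauce curdles, help!",
    "Recipe wanted: mealworms in white wine",
]

replies = [s for s in subjects if is_reply(s)]
print(replies)
```

Note that the check looks for "Re:" including the colon, so a subject that merely starts with "Recipe" is not mistaken for an answer.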
Archie "Gateways"
Archie is a service that maintains a database of files that you can get
via FTP. You can't search FTP sites with the web search engines. Different
Archie servers know the files on different FTP servers. If you need a file
and know it by name, try one of the Archie gateways below. Each offers
various Archie servers to query through a convenient form.
For Windows 95 there is a very convenient freeware program that makes finding
files easy:
fpArchie
It looks and feels just like the Win95 find files dialog, but it searches
files on the internet instead of on your local disk. It also has built
in FTP to download the files.
And what about you ?
Can you yourself be found on the web? Having a page with no links to
it is like living in a house in the woods that no road leads to. The
easiest way to get a presence on the web is to submit your URL to any or
all of the search engines above. The search engine will then visit your
URL and all documents linked to it in the following weeks. A convenient
way to submit your URL to a multitude of search engines is to use
Submit It!
You should also add your URL to your email and newsgroup signature. You'd
be surprised how nosy people are.
in hoc signo
vincimus
I am a secret ad for Retriever.
Just hoping that people are really that nosy.
© Dirk Djuga
1996, 1998