Types of search engines
There are two main types of search engines:
- Indexes built by a search crawler (or spider): the spider visits a web page, reads it,
and then follows links to other pages within the site. Everything the spider finds goes into
the index.
- Directories compiled by people, who list each page under an appropriate category.
Crawler-based engines often provide more up-to-date results, while directories provide much
more targeted results.
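As a rough illustration of how a spider builds its index, here is a minimal sketch in Python. It assumes a tiny in-memory site (the SITE dictionary, the page paths, and the crawl function are all hypothetical names invented for this example) instead of fetching real pages over HTTP, and it is not how any particular search engine is implemented.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "site": path -> HTML. A real spider would fetch over HTTP.
SITE = {
    "/": '<a href="/about">About</a> <a href="/products">Products</a>',
    "/about": '<a href="/">Home</a>',
    "/products": '<a href="/products/widget">Widget</a>',
    "/products/widget": "A widget page with no links.",
    "/orphan": "No page links here, so the spider never finds it.",
}


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(site, start="/"):
    """Visit pages breadth-first from start and return {path: content}."""
    index = {}
    queue = deque([start])
    while queue:
        path = queue.popleft()
        if path in index or path not in site:
            continue  # already indexed, or a link that leads nowhere
        index[path] = site[path]  # everything the spider reads goes in the index
        extractor = LinkExtractor()
        extractor.feed(site[path])
        queue.extend(extractor.links)  # follow links to other pages
    return index


if __name__ == "__main__":
    for path in sorted(crawl(SITE)):
        print(path)
```

Note that the "/orphan" page never reaches the index because no other page links to it, which is one reason a page may need to be submitted to a directory or linked from elsewhere.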
How to tell search engine spiders which parts of your website should not be accessed
You can use a robots.txt file, placed in the root directory of your site, to control search
spiders. The content of the file should look like this:
User-agent: *
Disallow: /yourFileName
For example, if you don't want Google's spider (Googlebot) to access "./images/flower.gif" or
any file under the "./tmp" directory, you can use the following:
User-agent: Googlebot
Disallow: /images/flower.gif
Disallow: /tmp/
For more information, see the Robots Exclusion Standard.
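To check what a set of rules actually blocks before publishing it, one option is Python's standard urllib.robotparser module. The sketch below feeds it the Googlebot rules from the example above; the RULES constant, the build_parser helper, and the "OtherBot" agent name are assumptions made for this illustration.

```python
from urllib.robotparser import RobotFileParser

# The example rules from this section, as they would appear in robots.txt.
RULES = """\
User-agent: Googlebot
Disallow: /images/flower.gif
Disallow: /tmp/
"""


def build_parser(rules_text):
    """Parse robots.txt text directly, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser


if __name__ == "__main__":
    parser = build_parser(RULES)
    # "OtherBot" is a made-up agent that no rule in RULES mentions.
    for agent, path in [
        ("Googlebot", "/images/flower.gif"),
        ("Googlebot", "/tmp/cache.html"),
        ("Googlebot", "/index.html"),
        ("OtherBot", "/tmp/cache.html"),
    ]:
        verdict = "allowed" if parser.can_fetch(agent, path) else "blocked"
        print(agent, path, verdict)
```

Because the rules name only Googlebot, other spiders are not restricted by them; use "User-agent: *" if the rule should apply to every robot.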
How to tell search engine spiders not to keep a cached version of your page
You can use a META tag to control search spiders. Place this in the <HEAD> section of your
documents:
<META NAME="ROBOTS" CONTENT="NOARCHIVE">
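To confirm that a page really carries the directive, a small check can be written with Python's standard html.parser module. This is only a sketch: the RobotsMetaParser class and the robots_directives function are hypothetical names, and the check looks only at the ROBOTS META tag, not at any other mechanism a server might use.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives listed in <META NAME="ROBOTS" CONTENT="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":  # html.parser lowercases tag and attribute names
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            for directive in attributes.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def robots_directives(html):
    """Return the set of lowercased robots directives found in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><META NAME="ROBOTS" CONTENT="NOARCHIVE"></head></html>'
    print("noarchive" in robots_directives(page))
```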
How to use META tags to promote your site
Search engines may use META tags as one component of their ranking formulas, though how much
weight the tags carry varies from engine to engine.
The following are some META tags you may want to add to your page:
- Description: a short summary of the page's content.
- Keywords: the key words and phrases relevant to the page.
- Copyright: the copyright date of the page.
- Rating: a content rating for the page.
- Robots: tells the robot whether to index the page and follow its links.
- Revisit-after: asks the robot to revisit the page after a given number of days.
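If many pages need these tags, they can be generated rather than typed by hand. The following is a minimal sketch, assuming a hypothetical render_meta_tags helper and made-up sample values; it escapes each value so that quotes or ampersands in a description do not break the markup.

```python
from html import escape


def render_meta_tags(values):
    """Turn a {name: content} mapping into one META tag per line."""
    lines = []
    for name, content in values.items():
        lines.append(
            '<META NAME="%s" CONTENT="%s">'
            % (escape(name, quote=True), escape(content, quote=True))
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Sample values are placeholders for illustration only.
    print(render_meta_tags({
        "description": "Hand-made garden tools & accessories",
        "keywords": "garden, tools, trowel",
        "robots": "index, follow",
        "revisit-after": "30 days",
    }))
```

The output is meant to be pasted into the head section of the page alongside any other META tags.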