Free Articles, Free Web Content, Reprint Articles
Saturday, June 2, 2012
 
 

The proper way to use the robots.txt file

When optimizing a web site, most webmasters don’t consider using the robots.txt file. This is a very important file for your site: it lets spiders and crawlers know what they can and cannot crawl. This is helpful for keeping them out of folders that you do not want indexed, such as the admin or stats folder, and away from content that they cannot index.

Here is a list of the fields that you can include in a robots.txt file and their meaning:

1) User-agent: Names the robot that the rules that follow apply to, or “*” for all robots (the examples show both uses).
2) Disallow: Names a file or folder to leave out of the crawl. An empty value blocks nothing.
3) #: The number sign starts a comment.
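The three fields above can be tried out with Python's standard `urllib.robotparser` module. This is only a minimal sketch: the rules are held in a string instead of being fetched from a live site, and the robot name `examplebot` and the page paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# A small robots.txt that uses all three fields.
# The rule and the paths below are made up for illustration.
ROBOTS_TXT = """\
# Keep every robot out of the stats folder
User-agent: *
Disallow: /stats/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse from memory, no network access

# "examplebot" is a hypothetical robot name; it is matched by the "*" entry.
stats_allowed = parser.can_fetch("examplebot", "http://redball.com/stats/report.html")
home_allowed = parser.can_fetch("examplebot", "http://redball.com/index.html")

print(stats_allowed)  # False: the path falls under /stats/
print(home_allowed)   # True: no Disallow rule covers it
```

The comment line is ignored by the parser, so it can be used freely to document why a rule exists.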

Here are some examples of a robots.txt file for redball.com:

User-agent: *
Disallow:

The above would let all spiders crawl all content, because an empty Disallow blocks nothing.

Here is another:

User-agent: *
Disallow: /cgi-bin/

The above would block all spiders from crawling the cgi-bin directory.
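One way to check the two simple examples above is to feed them to Python's standard `urllib.robotparser` module and ask what a robot may fetch. This is a sketch under assumptions: `load_rules` is a helper defined here, `examplebot` is a hypothetical robot, and `form.pl` and `index.html` are hypothetical files.

```python
from urllib.robotparser import RobotFileParser


def load_rules(text):
    """Parse robots.txt rules held in a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# The two examples from the article, written as strings.
allow_all = load_rules("User-agent: *\nDisallow:\n")
block_cgi = load_rules("User-agent: *\nDisallow: /cgi-bin/\n")

# Hypothetical URLs on the example site.
script_url = "http://redball.com/cgi-bin/form.pl"
page_url = "http://redball.com/index.html"

print(allow_all.can_fetch("examplebot", script_url))  # True: nothing is blocked
print(block_cgi.can_fetch("examplebot", script_url))  # False: inside /cgi-bin/
print(block_cgi.can_fetch("examplebot", page_url))    # True: outside /cgi-bin/
```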

User-agent: googlebot
Disallow:

User-agent: *
Disallow: /admin.php
Disallow: /cgi-bin/
Disallow: /admin/
Disallow: /stats/

In the above example googlebot can crawl everything, while all other spiders cannot crawl admin.php or the cgi-bin, admin, and stats directories. Notice that you can block single files like admin.php.
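The per-robot behavior described above can be confirmed with a short Python sketch using the standard `urllib.robotparser` module. The robot name `examplebot` stands in for "all other spiders" and is hypothetical, as is the `/index.html` page; note too that this parser does plain prefix matching, and real crawlers may interpret rules somewhat differently.

```python
from urllib.robotparser import RobotFileParser

# The last example from the article, unchanged.
ROBOTS_TXT = """\
User-agent: googlebot
Disallow:

User-agent: *
Disallow: /admin.php
Disallow: /cgi-bin/
Disallow: /admin/
Disallow: /stats/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse from memory, no network access

# /index.html is a hypothetical page that no rule mentions.
paths = ["/admin.php", "/cgi-bin/", "/admin/", "/stats/", "/index.html"]

# For each robot, record whether it may fetch each path.
results = {
    agent: [parser.can_fetch(agent, "http://redball.com" + path) for path in paths]
    for agent in ("googlebot", "examplebot")
}

for agent, allowed in results.items():
    print(agent, dict(zip(paths, allowed)))
```

Here googlebot matches its own entry and is allowed everywhere, while `examplebot` falls through to the “*” entry and is refused the four listed paths.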

Source: Free Articles from ArticlesFactory.com

ABOUT THE AUTHOR


Jimmy Whisenhunt is the owner of VIP Enterprises


