How to Create a Web Robot


Although it sounds far-fetched, blocking search engine spiders with robots is actually what a robots.txt file does. Search engines use spiders (also called robots or bots) to crawl and index your website, collecting keywords that help them decide when to show your pages in search results. A robots.txt file is a plain text file you can easily create to tell a spider that you don't want it to crawl your site, or part of it.

Things You'll Need

  • Text editor (e.g. Notepad)

Instructions

    • 1

      Open your favorite text editor. It doesn't matter what text editor you use. Notepad works just fine if you're on a PC, and can be found under "Accessories."

    • 2

      Enter two lines: one for the name of the spider that will be crawling your website, and one for the directory or file name you want to exclude from the search. This is the syntax:

      User-Agent: [Spider or Bot name]
      Disallow: [Directory or File Name]

      For example:

      User-Agent: Googlebot
      Disallow: /mywebsite/private.html

      where "Googlebot" is the robot sent out by Google, and "private.html" is the file in the directory "mywebsite" that you do not want the robot to index.

    • 3

      Exclude a section of your site from all spiders. If you do not want any robots to index a certain section of your site, use the "*" character after User-Agent. Your file would look like this:

      User-Agent: *
      Disallow: /mywebsite/private.html

    • 4

      Exclude your whole site from all robots. If you don't want any of your site to be visible to robots (e.g. if you are still building your website and it is not ready to be viewed by the public), insert a "*" character after User-Agent and a "/" after Disallow. For example:

      User-Agent: *
      Disallow: /

    • 5

      Allow all robots to access your whole site. To do this, add the asterisk after User-Agent as before, and leave the value after Disallow empty, as follows:

      User-Agent: *
      Disallow:

    • 6

      Save the file as robots.txt (the name must be exactly this, in lowercase), and place it in the root directory of your website so that it is reachable at, for example, http://www.mywebsite.com/robots.txt.
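
The steps above can also be scripted. The following is only a minimal Python sketch, not a required part of the process: the helper names build_robots_txt and save_robots_txt are made up for this illustration, and the site root is a temporary folder standing in for your real web root.

```python
from pathlib import Path
import tempfile


def build_robots_txt(rules):
    """Build robots.txt text from (user_agent, [paths]) pairs.

    Hypothetical helper for illustration. An empty list of paths
    produces an empty Disallow line, which excludes nothing.
    """
    blocks = []
    for agent, paths in rules:
        lines = [f"User-Agent: {agent}"]
        # Fall back to a single empty value when no paths are given.
        for path in (paths or [""]):
            lines.append(f"Disallow: {path}".rstrip())
        blocks.append("\n".join(lines))
    # Separate the groups for different robots with a blank line.
    return "\n\n".join(blocks) + "\n"


def save_robots_txt(site_root, rules):
    """Write the rules to <site_root>/robots.txt and return the path."""
    target = Path(site_root) / "robots.txt"
    target.write_text(build_robots_txt(rules), encoding="utf-8")
    return target


if __name__ == "__main__":
    # Assumption: a temporary directory stands in for the website root.
    with tempfile.TemporaryDirectory() as site_root:
        saved = save_robots_txt(
            site_root,
            [("Googlebot", ["/mywebsite/private.html"]), ("*", [])],
        )
        print(saved.read_text(encoding="utf-8"))
```

In practice you would pass the folder that your web server uses as the site root, so that the file ends up at the top level of your domain.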

Tips & Warnings

  • This technique is not a security measure. Pages that are not indexed can still be accessed, and the robots.txt file itself can be read by anyone. There are hundreds of bots out there, some of which will not respect your wishes and will search the restricted sections of your site anyway. Still others are designed to search only those restricted sections.

  • If you restrict your entire site while it is under construction, remember to lift that restriction when your site is ready for viewing so that it can be indexed.
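
To see how a well-behaved robot reads these rules, and why they only work when the robot chooses to check them, here is a small sketch using Python's standard urllib.robotparser module. The function name is_allowed, the bot name "ExampleBot" and the page URLs are assumptions made for this example; the rule sets are the ones from the steps above.

```python
from urllib.robotparser import RobotFileParser

# Rule sets taken from the instructions above.
UNDER_CONSTRUCTION = """\
User-Agent: *
Disallow: /
"""

OPEN_TO_ALL = """\
User-Agent: *
Disallow:
"""

GOOGLEBOT_ONLY = """\
User-Agent: Googlebot
Disallow: /mywebsite/private.html
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if a compliant robot would consider the URL crawlable.

    Hypothetical helper: it parses the robots.txt text directly rather
    than downloading it from a live site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    home = "http://www.mywebsite.com/index.html"
    private = "http://www.mywebsite.com/mywebsite/private.html"

    # While the site is under construction, every compliant robot stays out.
    print(is_allowed(UNDER_CONSTRUCTION, "ExampleBot", home))
    # Once the restriction is lifted, the same page can be indexed again.
    print(is_allowed(OPEN_TO_ALL, "ExampleBot", home))
    # A rule addressed to Googlebot does not bind other robots.
    print(is_allowed(GOOGLEBOT_ONLY, "Googlebot", private))
    print(is_allowed(GOOGLEBOT_ONLY, "ExampleBot", private))
```

Nothing in this check is enforced by your web server: a robot that never runs a test like this can still request the page, which is why the file should not be relied on to protect private content.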
