How to Identify Google Crawl Requests
One of the best ways to identify Google crawl requests is to use the free tools at Google's Webmaster Central site. Googlebot is Google's web crawler: it fetches website pages and passes them to the Google indexer. At Webmaster Central, you can view Googlebot statistics, set crawler access, review crawl errors and summon Googlebot on demand to test your website.
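Outside of Webmaster Central, a request can also be checked on your own server. The sketch below is one possible approach, not Google's own tooling: it first looks for "Googlebot" in the User-Agent header, then confirms the requesting IP address with a reverse DNS lookup followed by a forward lookup, which is the verification approach Google describes for its crawlers. The function names and the two lookup parameters are hypothetical, added only so the check can be exercised without network access.

```python
import socket

# Genuine Googlebot hosts reverse-resolve to names under these domains.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def looks_like_googlebot(user_agent):
    """Cheap first check: the User-Agent header mentions Googlebot."""
    return "googlebot" in user_agent.lower()


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Confirm an IP with a reverse DNS lookup, then a forward lookup.

    reverse_lookup and forward_lookup are hypothetical hooks; by default
    they use the standard library's DNS functions (IPv4 only).
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        host = reverse_lookup(ip).lower().rstrip(".")
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        # The hostname must resolve back to the same IP address.
        return ip in forward_lookup(host)
    except OSError:
        # DNS failures (socket.herror, socket.gaierror) mean "not verified".
        return False
```

The User-Agent check alone is easy to spoof, so it is best treated as a filter that decides which requests are worth the slower DNS check.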
Instructions
-
Add Your Site
-
1
Navigate to Webmaster Central at google.com/webmasters.
-
2
Click the "Sign in to Webmaster Tools" button. If you are not already signed in to a Google account, you will be asked for your login credentials. Otherwise, you are taken directly to the Webmaster Tools home page.
-
3
Click the "Add a site" button. Type the website URL in the small pop-up window and press "Continue."
-
4
Follow the instructions to verify website ownership. The method that Google recommends is uploading a small HTML file to your hosting server. Three other methods are also available: adding a special meta tag to your home page's HTML, linking your Google Analytics account or adding a DNS record to your domain's configuration.
-
5
Click the "Verify" button once you have chosen your preferred method.
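If you chose the meta tag method in step 4, it can help to confirm that the tag is actually present in the page before clicking "Verify." The following is a minimal sketch, assuming the tag uses the name `google-site-verification`; the class and function names are hypothetical, and the token in the usage example is a placeholder, not a real verification value.

```python
from html.parser import HTMLParser


class VerificationTagFinder(HTMLParser):
    """Collects the content of every google-site-verification meta tag."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "google-site-verification":
            self.tokens.append(attr.get("content") or "")


def find_verification_tokens(html):
    """Return the verification tokens found in an HTML document string."""
    finder = VerificationTagFinder()
    finder.feed(html)
    return finder.tokens


# Usage with a placeholder token:
page = (
    '<html><head>'
    '<meta name="google-site-verification" content="PLACEHOLDER_TOKEN">'
    '</head><body></body></html>'
)
print(find_verification_tokens(page))
```

An empty result suggests the tag was not saved to the page, or was added to a template that your home page does not use.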
Tracking Googlebot Crawls
-
6
Sign in to the Google Webmaster Tools site and click the link with your site's URL.
-
7
Go to the left sidebar and click the "+" icon next to the "Diagnostics" label to expand its menu. Click the "Crawl stats" option to view Googlebot's recent crawl activity on your site.
-
8
Click the "Crawl errors" link in the left sidebar to see whether Googlebot has encountered any errors on your website. A common one is the "404 error," which means the server could not find a page at the requested URL.
-
9
Click the "Fetch as Googlebot" link in the sidebar. Type the URL you want Google to crawl and then click the "Fetch" button. It may take a few minutes, but the results will display in the panel below.
-
10
Go to the sidebar and click the "+" icon next to the "Site Configuration" label. Select the "Crawler access" option. On this page, you can edit the restrictions, requirements and parameters that Googlebot must follow when crawling your website for content.
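The crawl statistics and crawl errors from steps 7 and 8 can be cross-checked against your own server's access log. The sketch below assumes the log uses the common "combined" format (IP address, timestamp, request line, status code, size, referrer, user agent); the regular expression and the function name are illustrative and may need adjusting for your server's configuration.

```python
import re
from collections import Counter

# Assumed combined log format:
# ip - - [time] "METHOD path HTTP/x" status size "referrer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def summarize_googlebot(lines):
    """Count Googlebot requests by status code and list the paths that 404."""
    status_counts = Counter()
    not_found = []
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or "googlebot" not in match.group("agent").lower():
            continue  # skip unparseable lines and other visitors
        status_counts[match.group("status")] += 1
        if match.group("status") == "404":
            not_found.append(match.group("path"))
    return status_counts, not_found
```

Passing an open log file to `summarize_googlebot` returns the per-status counts and the list of missing paths. Because this filter relies only on the user agent string, which any client can imitate, the totals are a rough cross-check rather than an exact match for the figures Google reports.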