Google's Webmaster Tools blog has just published a useful presentation, which provides advice on getting your pages crawled and indexed by the search engine.

Basically, the Googlebot can only crawl and index a small proportion of all the content online, so streamlining your site to reduce unnecessary crawling can optimise the speed and accuracy of your indexing.

Here are some of Google's tips; there is more detail in the full slideshow...

  • Remove user-specific details from URLs.
    For faster crawling and indexing, remove details that are specific to the user, such as session IDs. This reduces the number of URLs pointing to the same content and speeds up indexing.
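    For instance, a session ID embedded in the URL creates many addresses for a single page (the URLs below are hypothetical):

    ```text
    # Two URLs, one page — each visitor gets a different session ID:
    https://www.example.com/product/widget?sessionid=8f3a2c
    https://www.example.com/product/widget?sessionid=b71d09

    # With the session ID removed, one URL serves the content:
    https://www.example.com/product/widget
    ```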
  • Look out for infinite spaces
    By 'infinite spaces' Google is referring to large numbers of links with little new content to index. This could be a calendar with links to future dates or an e-commerce website's filtering options which can produce thousands of unique URLs.

    All these extra links mean that the Googlebot is wasting its time trying to crawl all these URLs. Google has some suggested fixes, such as adding the nofollow attribute to such links.
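    As a sketch, a faceted-navigation link marked with the nofollow attribute might look like this (the URL and filter parameters are hypothetical):

    ```html
    <!-- One of thousands of filter combinations with little new content;
         rel="nofollow" asks Googlebot not to follow the link -->
    <a href="/shoes?colour=red&amp;size=9&amp;sort=price" rel="nofollow">Red, size 9, by price</a>
    ```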

  • Disallow actions that Googlebot can't perform
    Googlebot cannot log in to pages or submit contact forms, so using the robots.txt file to disallow these URLs saves the time that would otherwise be wasted attempting to crawl them.
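    A minimal robots.txt sketch, assuming the login and contact pages live at these hypothetical paths:

    ```text
    # Keep Googlebot away from pages it cannot act on
    User-agent: Googlebot
    Disallow: /login
    Disallow: /contact-form
    ```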

  • Watch out for duplicate content
    According to Google, the closer you can get to one unique URL for each unique piece of content, the more streamlined your website will be for indexing and crawling.

    This is not always possible, so indicating the preferred URL with the rel=canonical element, as described in this video from Matt Cutts, will address the problem.
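    As a sketch (the URLs are hypothetical), the rel=canonical element sits in the head of each duplicate page and points to the preferred URL:

    ```html
    <!-- Placed on /product/widget?ref=newsletter and other duplicate URLs -->
    <link rel="canonical" href="https://www.example.com/product/widget" />
    ```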


Published 11 August, 2009 by Graham Charlton

Graham Charlton is the former Editor-in-Chief at Econsultancy.


Comments (1)


Clayton Leis

Wouldn't disallowing all of these pages cause some of your internal link juice to disappear into these "black holes"? For example, say every page on your website linked to the 'contact us' page. If you disallowed that page, then every page on your site is passing link juice to the contact us page, but that link juice isn't being passed back through the links on that contact page.

Sure, it may not be a gigantic difference, but if you're not having indexing issues, why bother?
