URL Parameter Handling In GWT to Treat Overindexation - how aggressive?

LukeyJamo

Hi,

My client recently launched a new site and their index went from about 20K up to about 80K - which is a severe over indexation.

I believe this was caused by parameter handling as some category pages now have 700 pages in the results for "site:domain.com/category1" - and apart from the top result, they are all parameters being indexed.

My question is how active/aggressive should I be in blocking these parameters in Google Webmaster Tools? Currently, everything is set to 'let googlebot decide'.

KeriMorgret

Hi! Did these answers take care of your question, or do you still have some questions?

evolvingSEO

Hey There

I would use a robots meta noindex on them (except for the top page of course) and use rel = prev/next to show they are paginated.

I would prefer to do that than use WMT. Also, WMT crawl settings will stop the crawling, but not remove them from the index. Plus, WMT will only handle Google, not other engines like Bing etc. Not that Bing matters, but always better to have a universal solution.

-Dan

vmialik

Hello Search Guys,

Here is some food for thought taken from: http://www.quora.com/Does-Google-limit-the-number-of-pages-it-indexes-for-a-particular-site

Summary:

"Google says they crawl the web in "roughly decreasing PageRank order" and thus, pages that have not achieved widespread link popularity, particularly on large, deep sites, may not be crawled or indexed."

"Indexation
There is no limit to the number of pages Google may index (meaning available to be served in search results) for a site. But just because your site is crawled doesn't mean it will be indexed.

Crawl
The ability, speed and depth for which Google crawls your site and retrieves pages can be dependent on a number of factors: PageRank, XML sitemaps, robots.txt, site architecture, status codes and speed."

"For a zero-backlink domain with 80.000+ pages, in conjunction with rel=canonical and an xml-sitemap (You do submit a sitemap, don't you?), after submitting the domain to Google for a crawl, a little less than 10k pages remained in index. A few crawls later this was reduced to a mere 250 (very good job on Google's side).

This leads me to believe the indexation cap for a newer site with low to zero pagerank/authority is around 10k."

Another interesting article: http://searchenginewatch.com/article/2062851/Google-Upping-101K-Page-Index-Limit

Hope this helps, and easy response is to limit crawling to the most needed pages as aggressive as possible to remove the unneeded links leaving only needed ones

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

URL Parameter Handling In GWT to Treat Overindexation - how aggressive?

Got a burning SEO question?

Browse Questions

Explore more categories

Related Questions

Competing URLs

Any way to force a URL out of Google index?

How to optimize an ecommerce catalog that uses parameters only

Nofollow "print" URLs?

URL Construction

URL Parameter & crawl stats

Automatic redirect to external urls

Migrating a site with new URL structure