URL Parameter Handling In GWT to Treat Overindexation - how aggressive?
-
Hi,
My client recently launched a new site and their indexed page count went from about 20K to about 80K, which is severe over-indexation.
I believe this was caused by parameter handling, as some category pages now show 700 results for "site:domain.com/category1" and, apart from the top result, they are all parameterized URLs.
My question is how active/aggressive should I be in blocking these parameters in Google Webmaster Tools? Currently, everything is set to 'Let Googlebot decide'.
-
Hi! Did these answers take care of your question, or do you still have some questions?
-
Hey There
I would use a robots meta noindex on them (except for the top page, of course) and use rel=prev/next to show they are paginated.
I would prefer that over using Webmaster Tools (WMT). WMT crawl settings will stop the crawling, but will not remove the URLs from the index. Plus, WMT only handles Google, not other engines like Bing. Not that Bing matters, but it's always better to have a universal solution.
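To make that concrete, here is a rough sketch in Python of the tags each URL would get. The `page` parameter name, the example.com URLs, and the `head_tags` helper are assumptions for illustration only; swap in whatever the site actually uses.

```python
from html import escape
from urllib.parse import parse_qsl, urlsplit

# Hypothetical name of the pagination parameter; adjust to the real site.
PAGE_PARAM = "page"


def head_tags(url: str, last_page: int) -> list[str]:
    """Return the <head> tags for a category URL.

    Any URL with a query string gets "noindex, follow"; only the clean
    top page stays indexable. rel=prev/next is emitted only when the
    pagination parameter is the sole parameter on the URL.
    """
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    base = f"{parts.scheme}://{parts.netloc}{parts.path}"

    tags = []
    if params:
        tags.append('<meta name="robots" content="noindex, follow">')

    # Skip prev/next for sort/filter variants; they are not part of the series.
    if set(params) - {PAGE_PARAM}:
        return tags

    raw = params.get(PAGE_PARAM, "1")
    page = int(raw) if raw.isdigit() else 1

    def page_url(n: int) -> str:
        # Page 1 is the clean category URL without any parameter.
        return base if n == 1 else f"{base}?{PAGE_PARAM}={n}"

    if page > 1:
        tags.append(f'<link rel="prev" href="{escape(page_url(page - 1))}">')
    if page < last_page:
        tags.append(f'<link rel="next" href="{escape(page_url(page + 1))}">')
    return tags


if __name__ == "__main__":
    for u in (
        "https://example.com/category1",
        "https://example.com/category1?page=2",
        "https://example.com/category1?sort=price",
    ):
        print(u, head_tags(u, last_page=3))
```

Using "noindex, follow" rather than blocking the URLs outright lets crawlers keep following links on those pages while the pages themselves drop out of the index.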
-Dan
-
Hello Search Guys,
Here is some food for thought taken from: http://www.quora.com/Does-Google-limit-the-number-of-pages-it-indexes-for-a-particular-site
Summary:
"Google says they crawl the web in "roughly decreasing PageRank order" and thus, pages that have not achieved widespread link popularity, particularly on large, deep sites, may not be crawled or indexed."
"Indexation
There is no limit to the number of pages Google may index (meaning available to be served in search results) for a site. But just because your site is crawled doesn't mean it will be indexed.
Crawl
The ability, speed and depth for which Google crawls your site and retrieves pages can be dependent on a number of factors: PageRank, XML sitemaps, robots.txt, site architecture, status codes and speed."
"For a zero-backlink domain with 80.000+ pages, in conjunction with rel=canonical and an xml-sitemap (You do submit a sitemap, don't you?), after submitting the domain to Google for a crawl, a little less than 10k pages remained in index. A few crawls later this was reduced to a mere 250 (very good job on Google's side).
This leads me to believe the indexation cap for a newer site with low to zero pagerank/authority is around 10k."
Another interesting article: http://searchenginewatch.com/article/2062851/Google-Upping-101K-Page-Index-Limit
Hope this helps. The easy answer is to limit crawling to the pages you actually need, as aggressively as possible, so the unneeded URLs drop out and only the needed ones remain.
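Before deciding how aggressive to be, it may help to see which parameters account for most of the extra URLs. Below is a minimal sketch, assuming you have a list of crawled or indexed URLs exported from somewhere (a crawler, server logs, etc.); the `count_parameters` helper and the sample URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def count_parameters(urls):
    """Count how many URLs each query parameter name appears in."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # keep_blank_values=True so "?sort=" still counts as using "sort".
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        counts.update(names)
    return counts


if __name__ == "__main__":
    sample = [
        "https://example.com/category1",
        "https://example.com/category1?sort=price&page=2",
        "https://example.com/category1?sort=name",
    ]
    # Parameters that show up on the most URLs are the first candidates
    # for noindex or parameter handling.
    for name, n in count_parameters(sample).most_common():
        print(name, n)
```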