Problems with to many indexed pages
-
A client of our have not been able to rank very well the last few years. They are a big brand in our country, have more than 100+ offline stores and have plenty of inbound links.
Our main issue has been that they have to many indexed pages. Before we started we they had around 750.000 pages in the Google index. After a bit of work we got it down to 400-450.000. During our latest push we used the robots meta tag with "noindex, nofollow" on all pages we wanted to get out of the index, along with canonical to correct URL - nothing was done to robots.txt to block the crawlers from entering the pages we wanted out.
Our aim is to get it down to roughly 5000+ pages. They just passed 5000 products + 100 categories.
I added this about 10 days ago, but nothing has happened yet. Is there anything I can to do speed up the process of getting all the pages out of index?
The page is vita.no if you want to have a look!
-
Great! Please let us know how it goes so we can all learn more about it.
Thanks!
-
Thanks for that! What you are saying makes sense, so I'm going to go ahead and give it a try.
-
"Google: Do Not No Index Pages With Rel Canonical Tags"
https://www.seroundtable.com/noindex-canonical-google-18274.htmlThis is still being debated by people and I'm not saying it is "definitely" your problem. But if you're trying to figure out why those noindexed pages aren't coming out of the index this could be one thing to look into.
John Mueller (see screenshot below) is a Webmaster Trends Analyst for Google.
Good luck.
-
Isn't the whole point of using canonical to give Google a pointer of what page it is originally meant to be?
So if you have a category on shop.com/sub..
Using filter and/or pagenation you then get:
shop.com/sub?p=1
shop.com/sub?color=blue.. and so on! Both those pages then need canonical and neither do we want them index, so we by using both canonical and noindex tell Google to "don't index this page (noindex), here is the original version of it (canonical)".
Or did I misunderstand something?
-
Hello Inevo,
Most of the time when this happens it's just because Google hasn't gotten around to recrawling the pages and updating their index after seeing the new robots meta tag. It can take several months for this to happen on a large site. Submit an XML sitemap and/or create an HTML sitemap that makes it easy for them to get to these pages if you need it to go faster.
I had a look and see some conflicting instructions that Google could possibly be having a problem with.
The paginated version ( e.g. http://www.vita.no/duft?p=2 ) of the page has a rel canonical tag pointing to the first page (e.g. http://www.vita.no/duft/ ). Yet it also has a noindex tag while the canonical page has an index tag. And each page has its own unique title (Side 2 ... Side 3 | ...) . I would remove the rel canonical tag on the paginated pages since they probably don't have any pagerank worth giving to the canonical page. This way it is even more clear to Google that the canonical page is to be indexed, and the others are not to be - instead of saying they are the same page. The same is true of filter pages: http://www.vita.no/gavesett/herre/filter/price-400-/ .
I don't know if that has anything to do with your issue of index bloat, but it's worth a try. I did find some paginated pages in the index.
There also appears to be about 520 blog tag pages indexed. I typically set those to be noindex,follow.
Also remove all paginated pages and any other page that you don't want indexed from your XML sitemaps if you haven't already.
At least for the filter pages, since /filter/ is its own directory, you can use the URL removal tool in GWT. It does have a directory-level removal feature. Of course there are only 75 of these indexed at this moment.
-
My advice would be to include a fresh sitemap and upload it Google Webmaster tool. Not sure about time but I will second Donna, this will take time for the pages to get out of the Google Index.
There is one hack that I used for one page on my website but not sure if it will work for 1000+ pages.
I actually removed a page on my website using Google’s temporary removal request. It kicked the page out of the index for 90 days and in the mean time I added the link in the robots.txt file so it gone quickly and never returned back in the Google listing.
Hope this helps.
-
Hi lnevo,
I had a similar situation last year and am not aware of a faster way to get pages deindexed. You're feeding WMT an updated sitemap right?
It took 8 months for the excess pages to get dropped off my client's site. I'll be listening to hear if anyone knows a faster way.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Should I "no-index" two exact pages on Google results?
Hello everyone, I recently started a new wordpress website and created a static homepage. I noticed that on Google search results, there are two different URLs landing on same content page. I've attached an image to explain what I saw. Should I "no-index" the page url? Google url.JPG In this picture, the first result is the homepage and I try to rank for that page. The last result is landing on same content with different URL. So, should I no-index last result as shown in image?
Technical SEO | | amanda59640 -
Keywords are indexed on the home page
Hello everyone, For one of our websites, we have optimized for many keywords. However, it seems that every keyword is indexed on the home page, and thus not ranked properly. This occurs only on one of our many websites. I am wondering if anyone knows the cause of this issue, and how to solve it. Thank you.
Technical SEO | | Ginovdw1 -
Google Webmaster tools Sitemap submitted vs indexed vs Index Status
I'm having an odd error I'm trying to diagnose. Our Index Status is growing and is now up to 1,115. However when I look at Sitemaps we have 763 submitted but only 134 indexed. The submitted and indexed were virtually the same around 750 until 15 days ago when the indexed dipped dramatically. Additionally when I look under HTML improvements I only find 3 duplicate pages, and I ran screaming frog on the site and got similar results, low duplicates. Our actual content should be around 950 pages counting all the category pages. What's going on here?
Technical SEO | | K-WINTER0 -
Rel canonical for partner sites - product pages only or also homepage and other key pages?
Hello there Our main site is www.arenaflowers.com. We also run a number of partner sites (eg: http://flowershop.cancerresearchuk.org/). We've relcanonical'd the products on the partner site back to the main (arenaflowers.com) site. eg: http://flowershop.cancerresearchuk.org/flowers/tutti_frutti_es_2013 rel canonicals back to: http://www.arenaflowers.com/flowers/tutti_frutti_es_2013). My question: Should we also relcanonical the homepage and other key pages on partner sites back to the main arenaflowers website too? The content is similar but not identical. We don't want our partner sites to be outranking the original (as is the case on kw flower delivery for example). (NB this situation may be complicated by the fact we appear to have an unnatural link penalty on af.com (and when we did an upgrade a while back, the af.com site fell out of the index altogether due to some issues with our move to AWS.) We're getting professional SEO advice on this but wondered what the Moz community's thoughts were.. Cheers, Will
Technical SEO | | ArenaFlowers.com0 -
SEOMOZ and Webmaster Tools showing Different Page Index Results
I am promoting a jewelry e-commerce website. The website has about 600 pages and the SEOMOZ page index report shows this number. However, webmaster tools shows about 100,000 indexed pages. I have no idea why this is happening and I am sure this is hurting the page rankings in Google. Any ideas? Thanks, Guy
Technical SEO | | ciznerguy1 -
Pageing page and seo meta tag questions
Hi if i am using paging in my website there is lots of product in my website now in paging total paging is 1000 pages now what title tag i need to add for every paging page or is there any good way we can tell search engine all page or same ?
Technical SEO | | constructionhelpline0 -
Home page indexed but not ranking...interior pages with thin content outrank home page??
I have a Joomla site with a home page that I can't get to rank for anything beyond the company name @ Google - the site works fine @ Bing and Yahoo. The interior pages will rank all day long but the home page never shows up in the results. I have checked the page code out in every tool that I know about and have had no luck....by all account it should be good to go...any thoughts/comments/help would be greatly appreciated. The site is http://www.selectivedesigns.com Thanks! Greg
Technical SEO | | DougHosmer0 -
Site Navigation leads to "Too Many On-Page Links" warning
I run an ecommerce site with close to 2000 products. Nearly every page in the catalog has a too many on-page links error because of the navigation sidebar, which has several flyout layers of nested links. What can/should I do about this? Will it affect my rankings at all? Thanks
Technical SEO | | AmericanOutlets0