Old pages still in index
-
Hi Guys,
I've been working on a E-commerce site for a while now. Let me sum it up :
- February new site is launched
- Due to lack of resources we started 301's of old url's in March
- Added rel=canonical end of May because of huge index numbers (developers forgot!!)
- Added noindex and robots.txt on at least 1000 urls.
- Index numbers went down from 105.000 tot 55.000 for now, see screenshot (actual number in sitemap is 13.000)
Now when i do site:domain.com there are still old url's in the index while there is a 301 on the url since March!
I know this can take a while but I wonder how I can speed this up or am doing something wrong. Hope anyone can help because I simply don't know how the old url's can still be in the index.
-
Hi Dan,
Thanks for the answer!
Indexation is already back to 42.000 so slowly going back to normal
And thanks for the last tip, that's totally right. I just discovered that several pages had duplicate url's generated so by continually monitoring we'll fix it !
-
Hi There
To noindex pages there are a few methods;
-
use a meta noindex without robots.txt - I think that is why some may not be removed. The robots.txt block crawling so they can not see the noindex.
-
use a 301 redirect - this will eventually kill off the old pages, but it can definitely take a while.
-
canonical it to another page. and as Chris says, don't block the page or add extra directives. If you canonical the page (correctly), I find it usually drops out of the index fairly quickly after being crawled.
-
use the URL removal tool in webmaster tools + robots.txt or 404. So if you 404 a page or block it with robots.txt you can then go into webmaster tools and do a URL removal. This is NOT recommended though in most normal cases, as Google prefers this be for "emergencies".
The only method that removes pages within a day or two guaranteed is the URL removal tool.
I would also examine your site since it is new, for something that is causing additional pages to be generated and indexed. I see this a lot with ecommerce sites where they have lots of pagination, facets, sorting, etc and those can generate lots of other pages which get indexed.
Again, as Chris says, you want to be careful to not mix signals. Hope this all helps!
-Dan
-
-
Hi Chris,
Thanks for your answer.
I'm either using a 301 or noindex, not both of course.
Still have to check the server logs, thanks for that!
Another weird thing. While the old url is still in the index, when i check the cache date it's a week old. That's what i don't get. Cache date is a week old but Google still has the old url in the index.
-
It can take months for pages to fall out of Google's index have you looked at your log files to verify that googlebot is crawling those pages?. Things to keep in mind:
- If you 301 a page, the rel=canonical on that page will not be seen by the bot (no biggie in your case)
- If you 301 a page, a meta noindex will not be seen by the bot
- It is suggested not to use the robots.txt to no index a page that is being 301 redirected--as the redirect may not be seen by Google.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Pillar pages and blog pages
Hello, I was watching this video about pillar pages https://www.youtube.com/watch?v=Db3TpDZf_to and tried to apply it to my self but find it impossible to do (but maybe I am looking at it the wrong way). Let's say I want to rank on "Normandy bike tou"r. I created a pillar page about "Normandy bike tour" what would be the topics of the subpages boosting that pillar page. I know that it should be questions people have but in the tourism industry they don't have any, they just want us to make them dream !! I though about doing more general blog pages about things such as : Places to rent a bike in Normandy or in XYZ city ? ( related to biking) Or the landing sites in Normandy ? (not related to biking) Is it the way to do it, what do you recommend ? Thank you,
Intermediate & Advanced SEO | | seoanalytics0 -
Do uncrawled but indexed pages affect seo?
It's a well known fact that too much thin content can hurt your SEO, but what about when you disallow google to crawl some places and it indexes some of them anyways (No title, no description, just the link) I am building a shopify store and it's imposible to change the robots.txt using shopify, and they disallow for example, the cart. Disallow: /cart But all my pages are linking there, so google has the uncrawled cart in it's index, along with many other uncrawled urls, can this hurt my SEO or trying to remove that from their index is just a waste of time? -I can't change anything from the robots.txt -I could try to nofollow those internal links What do you think?
Intermediate & Advanced SEO | | cuarto7150 -
Many pages small unique content vs 1 page with big content
Dear all, I am redesigning some areas of our website, eurasmus.com and we do not have clear what is the best
Intermediate & Advanced SEO | | Eurasmus.com
option to follow. In our site, we have a city area i.e: www.eurasmus.com/en/erasmus-sevilla which we are going
to redesign and a guide area where we explain about the city, etc...http://eurasmus.com/en/erasmus-sevilla/guide/
all with unique content. The thing is that at this point due to lack of resources, our guide is not really deep and we believe like this it does not
add extra value for users creating a page with 500 characters text for every area (transport...). It is not also really user friendly.
On the other hand, this pages, in long tail are getting some results though is not our keyword target (i.e. transport in sevilla)
our keyword target would be (erasmus sevilla). When redesigning the city, we have to choose between:
a)www.eurasmus.com/en/erasmus-sevilla -> with all the content one one page about 2500 characters unique.
b)www.eurasmus.com/en/erasmus-sevilla -> With better amount of content and a nice redesign but keeping
the guide pages. What would you choose? Let me know what you think. Thanks!0 -
Product with two common names: A separate page for each name, or both on one page?
This is a real-life problem on my ecommerce store for the drying rack we manufacture: Some people call it a Clothes Drying Rack, while others call it a Laundry Drying Rack, but it's really the same thing. Search volume is higher for the clothes version, so give it the most attention. I currently have 2 separate pages with the On-Page optimization focused on each name (URL, Title, h1, img alts, etc) Here the two drying rack pages: clothes focused page and laundry focused page But the ranking of both pages is terrible. The fairly generic homepage shows up instead of the individual pages in Google searches for the clothes drying rack and for laundry drying rack. But I can get the individual page to appear in a long-tail search like this: round wooden clothes drying rack So my thought is maybe I should just combine both of these pages into one page that will hopefully be more powerful. We would have to set up the On-Page optimization to cover both "clothes & laundry drying rack" but that seems possible. Please share your thoughts. Is this a good idea or a bad idea? Is there another solution? Thanks for your help! Greg
Intermediate & Advanced SEO | | GregB1230 -
Index or not index Categories
We are using Yoast Seo plugin. On the main menu we have only categories which has consist of posts and one page. We have category with villas, category with villa hotels etc. Initially we set to index and include in the sitemap posts and excluded categories, but I guess it was not correct. Would be a better way to index and include categories in the sitemap and exclude the posts in order to avoid the duplicate? It somehow does not make sense for me, If the posts are excluded and the categories included, will not then be the categories empty for google? I guess I will get crazy of this. Somebody has perhaps more experiences with this?
Intermediate & Advanced SEO | | Rebeca10 -
Huge google index with un-relevant pages
Hi, i run a site about sport matches, every match has a page and the pages are generated automatically from the DB. pages are not duplicated, but over time some look a little bit similar. after a match finishes it has no internal links or sitemap entry, but it's reachable by direct URL and continues to be on google index. so over time we have more than 100,000 indexed pages. since past matches have no significance and they're not linked and a match can repeat and it may look like duplicate content....what you suggest us to do: when a match is finished - not linked, but appears on the index and SERP 301 redirect the match Page to the match Category which is a higher hierarchy and is always relevant? use rel=canonical to the match Category do nothing.... *301 redirect will shrink my index status, some say a high index status is good... *is it safe to 301 redirect 100,000 pages at once - wouldn't it look strange to google? *would canonical remove the past matches pages from the index? what do you think? Thanks, Assaf.
Intermediate & Advanced SEO | | stassaf0 -
Webmaster Index Page significant drop
Has anyone noticed a significant drop in indexed pages within their Google Webmaster Tools sitemap area? We went from 1300 to 83 from Friday June 23 to today June 25, 2012 and no errors are showing or warnings. Please let me know if anyone else is experiencing this and suggestions to fix this?
Intermediate & Advanced SEO | | datadirect0 -
Category Pages up - Product Pages down... what would help?
Hi I mentioned yesterday how one of our sites was losing rank on product pages. What steps do you take to improve the SERPS of product pages, in this case home/category/product is the tree. There isn't really any internal linking, except one link from the category page to each product, would setting up a host of internal links perhaps "similar products" linking them together be a place to start? How can I improve my ranking of these more deeply internal pages? Not just internal links?
Intermediate & Advanced SEO | | xoffie0