Recovering from index problem
-
Hi all. For a while, we've been working on http://thewilddeckcompany.co.uk/. Everything was going swimmingly, and we had a top 5 ranking for the term 'bird hides' for this page - http://thewilddeckcompany.co.uk/products/bird-hides.
Then disaster struck! The client added a link with a faulty parameter in the Joomla back end that caused a bunch of duplicate content issues. Before this happened, all the site's 19 pages were indexed. Now it's just a handful, including the faulty URL (thewilddeckcompany.co.uk/index.php?id=13).
This shows the issue pretty clearly.
I've removed the link, redirected the bad URL, updated the sitemap and got some new links pointing at the site to resolve the problem. Yet almost two months later, the bad URL is still showing in the SERPs and the indexing problem is still there.
Any ideas? I'm stumped!
-
Hi
You can use the URL removal tool, which should work fine, but I think the issue is that you were blocking crawling of /?id=13 in robots.txt, both at http://thewilddeckcompany.co.uk/robots.txt and on the www version, http://www.thewilddeckcompany.co.uk/robots.txt
If the crawler is blocked, it cannot see the redirect, so you'll want to remove /?id=13 from robots.txt and let Google crawl that URL and see the redirect. The URL will then drop out of the index over time.
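A quick way to confirm whether a rule really blocks the redirected URL is to test it locally. This is only a rough sketch using Python's standard-library robots.txt parser: the robots.txt content below is an assumed example (the real file isn't quoted in this thread), and `is_blocked` is a made-up helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content for illustration; not copied from the real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /?id=13
"""


def is_blocked(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text forbids `agent` from crawling `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://thewilddeckcompany.co.uk/?id=13",
        "http://thewilddeckcompany.co.uk/products/bird-hides",
    ):
        state = "blocked" if is_blocked(ROBOTS_TXT, url) else "crawlable"
        print(f"{url}: {state}")
```

If the redirected URL comes back as blocked, the crawler never gets to request it and so never sees the redirect. Bear in mind that this parser only does simple prefix matching, so it may not agree with Google on rules that use wildcards.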
Something unrelated I just noticed - you should have a better 404 template http://thewilddeckcompany.co.uk/kjhf - one that uses your site's template/design and provides something a little more useful.
-Dan
-
I had no idea you could request a removal like that. Thank you!
-
Blink,
It sounds like you've done all the right things, but if you want to, you can request removal of a cached page via Google Webmaster Tools; it might help. Otherwise, just keep waiting and it'll go away. Have you looked at your server logs to see whether Googlebot has crawled those pages recently?
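For the server log check, something along these lines could do the job. It's a sketch that assumes the access log is in the common "combined" format used by Apache and nginx; the sample lines, IP addresses and the `googlebot_hits` helper are invented for illustration.

```python
import re

# Assumes the "combined" access-log format; adjust the pattern if your server logs differently.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Invented sample lines, not taken from any real log.
SAMPLE = [
    '192.0.2.10 - - [10/Oct/2014:13:55:36 +0000] "GET /index.php?id=13 HTTP/1.1" 301 0 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.20 - - [10/Oct/2014:14:01:02 +0000] "GET /index.php?id=13 HTTP/1.1" 301 0 "-" '
    '"Mozilla/5.0 (Windows NT 6.1)"',
]


def googlebot_hits(lines, path_fragment):
    """Return (time, path, status) for requests whose user agent mentions Googlebot."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "Googlebot" in match["agent"] and path_fragment in match["path"]:
            hits.append((match["time"], match["path"], int(match["status"])))
    return hits


if __name__ == "__main__":
    for hit in googlebot_hits(SAMPLE, "id=13"):
        print(hit)
```

If no hits turn up for the bad URL since the redirect went in, Google probably hasn't recrawled it yet. The user agent string can be spoofed, so treat this as a rough indication only.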
Related Questions
-
Big problems with site traffic
Hello! I have big problems with website promotion. It's been 7 months and the site still gets only 1-5 visitors a day. I do not understand the reason. Can you tell me what I'm doing wrong? Site: www.azartlist.com Many thanks.
Intermediate & Advanced SEO | | Bobic1 -
Only 4 of my pages have been indexed out of 64 in total
Hi there, I submitted a sitemap for a new 64-page website 6 weeks ago and only a few pages have been indexed. The website shows in Google search, but with the large amount of information on the website it should show higher. I fetched and rendered 30-plus pages on the 9th September and others on the 16th September. Today is the 5th October, but in Webmaster Tools Google only acknowledges 1 page as indexed. I have checked the robots.txt file, which shows crawling is allowed. There are no messages about crawl problems and no errors showing. The domain is www.urbaneforms.com. Can you offer a suggestion as to why we are not being indexed?
Intermediate & Advanced SEO | | simplyworld0 -
Recovering from index problem (Take two)
Hi all. This is my second pass at the problem. Thank you for your responses before, I think I'm narrowing it down! Below is my original message. Afterwards, I've added some update info. For a while, we've been working on http://thewilddeckcompany.co.uk/. Everything was going swimmingly, and we had a top 5 ranking for the term 'bird hides' for this page - http://thewilddeckcompany.co.uk/products/bird-hides. Then disaster struck! The client added a link with a faulty parameter in the Joomla back end that caused a bunch of duplicate content issues. Before this happened, all the site's 19 pages were indexed. Now it's just a handful, including the faulty URL (thewilddeckcompany.co.uk/index.php?id=13). This shows the issue pretty clearly. https://www.google.co.uk/search?q=site%3Athewilddeckcompany.co.uk&oq=site%3Athewilddeckcompany.co.uk&aqs=chrome..69i57j69i58.2178j0&sourceid=chrome&ie=UTF-8 I've removed the link, redirected the bad URL, updated the sitemap and got some new links pointing at the site to resolve the problem. Yet almost two months later, the bad URL is still showing in the SERPs and the indexing problem is still there. UPDATE: OK, since then I've blocked the faulty parameter in the robots.txt file. Now that page has disappeared, but the right one - http://thewilddeckcompany.co.uk/products/bird-hides - has not been indexed. It's been like this for several weeks. Any ideas would be much appreciated!
Intermediate & Advanced SEO | | Blink-SEO0 -
How is Google crawling and indexing this directory listing?
We have three Directory Listing pages that are being indexed by Google: http://www.ccisolutions.com/StoreFront/jsp/ http://www.ccisolutions.com/StoreFront/jsp/html/ http://www.ccisolutions.com/StoreFront/jsp/pdf/ How and why is Googlebot crawling and indexing these pages? Nothing else links to them (although /jsp/html/ and /jsp/pdf/ both link back to /jsp/). They aren't disallowed in our robots.txt file and I understand that this could be why. If we add them to our robots.txt file and disallow them, will this prevent Googlebot from crawling and indexing those Directory Listing pages without stopping it from crawling and indexing the content that resides there, which is used to populate pages on our site? Having these pages indexed in Google is causing a myriad of issues, not the least of which is duplicate content. For example, this file, CCI-SALES-STAFF.HTML (which appears on the Directory Listing referenced above - http://www.ccisolutions.com/StoreFront/jsp/html/), clicks through to this Web page: http://www.ccisolutions.com/StoreFront/jsp/html/CCI-SALES-STAFF.HTML This page is indexed in Google and we don't want it to be. But so is the actual page where we intended the content contained in that file to display: http://www.ccisolutions.com/StoreFront/category/meet-our-sales-staff As you can see, this results in duplicate content problems. Is there a way to disallow Googlebot from crawling that Directory Listing page, and, provided that we have this URL in our sitemap: http://www.ccisolutions.com/StoreFront/category/meet-our-sales-staff, solve the duplicate content issue as a result? For example: Disallow: /StoreFront/jsp/ Disallow: /StoreFront/jsp/html/ Disallow: /StoreFront/jsp/pdf/ Can we do this without risking blocking Googlebot from content we do want crawled and indexed? Many thanks in advance for any and all help on this one!
Intermediate & Advanced SEO | | danatanseo0 -
How to find all indexed pages in Google?
Hi, We have an ecommerce site with around 4000 real pages. But our index count is at 47,000 pages in Google Webmaster Tools. How can I get a list of all indexed pages on our domain? I'm trying to locate the duplicate content. Doing a "site:www.mydomain.com" only returns up to 676 results... Any ideas? Thanks, Ben
Intermediate & Advanced SEO | | bjs20100 -
Page Indexed but not Cached
A section of pages on my site is indexed (I know because they appear in SERPs if I copy and paste a sentence from the content). However, according to the text-only cached version of the page, they are not being read by Google. Why are they indexed even though it seems like Google is not reading them... or is Google in fact reading this text even though it seems like it should not be? Thanks for your assistance.
Intermediate & Advanced SEO | | theLotter0 -
Why are new pages not being indexed, and old pages (now in robots.txt) remain in the index?
I currently have a site that was recently restructured, causing much of its content to be reposted, creating new URLs for each page. To avoid duplicates, all of the existing pages were added to the robots file. That said, it has now been over a week - I know Google has recrawled the site - and when I search for term X, it is still the old page that is ranking, with the new one nowhere to be seen. I'm assuming it's a cached version, but why are so many of the old pages still appearing in the index? Furthermore, all "tags" pages (it's a Q&A site, like this one) were also added to the robots file a few months ago, yet I think they are all still appearing in the index. Anyone got any ideas about why this is happening, and how I can get my new pages indexed?
Intermediate & Advanced SEO | | corp08030 -
Should I index tag pages?
Should I exclude the tag pages? Or should I go ahead and keep them indexed? Is there a general opinion on this topic?
Intermediate & Advanced SEO | | NikkiGaul0