Recovering from index problem (Take two)
-
Hi all. This is my second pass at the problem. Thank you for your responses before, I think I'm narrowing it down!
Below is my original message. Afterwards, I've added some update info.
For a while, we've been working on http://thewilddeckcompany.co.uk/. Everything was going swimmingly, and we had a top 5 ranking for the term 'bird hides' for this page - http://thewilddeckcompany.co.uk/products/bird-hides.
Then disaster struck! The client added a link with a faulty parameter in the Joomla back end that caused a bunch of duplicate content issues. Before this happened, all the site's 19 pages were indexed. Now it's just a handful, including the faulty URL (thewilddeckcompany.co.uk/index.php?id=13)
This shows the issue pretty clearly.
I've removed the link, redirected the bad URL, updated the site map and got some new links pointing at the site to resolve the problem. Yet almost two month later, the bad URL is still showing in the SERPs and the indexing problem is still there.
UPDATE
OK, since then I've blocked the faulty parameter in the robots.txt file. Now that page has disappeared, but the right one - http://thewilddeckcompany.co.uk/products/bird-hides - has not been indexed. It's been like this for several week.
Any ideas would be much appreciated!
-
Thank you all, this is brilliant.
-
Your problem is with the robots.txt file. You are blocking the URL
thewilddeckcompany.co.uk/index.php?id=13
That URL 301 redirects to the correct URL of
http://thewilddeckcompany.co.uk/products/bird-hides
Google cannot "see" the 301 redirect from the old "bad" URLs to the new "good" URL.
You have to let Google crawl the old URLs and see the 301 redirects so that it knows how things need to forward.
I would do this for all the duplicate pages, make sure they 301 to the correct pages and do not put the "bad" pages in robots.txt - otherwise the indexing will not be updated.
Something separate to check. We have seen Google taking a while to acknowledge some of our 301s. Go into your GWT and look at your duplicate title reports. You may see the old and new URLs showing as duplicates, even with the 301s in place. We had to setup a self canonicalizing link on the "good" pages to help get that cleaned up.
-
Blink-SEO
Jonathan is correct to try a Fetch as Google in WMT for the urls you need re indexed. (Note, that is not really the purpose of a Fetch as Google, but sometimes it works.)
I would also resubmit the sitemap now that you have blocked the offending url with robots.txt. It is likely the resubmission will help you the quickest IMO.Best,
Robert
-
It sounds like you just need to wait for Google to recrawl your robots.txt file. I saw this error in the serps:
www.thewilddeckcompany.co.uk/products/timber-water...
A description for this result is not available because of this site's robots.txt – learn more.So it is clear that the robots.txt file has not updated with the changes, after the mistake was made. Try fetching as Googlebot within webmaster tools, but it may take a little time to update. But at least it would seem that the robots.txt error is still a cause of the problem, just need to wait a little longer.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How long should it take for indexed pages to update
Google has crawled and indexed my new site, but my old URLS appear in the search results. Is there a typical amount of time that it takes for Google to update the URL's displayed in search results?
Intermediate & Advanced SEO | | brianvest0 -
Of the two examples of markup (microdata, schema) code below, which of the two is better designed for its purpose of Q&A, and what might be suggested to improve upon these lines of code (context: questions and answers within article content.
ANSWER SEEN 'WITHIN THE QUESTION' BRACKET So you ask, why is the sky blue?
Intermediate & Advanced SEO | | RedFrog
Well, the answer is not so simple; In the day-time, when it's clear and cloudless,
the sky is blue because molecules in the air scatter blue light from the sun more than they scatter red light.
When we look towards the sun at sunset, we see red and orange colours because the blue light has been scattered out and away from the line of sight. See Structured Data Testing Results 'QUESTION' AND 'ANSWER' IN 2 SEPARATE BRACKETS Why Is The Sky Blue? Well, the answer is not so simple; In the day-time, when it's clear and cloudless,
the sky is blue because molecules in the air scatter blue light from the sun more than they scatter red light.
When we look towards the sun at sunset, we see red and orange colours because the blue light has been scattered out and away from the line of sight. See Structured Data Testing Results Thanks, Mark0 -
Index Problem
Hi guys I have a critical problem with google crawler. Its my website : https://1stquest.com I can't create sitemap with online site map creator tools such as XML-simemap.org Fetch as google tools usually mark as partial MOZ crawler test found both HTTP and HTTPS version on site! and google cant index several pages on site. Is problem regards to "unsafe URL"? or something else?
Intermediate & Advanced SEO | | Okesta0 -
What are Soft 404's and are they a problem
Hi, I have some old pages that were coming up in google WMT as a 404. These had links into them so i thought i'd do a 301 back to either the home page or to a relevant category or page. However these are now listed in WMT as soft 404's. I'm not sure what this means and whether google is saying it doesn't like this? Any advice welcomed.
Intermediate & Advanced SEO | | Aikijeff0 -
To remove from index or not and stop words
Each product is an item of jewellery based on a letter of the alphabet. At present all 26 are indexed but as you guess they all share the same description, title and URL (apart from change in letter). What I was going to do was set all but one to no-index, recreate new descriptions and revert back to index. But then that got me thinking - through stop words will the titles be seen as duplicates: letter bracelet letter a bracelet letter i bracelet
Intermediate & Advanced SEO | | MickEdwards0 -
Robots.txt 404 problem
I've just set up a wordpress site with a hosting company who only allow you to install your wordpress site in http://www.myurl.com/folder as opposed to the root folder. I now have the problem that the robots.txt file only works in http://www.myurl./com/folder/robots.txt Of course google is looking for it at http://www.myurl.com/robots.txt and returning a 404 error. How can I get around this? Is there a way to tell google in webmaster tools to use a different path to locate it? I'm stumped?
Intermediate & Advanced SEO | | SamCUK0 -
How does Google see an article in two languages?
Hi, We are translating our articles into French (they are already in English) and are considering Cantonese & Mandarin. How does Google see this? Say I post an article on Diabetes Symptoms in English, Cantonese and French. Same article, different languages. Does Google look at this as three separate articles, ranking you uniquely, or does it count as one article? Thanks, Erin
Intermediate & Advanced SEO | | erinhealthchoices0