Page loads fine for users but returns a 404 for Google & Moz
-
I have an e-commerce website that is built using Wordpress and the WP E-commerce plug-in, the products have always worked fine and the pages when you view them in a browser work fine and people can purchase the products with no problems.
However in the Google merchant feed and in the Moz crawl diagnostics certain product pages are returning a 404 error message and I can't work out why, especially as the pages load fine in the browser.
I had a look at the page headers and can see when the page does load the initial request does return a 404 error message, then every other request goes through and loads fine. Can anyone help me as to why this is happening?
A link to the product I have been using to test is: http://earthkindoriginals.co.uk/organic-clothing/lounge-wear/organic-tunic-top/
Here is a part of the header dump that I did:
http://earthkindoriginals.co.uk/organic-clothing/lounge-wear/organic-tunic-top/
GET /organic-clothing/lounge-wear/organic-tunic-top/ HTTP/1.1
Host: earthkindoriginals.co.uk
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:21.0) Gecko/20100101 Firefox/21.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip, deflate
Cookie: __utma=159840937.1804930013.1369831087.1373619597.1373622660.4; __utmz=159840937.1369831087.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); wp-settings-1=imgsize%3Dmedium%26hidetb%3D1%26editor%3Dhtml%26urlbutton%3Dnone%26mfold%3Do%26align%3Dcenter%26ed_size%3D160%26libraryContent%3Dbrowse; wp-settings-time-1=1370438004; __utmb=159840937.3.10.1373622660; PHPSESSID=e6f3b379d54c1471a8c662bf52c24543; __utmc=159840937
Connection: keep-alive
HTTP/1.1 404 Not Found
Date: Fri, 12 Jul 2013 09:58:33 GMT
Server: Apache
X-Powered-By: PHP/5.2.17
X-Pingback: http://earthkindoriginals.co.uk/xmlrpc.php
Expires: Wed, 11 Jan 1984 05:00:00 GMT
Cache-Control: no-cache, must-revalidate, max-age=0
Pragma: no-cache
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 6653
Connection: close
Content-Type: text/html; charset=UTF-8 -
Thanks for the help guys, it is good to actually have a direction to look in now, I was just completely stuck before. I will post any updates I have.
-
Hello,
The status returned is 404 not found, this is independent of whether the page is loaded or not.
There is something that is generating that code either htaccess, some php code, maybe some redirection, a misconfigured rewrite, look for what can be, with that code, pages are not indexed.
Sorry for my english.
Best regards,
Carlos
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Product Schema & Google Guidelines
Hi We have product mark up on our site, data-vocab rather than schema. I can't see it showing in Google SERPs, but when testing it appears to be correct. Are Google still selective with what schema they show for a site? Thanks
Intermediate & Advanced SEO | | BeckyKey0 -
Why Google is not showing right title tags of my website inner pages?
Hello Everyone, I have a same problem with my 3 websites that Google is not showing right title tags of inner pages of my websites goldcoast-plumbers.com: http://screencast.com/t/2AEzDcoTkWF accountants-goldcoast.com.au: metalrecyclers-brisbane.com.au One common thing is all these websites is All in one SEO Pack Plugin for SEO Is it a problem? Thanks in advance for your help! Regards
Intermediate & Advanced SEO | | Asjad0 -
What's the best way to redirect categories & paginated pages on a blog?
I'm currently re-doing my blog and have a few categories that I'm getting rid of for housecleaning purposes and crawl efficiency. Each of these categories has many pages (some have hundreds). The new blog will also not have new relevant categories to redirect them to (1 or 2 may work). So what is the best place to properly redirect these pages to? And how do I handle the paginated URLs? The only logical place I can think of would be to redirect them to the homepage of the blog, but since there are so many pages, I don't know if that's the best idea. Does anybody have any thoughts?
Intermediate & Advanced SEO | | kking41200 -
Add noindex,nofollow prior to removing pages resulting in 404's
We're working with another site that unfortunately due to how their website has been programmed creates a bit of a mess. Whenever an employee removes a page from their site through their homegrown 'content management system', rather than 301'ing to another location on their site, the page is deleted and results in a 404. The interim question until they implement a better solution in managing their website is: Should they first add noindex,nofollow to the pages that are scheduled to be removed. Then once they are removed, they become 404's? Of note, it is possible that some of these pages will be used again in the future, and I would imagine they could submit them to Google through Webmaster Tools and adding the pages to their sitemap.
Intermediate & Advanced SEO | | Prospector-Plastics0 -
Howcome Google is indexing one day 2500 pages and the other day only 150 then 2000 again ect?
This is about an big affiliate website of an customer of us, running with datafeeds... Bad things about datafeeds: Duplicate Content (product descriptions) Verrryyyy Much (thin) product pages (sometimes better to noindex, i know, but this customer doesn't want to do that)
Intermediate & Advanced SEO | | Zanox0 -
Huge google index with un-relevant pages
Hi, i run a site about sport matches, every match has a page and the pages are generated automatically from the DB. pages are not duplicated, but over time some look a little bit similar. after a match finishes it has no internal links or sitemap entry, but it's reachable by direct URL and continues to be on google index. so over time we have more than 100,000 indexed pages. since past matches have no significance and they're not linked and a match can repeat and it may look like duplicate content....what you suggest us to do: when a match is finished - not linked, but appears on the index and SERP 301 redirect the match Page to the match Category which is a higher hierarchy and is always relevant? use rel=canonical to the match Category do nothing.... *301 redirect will shrink my index status, some say a high index status is good... *is it safe to 301 redirect 100,000 pages at once - wouldn't it look strange to google? *would canonical remove the past matches pages from the index? what do you think? Thanks, Assaf.
Intermediate & Advanced SEO | | stassaf0 -
Killing 404 errors on our site in Google's index
Having moved a site across to Magento, obviously re-directs were a large part of that, ensuring all the old products and categories linked up correctly with the new site structure. However, we came up against an issue where we needed to add, delete, then re-add products. This, coupled with a misunderstanding of the csv upload processing, meant that although the old urls redirected, some of the new Magento urls changed and then didn't redirect: For Example: mysite/product would get deleted re-added and become: mysite/product-1324 We now know what we did wrong to ensure it doesn't continue to happen if we weret o delete and re-add a product, but Google contains all these old URLs in its index which has caused people to search for products on Google, click through, then land on the 404 page - far from ideal. We kind of assumed, with continual updating of sitemaps and time, that Google would realise and update the URL accordingly. But this hasn't happened - we are still getting plenty of 404 errors on certain product searches (These aren't appearing in SEOmoz, there are no links to the old URL on the site, only Google, as the index contains the old URL). Aside from going through and finding the products affected (no easy task), and setting up redirects for each one, is there any way we can tell Google 'These URLs are no longer a thing, forget them and move on, let's make a fresh start and Happy New Year'?
Intermediate & Advanced SEO | | seanmccauley0 -
Use of rel=canonical to view all page & No follow links
Hey, I have a couple of questions regarding e-commerce category pages and filtering options: I would like to implement the rel=canonical to the view all page as suggested on this article on googlewebmastercentral. If you go on one of my category pages you will see that both the "next page link" and the "view all" links are nofollowed. Is that a mistake? How does nofoolow combines with canonical view all? Is it a good thing to nofollow the "sorty by" pages or should I also use Noindex for them?
Intermediate & Advanced SEO | | Ypsilon0