Page loads fine for users but returns a 404 for Google & Moz
-
I have an e-commerce website built on WordPress with the WP E-commerce plug-in. The products have always worked fine: the pages load correctly in a browser and people can purchase the products with no problems.
However, in the Google merchant feed and in the Moz crawl diagnostics, certain product pages are returning a 404 error, and I can't work out why, especially as the pages load fine in the browser.
I had a look at the page headers and can see that when the page loads, the initial request returns a 404 status, while every subsequent request goes through and loads fine. Can anyone help me work out why this is happening?
A link to the product I have been using to test is: http://earthkindoriginals.co.uk/organic-clothing/lounge-wear/organic-tunic-top/
Here is a part of the header dump that I did:
http://earthkindoriginals.co.uk/organic-clothing/lounge-wear/organic-tunic-top/
GET /organic-clothing/lounge-wear/organic-tunic-top/ HTTP/1.1
Host: earthkindoriginals.co.uk
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:21.0) Gecko/20100101 Firefox/21.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip, deflate
Cookie: __utma=159840937.1804930013.1369831087.1373619597.1373622660.4; __utmz=159840937.1369831087.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); wp-settings-1=imgsize%3Dmedium%26hidetb%3D1%26editor%3Dhtml%26urlbutton%3Dnone%26mfold%3Do%26align%3Dcenter%26ed_size%3D160%26libraryContent%3Dbrowse; wp-settings-time-1=1370438004; __utmb=159840937.3.10.1373622660; PHPSESSID=e6f3b379d54c1471a8c662bf52c24543; __utmc=159840937
Connection: keep-alive
HTTP/1.1 404 Not Found
Date: Fri, 12 Jul 2013 09:58:33 GMT
Server: Apache
X-Powered-By: PHP/5.2.17
X-Pingback: http://earthkindoriginals.co.uk/xmlrpc.php
Expires: Wed, 11 Jan 1984 05:00:00 GMT
Cache-Control: no-cache, must-revalidate, max-age=0
Pragma: no-cache
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 6653
Connection: close
Content-Type: text/html; charset=UTF-8
-
Thanks for the help, guys. It is good to actually have a direction to look in now; I was completely stuck before. I will post any updates I have.
-
Hello,
The status returned is 404 Not Found, and that is independent of whether the page content loads. A browser renders whatever body comes back with the response, so the page looks fine to visitors, but crawlers go by the status code.
Something is generating that code: .htaccess, some PHP code, a redirection, or a misconfigured rewrite rule. Look for what it could be, because pages that return a 404 are not indexed.
Sorry for my English.
Best regards,
Carlos
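To see what a crawler sees rather than what the browser renders, it can help to request the page from a script and look only at the status code. The following is a minimal sketch using Python's standard library, not a definitive diagnostic. The function names `parse_status_line` and `fetch_status` and the `USER_AGENTS` table are made up for illustration, and whether this server responds differently to different user agents is an assumption to test, not something the header dump confirms.

```python
import urllib.error
import urllib.request

# Illustrative user-agent strings (an assumption: the server may or may not
# vary its response by user agent; this only makes the comparison possible).
USER_AGENTS = {
    "browser": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:21.0) Gecko/20100101 Firefox/21.0",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}


def parse_status_line(line):
    """Split a status line such as 'HTTP/1.1 404 Not Found' into (code, reason)."""
    parts = line.strip().split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"not an HTTP status line: {line!r}")
    reason = parts[2] if len(parts) == 3 else ""
    return int(parts[1]), reason


def fetch_status(url, user_agent, timeout=10):
    """Return the status code of the final response for url.

    urlopen raises HTTPError for 4xx/5xx responses, so the code is read
    from the exception instead of letting it propagate. Redirects are
    followed, so this is the status of the page the request ends up on.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
```

Calling `fetch_status(url, USER_AGENTS["googlebot"])` and `fetch_status(url, USER_AGENTS["browser"])` on the product URL gives the two codes to compare, and `parse_status_line` can be applied to the status line of a saved header dump like the one above. If both calls return 404 while the page still displays, the problem is in how the site sets the status, not in the crawler.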