Sitemap issue? 404s & 503s are regenerating?
-
I am using the WordPress SEO plugin by Yoast to generate a sitemap on http://www.atozqualityfencing.com. Last month, I had an associate create redirects for over 200 404 errors; she did this via the .htaccess file. Today, the same number of 404s are back, along with a number of 503 errors. This new WordPress website was built in a subdirectory and made live by adding some code to the .htaccess file to point browsers at the content we wanted live. In other words, the content actually resides in a subdirectory named "newsite" but is served on the main URL.
Can you tell me why we are having these 404 & 503 errors? I have no idea where to begin looking.
-
You likely have a .htaccess issue causing a rewrite error. You may want to examine your .htaccess or replace it with a default one. I've also seen some plugins cause this kind of error.
What is happening is this:
http://www.atozqualityfencing.com/newsite
is sent to:
http://www.atozqualityfencing.com/newsite/
Note the trailing slash.
But that page is returning a 404 error.
If I go to
http://www.atozqualityfencing.com/newsite/index.php it redirects to
http://www.atozqualityfencing.com/newsite/
So there is likely something wrong in the redirect rules. I would try disabling all plugins. If that fails, compare the current .htaccess to a default one and remove any modifications.
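For reference, and with the caveat that I can't see your actual file, a stock WordPress .htaccess contains nothing but the block below (RewriteBase and the final rewrite target are / for a root copy or /newsite/ for the subdirectory copy). Anything beyond this block is a customization worth reviewing:
```
# Stock WordPress rewrite rules, shown here only as a comparison baseline.
# Adjust "/" to "/newsite/" if this copy of the file lives in the subdirectory.
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```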
-
Wondering if anyone else out there has some insight as to whether the information in my previous post seems to be correct.
-
Oye, Jeff - this is a little bit over my head so bear with me as I work it through.
I went to redbot.org and entered the URL where the main website is actually living (http://www.atozqualityfencing.com/newsite). I received this information:
HTTP/1.1 301 Moved Permanently
Date: Sun, 24 Aug 2014 14:56:10 GMT
Server: Apache
Location: http://www.atozqualityfencing.com/newsite/
Cache-Control: max-age=3600
Expires: Sun, 24 Aug 2014 15:56:10 GMT
Content-Length: 326
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1

When I clicked on the URL listed under Location above, I received the following information:

HTTP/1.1 404 Not Found
Date: Sun, 24 Aug 2014 14:59:59 GMT
Server: Apache
X-Pingback: http://www.atozqualityfencing.com/newsite/xmlrpc.php
Expires: Wed, 11 Jan 1984 05:00:00 GMT
Cache-Control: no-cache, must-revalidate, max-age=0
Pragma: no-cache
Vary: Accept-Encoding,User-Agent
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
This has me confused, and I'm wondering if the method used for making the revised website live is either not good or is missing something. Here are the articles that were followed for "moving" the newsite redesign to the live URL:
http://codex.wordpress.org/Giving_WordPress_Its_Own_Directory
http://codex.wordpress.org/Moving_WordPress#When_Your_Domain_Name_or_URLs_Change
Can you provide any further assistance? Thanks, Janet
-
A 503 error is a service unavailable error. I have seen situations where redirects are incorrect and loop. Depending on the hosting setup, this can trigger various HTTP error codes.
The best way to debug this is by looking at your Apache access logs. Scan your logs for the 503 errors. Pay attention to the URL being requested as well as the referring URL.
Very likely there's some looping process; when Apache runs PHP through FastCGI, a redirect loop can spin up enough processes that requests start failing with a 503.
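As a purely hypothetical illustration (these rules are invented for this example, not taken from your file), two conflicting rules like the pair below will bounce a request between /newsite and /newsite/ indefinitely, and each bounce is another request the server has to handle:
```
# Hypothetical example of conflicting rules that create a redirect loop.
# Rule 1 forces a trailing slash onto every request...
RewriteCond %{REQUEST_URI} !/$
RewriteRule ^(.*)$ /$1/ [R=301,L]
# ...while rule 2 strips it again, so the two redirects chase each other
# forever and the request never reaches WordPress.
RewriteRule ^(.*)/$ /$1 [R=301,L]
```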
Also, due to how WP handles 404's, I've seen many plugins mask underlying causes. So if you have any plugins that impact error handling, you may need to remove those while debugging.
You can also use http://www.redbot.org/ to check the headers for any page that should be redirected. That tool should return a Location header with a URL. Visit that Location URL in your browser and make sure it resolves.
The goal here is to replicate the behavior. Once you can reproduce it, dig into your redirect/rewrite rules and examine the logic to determine why you are seeing the loops or failures.
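For comparison, here is a sketch of one common root .htaccess pattern for serving a site that lives in /newsite from the main URL (the hostname is taken from this thread). It is not necessarily what your file should contain, since the Codex article you followed copies index.php and .htaccess to the root instead, but it shows the kind of logic the rules need to express without looping:
```
# Reference sketch only: serve the /newsite install from the bare domain.
# Your setup may legitimately differ if you followed the Codex
# "Giving WordPress Its Own Directory" approach instead.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(www\.)?atozqualityfencing\.com$
RewriteCond %{REQUEST_URI} !^/newsite/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /newsite/$1
RewriteCond %{HTTP_HOST} ^(www\.)?atozqualityfencing\.com$
RewriteRule ^(/)?$ newsite/index.php [L]
</IfModule>
```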
Related Questions
-
Sitemap issue - Tons of 404 errors
We've recreated a client site in a subdirectory (mysite.com/newsite) of his domain and, when it was ready to go live, added code to the .htaccess file in order to display the revamped website on the main URL. These are the directions that were followed to do this: http://codex.wordpress.org/Giving_WordPress_Its_Own_Directory and http://codex.wordpress.org/Moving_WordPress#When_Your_Domain_Name_or_URLs_Change. This has worked perfectly except that we are now receiving a lot of 404 errors, and I'm wondering if this isn't the root of our evil. This is a WordPress self-hosted website and we are actively using the WordPress SEO plugin, which creates multiple folders with only 50 links in each. The sitemap_index.xml file tests well in Google Analytics but is pulling a number of links from the subdirectory folder. I'm wondering if it really is the manner in which we made the site live that is our issue or if there is another problem that I cannot see yet. What is the best way to attack this issue? Any clues? The site in question is www.atozqualityfencing.com https://wordpress.org/plugins/wordpress-seo/
Technical SEO | | JanetJ0 -
What's the best Blogging platform
A year ago an SEO specialist evaluated my WordPress site and said she had seen lower rankings for WordPress sites in general. We moved our site off any CMS and designed it in HTML5. Our blog, however, is still on WordPress. I'm thinking about moving to the Ghost platform because I only need a blog. The drawbacks are one author, no recent-post lists, and no meta tags. Is it worth it to move the site off WordPress? Will it affect my rankings much if I have great content? Does anyone have experience with or opinions on Ghost?
Technical SEO | | RoxBrock0 -
HTTP to HTTPS - is a '302 object moved' redirect losing me link juice?
Hi guys, I'm looking at a new site that's completely under HTTPS. When I look at the HTTP variant, it redirects to the HTTPS site with "302 object moved" within the code. I got this by loading the HTTP and HTTPS variants into Webmaster Tools as separate sites and then doing a 'fetch as Google' across both. There is some traffic coming through the HTTP option, and as people start linking to the new site I'm worried they'll link to the HTTP variant and the 302 redirect to the HTTPS site will lose me ranking juice from those links. Is this a correct scenario, and if so, should I prioritise moving the 302 to a 301? Cheers, Jez
Technical SEO | | jez0000 -
How do I solve the meta description "A description for this result is not available because of this site's robots.txt"?
Hi, I have many marketing URLs that 301 redirect to actual pages on my company's site. My URL provider says the load from bot requests on those URLs is too high, so they put a robots.txt on the redirection server! Strange or not? Now I have this meta description on all of my campaign URLs that 301 redirect: "A description for this result is not available because of this site's robots.txt." If you have the perfect solution, could you share it with me? Thank you.
Technical SEO | | Vale70 -
404's and duplicate content.
I have real-estate-based websites that add new pages when new listings come on the market and then delete pages when the property is sold. My concern is that a significant number of 404s are created, and the listing pages that are added are going to be the same as others in my market who use the same IDX provider. I can go with a different IDX provider that uses an iframe, which doesn't create new pages, but I used an iframe before and my time on site was 3 min with 2.5 pages per visit, and now it's 7.5 pages per visit with 6+ min on the site. The new pages create fresh content daily, so which is better: fresh content and better on-site metrics (with the 404s), or fewer 404s and no duplicate content but weaker on-site metrics? Any thoughts on this issue? Any advice would be appreciated.
Technical SEO | | AnthonyLasVegas0 -
Blocking URLs with specific parameters from Googlebot
Hi, I've discovered that Googlebot is voting on products listed on our website and, as a result, is creating negative ratings by placing votes from 1 to 5 on every product. The voting function is handled using JavaScript, as shown below, and the script prevents multiple votes, so most products end up with a vote of 1, which translates to "poor". How do I go about using robots.txt to block a URL with specific parameters only? I'm worried that I might end up blocking the whole product listing, which would result in de-listing from Google and the loss of many highly ranked pages. DON'T want to block: http://www.mysite.com/product.php?productid=1234 WANT to block: http://www.mysite.com/product.php?mode=vote&productid=1234&vote=2 JavaScript button code: onclick="javascript: document.voteform.submit();" Thanks in advance for any advice given. Regards,
Asim
Technical SEO | | aethereal0 -
Issue with 'Crawl Errors' in Webmaster Tools
Have an issue with a large number of 'Not Found' web pages being listed in Webmaster Tools. In the 'Detected' column, the dates are recent (May 1st - 15th). However, clicking into the 'Linked From' column, all of the link sources are old, many from 2009-10. Furthermore, I have checked a large number of the source pages to double-check that the links don't still exist, and they don't, as I expected. Firstly, I am concerned that Google thinks there is a vast number of broken links on this site when in fact there is not. Secondly, if the errors do not actually exist (and never actually have), why do they remain listed in Webmaster Tools, which claims they were found again this month?! Thirdly, what's the best and quickest way of getting rid of these errors? Google advises that using the 'URL Removal Tool' will only remove the pages from the Google index, NOT from the crawl errors. The guidance is that if they keep returning 404s, they will automatically be removed. Well, I don't know how many times they need to get that 404 in order to get rid of a URL and link that haven't existed for 18-24 months?! Thanks.
Technical SEO | | RiceMedia0 -
Crawl issues / .htaccess issues
My site is getting crawl errors inside of Google Webmaster Tools. Google believes a lot of my links point to index.html when they really do not. That is not the problem, though; it's that Google can't give credit for those links to any of my pages. I know I need to create a rule in the .htaccess, but the last time I did it I got an error. I need some assistance on how to go about doing this; I really don't want to lose the weight of my links. Thanks
Technical SEO | | automart0