Sitemap issue? 404s & 503s are regenerating?
-
I am using the WordPress SEO plugin by Yoast to generate a sitemap on http://www.atozqualityfencing.com. Last month, I had an associate create redirects for over 200 404 errors. She did this via the .htaccess file. Today, the same number of 404s are back, along with a number of 503 errors. This new WordPress website was built in a subdirectory and made live by simply entering some code into the .htaccess file to direct browsers to the content we wanted live. In other words, the content actually resides in a subdirectory titled "newsite" but is shown live on the main URL.
Can you tell me why we are having these 404 & 503 errors? I have no idea where to begin looking.
-
You likely have a .htaccess issue causing a rewrite error. You may want to examine your .htaccess or replace it with a default one. I've also seen some plugins cause this error.
What is happening is this:
http://www.atozqualityfencing.com/newsite
is sent to:
http://www.atozqualityfencing.com/newsite/
Note the trailing slash.
But that page is returning a 404 error.
If I go to
http://www.atozqualityfencing.com/newsite/index.php it redirects to
http://www.atozqualityfencing.com/newsite/
So there is likely something wrong in the redirect rules. I would try disabling all plugins. If that fails, compare the current .htaccess to a default one (there's a sketch of the stock block below) and remove any modifications.
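For reference, the stock rewrite block WordPress writes into the .htaccess of a subdirectory install looks roughly like this (a sketch assuming the install lives in /newsite/; your host or plugins may have added more on top of it):

```
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /newsite/
RewriteRule ^index\.php$ - [L]
# send requests for files/directories that don't exist to WordPress
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /newsite/index.php [L]
</IfModule>
# END WordPress
```

Anything beyond a block like this, including the 200-plus manual redirects, is where I'd hunt for the rule that leaves http://www.atozqualityfencing.com/newsite/ answering with a 404.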
-
Wondering if anyone else out there has insight as to whether the information in my previous post is correct.
-
Oye, Jeff - this is a little bit over my head so bear with me as I work it through.
I went to redbot.org and entered the URL where the main website actually lives (http://www.atozqualityfencing.com/newsite). I received this information:
```
HTTP/1.1 301 Moved Permanently
Date: Sun, 24 Aug 2014 14:56:10 GMT
Server: Apache
Location: http://www.atozqualityfencing.com/newsite/
Cache-Control: max-age=3600
Expires: Sun, 24 Aug 2014 15:56:10 GMT
Content-Length: 326
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1
```
When I clicked on the URL listed under Location above, I received the following:
```
HTTP/1.1 404 Not Found
Date: Sun, 24 Aug 2014 14:59:59 GMT
Server: Apache
X-Pingback: http://www.atozqualityfencing.com/newsite/xmlrpc.php
Expires: Wed, 11 Jan 1984 05:00:00 GMT
Cache-Control: no-cache, must-revalidate, max-age=0
Pragma: no-cache
Vary: Accept-Encoding,User-Agent
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
```
This has me confused, and I'm wondering if the method used for making the revised website live is either not good or is missing something. Here are the articles that were followed for "moving" the newsite redesign to the live URL:
```
http://codex.wordpress.org/Giving_WordPress_Its_Own_Directory
http://codex.wordpress.org/Moving_WordPress#When_Your_Domain_Name_or_URLs_Change
```
Can you provide any further assistance? Thanks, Janet
-
A 503 error is a service unavailable error. I have seen situations where redirects are incorrect and loop. Depending on the hosting setup, this can trigger various HTTP error codes.
The best way to debug this is by looking at your Apache access logs. Scan your logs for the 503 errors. Pay attention to the URL being requested as well as the referring URL.
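If you have shell access, something like this will pull those hits out of an access log in the common combined format (a sketch; the log path, and possibly the field positions, vary by host):

```
# count the URLs and referrers behind the 503 responses
awk '$9 == 503 {print $7, $11}' /path/to/access_log | sort | uniq -c | sort -rn | head

# same thing for the 404s
awk '$9 == 404 {print $7, $11}' /path/to/access_log | sort | uniq -c | sort -rn | head
```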
Very likely there's a looping redirect; in setups where Apache runs on FastCGI, a loop can trigger too many processes and return a 503.
Also, because of how WP handles 404s, I've seen many plugins mask the underlying cause. So if you have any plugins that affect error handling, you may need to disable them while debugging.
You can also use http://www.redbot.org/ to check the headers for any page that should be redirected. That tool should return a Location header with a URL. Visit that Location URL in your browser and make sure it resolves.
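curl from the command line shows the same thing if that's handier (these flags are standard: -I fetches only the headers, -L follows the redirect chain):

```
# headers for the bare URL, without following the redirect
curl -I http://www.atozqualityfencing.com/newsite

# follow each hop and print the headers at every step
curl -IL http://www.atozqualityfencing.com/newsite
```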
The goal here is to replicate the behavior. Once you can, dig into your redirect/rewrite rules and examine the logic to work out why you are seeing the loops or failures.