Sitemap issue - Tons of 404 errors
-
We've recreated a client site in a subdirectory (mysite.com/newsite) of his domain and when it was ready to go live, added code to the htaccess file in order to display the revamped website on the main url. These are the directions that were followed to do this: http://codex.wordpress.org/Giving_WordPress_Its_Own_Directory and http://codex.wordpress.org/Moving_WordPress#When_Your_Domain_Name_or_URLs_Change. This has worked perfectly except that we are now receiving a lot of 404 errors am I'm wondering if this isn't the root of our evil.
This is a WordPress self-hosted website and we are actively using the WordPress SEO plugin that creates multiple folders with only 50 links in each. The sitemap_index.xml file tests well in Google Analytics but is pulling a number of links from the subdirectory folder.
I'm wondering if it really is the manner in which we made the site live that is our issue or if there is another problem that I cannot see yet. What is the best way to attack this issue? Any clues?
The site in question is www.atozqualityfencing.com
-
Thanks again for the awesome help. I really appreciate your time and effort!!
-
I don't think it would snowball. It should be the end of the issue, as I think google will have found all of the pages it is going to find. You might have some more popup like tags pages and thing like that, but nothing major. I don't know if your webmaster is letting you see the webmaster tools or not, but it has an error date of when it last detected the error. It should look like this, http://screencast.com/t/5a9lpC6o then you can click on the link and pull this window up, http://screencast.com/t/boyAdXGoOLl From there you can see if the links were internal or external that were triggering the 404 pages. It could very well be that external backlinks were triggering them. If they are internal links, to be safe I would search the source of the pages for the links.
Also, Moz's crawler should pick up the 404 errors and let you know if it is still because of links on the site. The 301 redirects will handle the issue if the links were from the old site, but if the links are because of internal links on the new site that are broken, I would find them and fix them with Moz's crawler or Ravens Crawler.
-
Thank you for your insight Lesley! If we do as you suggest, will that be the end of the issue or could it snowball? Wouldn't you think that if there were changes to the site after Google indexed it the next crawl by Google would correct it? Is there a way to get Google to crawl it immediately? Probably not, huh? lol
-
This one is really difficult to tell what has actually gone wrong. I am thinking there might have been changes to the site once google indexed the site for the first time and the point it is at now. I went to the internet archive and I could not see many of the pages, so I do not really know.
The fix however is to write 301 redirects for all of the pages that are pulling a 404, but there is a page that represents them. It looks like some of the pages might have had a url change and others might have been done away with.
-
Thanks for your reply, Lesley. I am checking with the developer as to which exact steps she took to make the site live from a subdirectory. Some of the 404 pages include:
http://www.atozqualityfencing.com/newsite/feed/
http://www.atozqualityfencing.com/fencing-styles/
http://www.atozqualityfencing.com/fence-materials/conact
http://www.atozqualityfencing.com/newsite/conact/
http://www.atozqualityfencing.com/faq/wood-fencing-gallery
http://www.atozqualityfencing.com/faq/vinyl-fencing-gallery
http://www.atozqualityfencing.com/faq/structures-gallery
http://www.atozqualityfencing.com/faq/horse-fencing-gallery
http://www.atozqualityfencing.com/faq/horse-shelter-gallery
http://www.atozqualityfencing.com/conact
http://www.atozqualityfencing.com/author/aaron-smith/wood-fencing-galleryThere are a total of 210 of them.
What other information can I provide to help get this figured out?
-
It is really hard to tell without seeing the errors. Are the pages at the same address as the previous pages? Did you redirect them? Is there something internally wrong that is hard to tell? It would be easier to diagnose if we could the a list of the 404 pages.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
I'm struggling to understand (and fix) why I'm getting a 404 error. The URL includes this "%5Bnull%20id=43484%5D" but I cannot find that anywhere in the referring URL. Does anyone know why please? Thanks
Can you help with how to fix this 404 error please? It appears that I have a redirect from one page to the other, although the referring page URL works, but it appears to be linking to another URL with this code at the end of the the URL - %5Bnull%20id=43484%5D that I'm struggling to find and fix. Thanks
Technical SEO | | Nichole.wynter20200 -
Robots.txt error
Moz Crawler is not able to access the robots.txt due to server error. Please advice on how to tackle the server error.
Technical SEO | | Shanidel0 -
Fetch as Google issues
HI all, Recently, well a couple of months back, I finally got around to switching our sites over to HTTPS://. In terms of rankings etc all looks fine and we have not move about much, only the usual fluctuations of a place or two on a daily basis in a competitive niche. All links have been updated, redirects in place, the usual https domain migration stuff. I am however, troubled by one thing! I cannot for love nor money get Google to fetch my site in GSC. No matter what I have tried it continues to display "Temporarily unreachable". I have checked the robots.txt and it is on a new https:// profile in GSC. Has anyone got a clue as I am stumped! Have I simply become blinded by looking too much??? Site in Q. caravanguard co uk. Cheers and looking forward to your comments.... Tim
Technical SEO | | TimHolmes0 -
What should I do with all these 404 pages?
I have a website that Im currently working on that has been fairly dormant for a while and has just been given a face lift and brought back to life. I have some questions below about dealing with 404 pages. In Google WMT/search console there are reports of thousands of 404 pages going back some years. It says there are over 5k in total but I am only able to download 1k or so from WMT it seems. I ran a crawl test with Moz and the report it sent back only had a few hundred 404s in, why is that? Im not sure what to do with all the 404 pages also, I know that both Google and Moz recommend a mixture of leaving some as 404s and redirect others and Id like to know what the community here suggests. The 404s are a mix of the following: Blog posts and articles that have disappeared (some of these have good back-links too) Urls that look like they used to belong to users (the site used to have a forum) which where deleted when the forum was removed, some of them look like they were removed for spam reasons too eg /user/buy-cheap-meds-online and others like that Other urls like this /node/4455 (or some other random number) Im thinking I should permanently redirect the blog posts to the homepage or the blog but Im not sure what to do about all the others? Surely having so many 404s like this is hurting my crawl rate?
Technical SEO | | linklander0 -
Sitemap indexation
3 days ago I sent in a new sitemap for a new platform. Its 23.412 pages but until now its only 4 pages (!!) that are indexed according to the Webmaster Tools. Why so few? Our stage-enviroment got indexed (more than 50K pages) in a few days by a mistake.
Technical SEO | | Morten_Hjort0 -
Seomoz pages error
Hi
Technical SEO | | looktouchfeel
I have a problem with seomoz, it is saying my website http://www.clearviewtraffic.com has page errors on 19,680 pages. Most of the errors are for duplicate page titles. The website itself doesn't even have 100 pages. Does anyone know how I can fix this? Thanks Luke0 -
Massive Increase in 404 Errors in GWT
Last June, we transitioned our site to the Magento platform. When we did so, we naturally got an increase in 404 errors for URLs that were not redirected (for a variety of reasons: we hadn't carried the product for years, Google no longer got the same string when it did a "search" on the site, etc.). We knew these would be there and were completely fine with them. We also got many 404s due to the way Magento had implemented their site map (putting in products that were not visible to customers, including all the different file paths to get to a product even though we use a flat structure, etc.). These were frustrating but we did custom work on the site map and let Google resolve those many, many 440s on its own. Sure enough, a few months went by and GWT started to clear out the 404s. All the poor, nonexistent links from the site map and missing links from the old site - they started disappearing from the crawl notices and we slowly went from some 20k 404s to 4k 404s. Still a lot, but we were getting there. Then, in the last 2 weeks, all of those links started showing up again in GWT and reporting as 404s. Now we have 38k 404s (way more than ever reported). I confirmed that these bad links are not showing up in our site map or anything and I'm really not sure how Google found these again. I know, in general, these 404s don't hurt our site. But it just seems so odd. Is there any chance Google bots just randomly crawled a big ol' list of outdated links it hadn't tried for awhile? And does anyone have any advice for clearing them out?
Technical SEO | | Marketing.SCG0 -
Duplicate Homepage issue
SEOMOZ says my site has two homepages: www.mysite.com www.mysite.com/ When you go to "www.mysite.com/" the URL changes to "www.mysite.com" Why is this happening and what can I do about it?
Technical SEO | | LucasF0