404 page not found after site migration
-
Hi,
A question from our developer.
We have an issue in Google Webmaster Tools.
A few months ago we killed off one of our e-commerce sites and set up another to replace it. The new site uses different software on a different domain. I set up a mass 301 redirect that would redirect any URLs to the new domain, so domain-one.com/product would redirect to domain-two.com/product. As it turns out, the new site doesn’t use the same URLs for products as the old one did, so I deleted the mass 301 redirect.
We’re getting a lot of URLs showing up as 404 not found in Webmaster Tools. These URLs used to exist on the old site and were linked to from the old sitemap. Even URLs that have only recently shown up as 404s are reported as linked from the old sitemap. The old sitemap no longer exists and has been returning a 404 error for some time now. Normally I would set up 301 redirects for each one and mark them as fixed, but there are almost a quarter of a million URLs returning 404 errors, and the number is rising.
I’m sure there are some genuine problems that need sorting out in that list, but I just can’t see them under the mass of errors for pages that have been redirected from the old site. Because of this, I’m reluctant to set up a robots file that disallows all of the 404 URLs.
The old site is no longer in the index. Searching Google for site:domain-one.com returns no results.
Ideally, I’d like anything that was linked from the old sitemap to be removed from Webmaster Tools and for Google to stop attempting to crawl those pages.
Thanks in advance.
-
I agree that the 301 redirect would be your best option, as it passes along not only users but also the bots to the right page. You may need to get a developer in to write some regular expressions that parse the incoming request and automatically find the correct new URL. I have worked on sites with a large number of pages, and using some sort of automation is the only way to go.
That said, if you simply want to kill the old URLs, you can return 404s or 410s. As you mention, you then end up with a bunch of 404 errors in GWT. I have been there too; it's damned if you do, damned if you don't. We had some tracking URLs from an old site, and even though we have been returning 410s on them for over a year, they still show up in GWT as errors.
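To give a rough idea of what that automation could look like, here is a minimal sketch in Python. The old and new path patterns (`/product/...`, `/shop/...`, and so on) are made up for illustration, since the thread does not say how the two sites structure their URLs, and the `https` scheme on domain-two.com is also an assumption. The real rules would come from comparing the old sitemap against the new catalogue.

```python
import re
from typing import Optional

# domain-two.com comes from the question; the https scheme is an assumption.
NEW_DOMAIN = "https://domain-two.com"

# Hypothetical rules: a regex for the old path and a template for the new path.
# Replace these with the patterns your two sites actually use.
REDIRECT_RULES = [
    (re.compile(r"^/product/(?P<slug>[a-z0-9-]+)/?$"), "/shop/{slug}"),
    (re.compile(r"^/category/(?P<name>[a-z0-9-]+)/?$"), "/collections/{name}"),
]


def map_old_url(old_path: str) -> Optional[str]:
    """Return the 301 target for an old path, or None if no rule matches."""
    for pattern, template in REDIRECT_RULES:
        match = pattern.match(old_path)
        if match:
            # Fill the named groups captured from the old path into the new one.
            return NEW_DOMAIN + template.format(**match.groupdict())
    return None
```

Paths that return `None` are the ones that still need a decision (a hand-written redirect, or a deliberate 404/410), so running the old sitemap through a function like this also gives you the list of leftovers.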
We are trying a new solution for removing these URLs from the index without getting 404 errors. We return a 200 and serve a minimal HTML page with the meta robots noindex tag.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93710
"When we see the noindex meta tag on a page, Google will completely drop the page from our search results, even if other pages link to it."
So, we allow Google to find the page, get a 200 (so no 404 errors), but then use the meta noindex tag to tell Google to remove it from the index and stop crawling the page.
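As a sketch of that setup, assuming a Python WSGI server (the thread does not say what stack is in use), the handler below returns a 200 with a minimal noindex page for retired URLs. The `/tracking/` prefix and the page wording are placeholders, not our real configuration.

```python
# Minimal page carrying the meta robots noindex tag described above.
NOINDEX_PAGE = (
    b"<!DOCTYPE html>\n"
    b"<html><head>"
    b'<meta name="robots" content="noindex">'
    b"<title>Page removed</title>"
    b"</head><body><p>This page is no longer available.</p></body></html>"
)

# Hypothetical prefix for the retired URLs; adjust to your own old paths.
RETIRED_PREFIXES = ("/tracking/",)


def app(environ, start_response):
    """WSGI app: 200 + noindex for retired paths, a plain 404 otherwise."""
    path = environ.get("PATH_INFO", "")
    if path.startswith(RETIRED_PREFIXES):
        start_response("200 OK", [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(NOINDEX_PAGE))),
        ])
        return [NOINDEX_PAGE]
    body = b"Not Found"
    start_response("404 Not Found", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

To try it locally, something like `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` from the standard library should be enough.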
Remember, this is the "nuclear" option. You only want to do this to remove the pages from the Google index. Someone mentioned using GWT to remove URLs, but if I remember correctly, you only have so many pages you can do this with at a time.
If you list the files in robots.txt, Google will not spider them, but if you later remove them from the robots.txt file, Google will start trying to spider them again. I have seen Google come back a year later on URLs when I take them out of robots.txt. This is what happened to us, so we tried just showing the 410/404, but Google still keeps crawling. We recently moved to the 200/noindex meta option and it seems to be working.
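For anyone weighing the robots.txt route, a quick way to see what a given rule would block is Python's standard `urllib.robotparser`. The `/old-products/` rule below is a made-up example, not something taken from either site, and the standard-library parser only approximates how Googlebot itself reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content blocking one retired directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /old-products/
"""


def is_crawlable(path: str, user_agent: str = "Googlebot") -> bool:
    """Report whether the rules above allow user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, "http://domain-one.com" + path)
```

Keep in mind that a URL blocked this way is never fetched, so the crawler cannot see a noindex tag on it; the two approaches do not combine on the same URL.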
Good luck!
-
You can, but the 404s should stop being crawled on their own. There's also a webmaster tool that you can use to make that happen faster:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=64033
-
Yeah, it's a 404: http://www.tester.co.uk/17th-edition-equipment/multifunction-testers/fluke-1651b-multifunction-installation-tester
With over 200,000 404s, it's a lot to go through and 301. For some reason, when the site got migrated they just pointed the old URLs at the new domain, replacing the root domain name without creating matching URLs. Doh.
I was thinking about listing them all in robots.txt?
-
A 404 should cause Google to de-index the content. Go to one of the bad URLs and view the headers to make sure that your webserver is returning a 404 status code and not just a 404 "page" served with a 200.
As hard and time-consuming as it might be, I would still pursue a 301 option. It's the cleanest way to resolve the issue. Just start nibbling at it and you can make a dent. Doing nothing just lets the problem grow.
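If you would rather script that header check than do it by hand, a small sketch using only Python's standard library could look like this. The function names are my own, and `urlopen` follows redirects, so the number you get back is the status of the final URL.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the final HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is on the exception.
        return error.code


def is_gone(status: int) -> bool:
    """True when the server itself reports the page as missing (404 or 410)."""
    return status in (404, 410)
```

A page that shows a "not found" message but makes `fetch_status` return 200 is the case to watch for, because the server is telling crawlers the page still exists.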