404 page not found after site migration
-
Hi,
A question from our developer.
We have an issue in Google Webmaster Tools.
A few months ago we killed off one of our e-commerce sites and set up another to replace it. The new site uses different software on a different domain. I set up a mass 301 redirect that would redirect any URLs to the new domain, so domain-one.com/product would redirect to domain-two.com/product. As it turns out, the new site doesn’t use the same URLs for products as the old one did, so I deleted the mass 301 redirect.
We’re getting a lot of URLs showing up as 404 not found in Webmaster Tools. These URLs used to exist on the old site and were linked to from the old sitemap. Even URLs that only recently started showing up as 404 are reported as being linked from the old sitemap. The old sitemap no longer exists and has been returning a 404 error for some time now. Normally I would set up 301 redirects for each one and mark them as fixed, but there are almost a quarter of a million URLs returning 404 errors, and the number is rising.
I’m sure there are some genuine problems that need sorting out in that list, but I just can’t see them under the mass of errors for pages that have been redirected from the old site. Because of this, I’m reluctant to set up a robots file that disallows all of the 404 URLs.
The old site is no longer in the index. Searching Google for site:domain-one.com returns no results.
Ideally, I’d like anything that was linked from the old sitemap to be removed from Webmaster Tools and for Google to stop attempting to crawl those pages.
Thanks in advance.
-
I agree that the 301 redirect would be your best option, as you can pass along not only users but also the bots to the right page. You may need to get a developer in to write some regular expressions to parse the incoming request and then automatically find the correct new URL. I have worked on sites with a large number of pages, and using some sort of automation is the only way to go.
That said, if you simply want to kill the old URLs you can show 404s or 410s. As you mention, you then end up with a bunch of 404 errors in GWT. I have been there too; it's damned if you do, damned if you don't. We had some tracking URLs from an old site, and a year later (we have been showing 410s on them for over a year) they still show up in GWT as errors.
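To give a rough idea of what that automation could look like, here is a minimal Python sketch. The URL patterns and the new-site paths (`/product/...`, `/shop/...`, and so on) are made up for illustration, since the thread doesn't say what either site's URLs actually look like; only the two domain names come from the question.

```python
import re
from typing import Optional

# Hypothetical rules: each pairs a regex for an old-site path with a
# template for the matching URL on the new site. Real rules would have
# to be written against the actual old and new URL structures.
REDIRECT_RULES = [
    (re.compile(r"^/product/(?P<slug>[a-z0-9-]+)/?$"),
     "https://domain-two.com/shop/{slug}"),
    (re.compile(r"^/category/(?P<name>[a-z0-9-]+)/?$"),
     "https://domain-two.com/c/{name}"),
]


def map_old_path(path: str) -> Optional[str]:
    """Return the new URL to 301 to, or None if no rule matches."""
    for pattern, template in REDIRECT_RULES:
        match = pattern.match(path)
        if match:
            # Fill the template with the named groups captured from the old path.
            return template.format(**match.groupdict())
    return None
```

Anything that comes back as `None` has no known equivalent on the new site, so those are the URLs you would leave as 404/410 rather than redirect.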
We are trying a new solution for how to remove these URLs from the index without getting 404 errors. We show a 200 and then we put up a minimal html page with the meta robots noindex tag.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93710
"When we see the noindex meta tag on a page, Google will completely drop the page from our search results, even if other pages link to it. "
So, we allow Google to find the page, get a 200 (so no 404 errors), but then use the meta noindex tag to tell Google to remove it from the index and stop crawling the page.
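As a sketch of what I mean (not our actual code), here is a tiny Python WSGI app that answers with a 200 and a minimal page carrying the meta robots noindex tag. The name `retired_page_app` and the page text are just placeholders; you would wire something like this up only for the old URLs you want dropped.

```python
# Minimal HTML body whose only job is to carry the noindex tag.
NOINDEX_PAGE = (
    b"<!DOCTYPE html>\n"
    b"<html><head>"
    b'<meta name="robots" content="noindex">'
    b"<title>Page removed</title>"
    b"</head><body><p>This page is no longer available.</p></body></html>"
)


def retired_page_app(environ, start_response):
    """WSGI app: reply 200 OK with the noindex page for any request."""
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(NOINDEX_PAGE))),
    ])
    return [NOINDEX_PAGE]
```

To try it locally you could serve it with the standard library, e.g. `wsgiref.simple_server.make_server("", 8000, retired_page_app).serve_forever()`.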
Remember, this is the "nuclear" option. You only want to do this to remove the pages from the Google index. Someone mentioned using GWT to remove URLs, but if I remember correctly, you only have so many pages you can do this with at a time.
If you list the files in robots.txt, Google will not spider them, but if you later remove them from robots.txt, Google will start trying to spider them again. I have seen Google come back a year later for URLs after I took them out of robots.txt. This is what happened to us, so we tried just showing the 410/404, but Google still kept crawling. We recently moved to the 200 plus meta noindex option and it seems to be working.
Good luck!
-
You can, but the 404s should stop being crawled on their own. There's also a webmaster tool that you can use to make that happen faster:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=64033
-
Yeah it's a 404 http://www.tester.co.uk/17th-edition-equipment/multifunction-testers/fluke-1651b-multifunction-installation-tester
With over 200,000 404s, it's a lot to go through and 301. For some reason, when the site got migrated, they just pointed the old URLs at the new domain, replacing the root domain name without creating matching URLs. Doh.
I was thinking about listing them all in robots.txt?
-
A 404 should cause Google to de-index the content. Go to one of the bad URLs and view the headers to make sure that your webserver is returning a status 404 and not just a 404 "page".
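If you'd rather script that header check than eyeball it in a browser, something along these lines would do as a rough sketch. `fetch_status` uses only the standard library; `looks_like_soft_404` is a crude heuristic of my own (a 200 whose body says "not found"), not anything Google publishes.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url (redirects are followed)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the code is what we are after.
        return error.code


def looks_like_soft_404(status: int, body: str) -> bool:
    """Crude check: a 200 response whose body reads like an error page."""
    return status == 200 and "not found" in body.lower()
```

A real 404 or 410 from `fetch_status` is what you want to see; a 200 on a page that merely looks like an error page is the case to fix.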
As hard and time-consuming as it might be, I would still pursue a 301 option. It's the cleanest way to resolve the issue. Just start nibbling at it and you can make a dent. Doing nothing just lets the problem grow.