404 page not found after site migration
-
Hi,
A question from our developer.
We have an issue in Google Webmaster Tools.
A few months ago we killed off one of our e-commerce sites and set up another to replace it. The new site uses different software on a different domain. I set up a mass 301 redirect that would redirect any URLs to the new domain, so domain-one.com/product would redirect to domain-two.com/product. As it turns out, the new site doesn’t use the same URLs for products as the old one did, so I deleted the mass 301 redirect.
We’re getting a lot of URLs showing up as 404 not found in Webmaster Tools. These URLs used to exist on the old site and were linked to from the old sitemap. Even URLs that have only recently shown up as 404 are reported as being linked from the old sitemap. The old sitemap no longer exists and has been returning a 404 error for some time now. Normally I would set up 301 redirects for each one and mark them as fixed, but there are almost a quarter of a million URLs returning 404 errors, and the number is rising.
I’m sure there are some genuine problems that need sorting out in that list, but I just can’t see them under the mass of errors for pages that have been redirected from the old site. Because of this, I’m reluctant to set up a robots file that disallows all of the 404 URLs.
The old site is no longer in the index. Searching Google for site:domain-one.com returns no results.
Ideally, I’d like anything that was linked from the old sitemap to be removed from Webmaster Tools and for Google to stop attempting to crawl those pages.
Thanks in advance.
-
I agree that the 301 redirect would be your best option, as you can pass along not only users but also the bots to the right page. You may need to get a developer in to write some regular expressions to parse the incoming request and then automatically find the correct new URL. I have worked on sites with a large number of pages, and using some sort of automation is the only way to go.
That said, if you simply want to kill the old URLs you can show 404s or 410s. As you mention, you then end up with a bunch of 404 errors in GWT. I have been there too; it's damned if you do, damned if you don't. We had some tracking URLs from an old site, and even though we have been showing 410s on them for over a year, they still show up in GWT as errors.
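As a rough illustration of the automation described above, here is a minimal Python sketch of regex-based redirect mapping. The patterns, the new URL templates, and the `https://` scheme are all made up for the example; the real rules depend entirely on how the old and new sites structure their URLs.

```python
import re

# Hypothetical rewrite rules: (pattern on the old path, template for the new path).
# These are placeholders, not the real URL structures of either site.
REDIRECT_RULES = [
    (re.compile(r"^/product/(?P<slug>[a-z0-9-]+)/?$"), "/shop/{slug}"),
    (re.compile(r"^/category/(?P<name>[a-z0-9-]+)/page/(?P<n>\d+)/?$"), "/c/{name}?page={n}"),
]

# Placeholder domain from the question; the scheme is an assumption.
NEW_DOMAIN = "https://domain-two.com"


def resolve_redirect(old_path):
    """Return the new absolute URL for an old path, or None if no rule matches."""
    for pattern, template in REDIRECT_RULES:
        match = pattern.match(old_path)
        if match:
            # Fill the template with the named groups captured from the old path.
            return NEW_DOMAIN + template.format(**match.groupdict())
    return None
```

The web server or application would answer with a 301 to the returned URL when a rule matches, and fall back to a 404 or 410 when `resolve_redirect` returns `None`.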
We are trying a new solution to remove these URLs from the index without getting 404 errors. We return a 200 and serve a minimal HTML page with the meta robots noindex tag.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93710
"When we see the noindex meta tag on a page, Google will completely drop the page from our search results, even if other pages link to it."
So, we allow Google to find the page, get a 200 (so no 404 errors), but then use the meta noindex tag to tell Google to remove it from the index and stop crawling the page.
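A minimal sketch of that 200-plus-noindex approach, written as a plain Python WSGI app. The `/tracking/` prefix and the page wording are assumptions for illustration, not our actual setup.

```python
# Minimal HTML body carrying the meta robots noindex tag.
NOINDEX_PAGE = (
    b"<!DOCTYPE html>\n"
    b"<html><head>"
    b'<meta name="robots" content="noindex">'
    b"<title>Page removed</title>"
    b"</head><body><p>This page has been removed.</p></body></html>"
)

# Hypothetical path prefixes for the retired URLs.
RETIRED_PREFIXES = ("/tracking/",)


def app(environ, start_response):
    """WSGI app: retired URLs get a 200 with noindex, everything else a 404."""
    path = environ.get("PATH_INFO", "")
    if path.startswith(RETIRED_PREFIXES):
        headers = [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(NOINDEX_PAGE))),
        ]
        start_response("200 OK", headers)
        return [NOINDEX_PAGE]
    start_response("404 Not Found", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Not found"]
```

Any WSGI server can host this, for example `wsgiref.simple_server.make_server("", 8000, app)` for a local test.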
Remember, this is the "nuclear" option. You only want to do this to remove the pages from the Google index. Someone mentioned using GWT to remove URLs, but if I remember correctly, you only have so many pages you can do this with at a time.
If you list the files in robots.txt, Google will not spider them, but if you later remove them from robots.txt, Google will start trying to spider them again. I have seen Google come back a year later for URLs after I took them out of robots.txt. This is what happened to us, so we tried just showing the 410/404, but Google still kept crawling. We recently moved to the 200/noindex meta option and it seems to be working.
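For reference, a small sketch of how a prefix-based Disallow rule behaves, using the `urllib.robotparser` module from Python's standard library. The `/old-tracking/` prefix is a made-up example. A single prefix rule like this avoids listing hundreds of thousands of individual URLs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks one whole directory of retired URLs.
ROBOTS_TXT = """\
User-agent: *
Disallow: /old-tracking/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_blocked(path, user_agent="Googlebot"):
    """Return True if the rules above tell this crawler not to fetch the path."""
    return not parser.can_fetch(user_agent, path)
```

Here `is_blocked("/old-tracking/click?id=1")` is true and `is_blocked("/product/widget")` is false. Once the Disallow line is removed, nothing stops the crawler from requesting the old URLs again.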
Good luck!
-
You can, but the 404s should stop being crawled on their own. There's also a Webmaster Tools feature that you can use to make that happen faster:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=64033
-
Yeah, it's a 404: http://www.tester.co.uk/17th-edition-equipment/multifunction-testers/fluke-1651b-multifunction-installation-tester
With over 200,000 404s, it's a lot to go through and 301. For some reason, when the site got migrated they just pointed the old URLs to new ones by replacing the root domain name, without creating matching URLs. Doh.
I was thinking about listing them all in robots.txt?
-
A 404 should cause Google to de-index the content. Go to one of the bad URLs and view the headers to make sure that your webserver is returning a status 404 and not just a 404 "page".
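One way to check the header rather than the visible page is a short script like the sketch below, using only Python's standard library. The function names are illustrative, and some servers answer HEAD requests differently from GET, so treat the result as a first check.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url.

    urlopen follows redirects, so this reports the status of the final URL.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as exceptions; the code is what we want.
        return error.code


def describe_status(status):
    """Explain what a status code means for a URL that should be gone."""
    if status in (404, 410):
        return "real not-found response"
    if status == 200:
        return "200 OK: a 'not found' page served this way is a soft 404"
    return "other status, check manually"
```

For example, `describe_status(fetch_status(url))` on one of the bad URLs should report a real not-found response if the server is configured correctly.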
As hard and time-consuming as it might be, I would still pursue a 301 option. It's the cleanest way to resolve the issue. Just start nibbling at it and you can make a dent. Doing nothing just lets the problem grow.