404 page not found after site migration
-
Hi,
A question from our developer.
We have an issue in Google Webmaster Tools.
A few months ago we killed off one of our e-commerce sites and set up another to replace it. The new site uses different software on a different domain. I set up a mass 301 redirect that would redirect any URLs to the new domain, so domain-one.com/product would redirect to domain-two.com/product. As it turns out, the new site doesn’t use the same URLs for products as the old one did, so I deleted the mass 301 redirect.
We’re getting a lot of URLs showing up as 404 not found in Webmaster tools. These URLs used to exist on the old site and be linked to from the old sitemap. Even URLs that are showing up as 404 recently say that they are linked to in the old sitemap. The old sitemap no longer exists and has been returning a 404 error for some time now. Normally I would set up 301 redirects for each one and mark them as fixed, but there are almost quarter of a million URLs that are returning 404 errors, and rising.
I’m sure there are some genuine problems that need sorting out in that list, but I just can’t see them under the mass of errors for pages that have been redirected from the old site. Because of this, I’m reluctant to set up a robots file that disallows all of the 404 URLs.
The old site is no longer in the index. Searching google for site:domain-one.com returns no results.
Ideally, I’d like anything that was linked from the old sitemap to be removed from webmaster tools and for Google to stop attempting to crawl those pages.
Thanks in advance.
-
I agree that the 301 redirect would be your best option as you can pass along not only users but the bots to the right page.. You may need to get a developer in to write some regular expressions to parse the incoming request and then automatically find the correct new URL. I have worked on sites with a large number of pages and using some sort of automation is the only way to go.
That said, if you simply want to kill the old URLs you can show the 404s or 410s. As you mention, then you end up with a bunch of 404 errors in GWT. I have been there too, it's like damned if you do, damned if you don't. We had some URLs that were tracking URLs from an old site and we are now here a year later (been showing 410s for over a year on the old tracking URLs) they still show up in GWT as errors.
We are trying a new solution for how to remove these URLs from the index without getting 404 errors. We show a 200 and then we put up a minimal html page with the meta robots noindex tag.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93710
"When we see the noindex meta tag on a page, Google will completely drop the page from our search results, even if other pages link to it. "
So, we allow Google to find the page, get a 200 (so no 404 errors), but then use the meta noindex tag to tell Google to remove it from the index and stop crawling the page.
Remember, this is the "nuclear" option. You only want to do this to remove the pages from the Google index. Someone mentioned using GWT to remove URLs, but if I remember correctly, you only have so many pages you can do this with at a time.
If you list the files within the robots.txt. Google will not spider the files, but then if you remove the page from robots.txt file, they will start to try spidering again. I have seen Google come back a year later on URLs when I take them out of robots. This is what happened to us and so we tried just showing the 410/404, but Google still keeps crawling. We recently moved to this option with the 200/noindexmeta and it seems to be working.
Good luck!
-
You can but the 404s should stop being crawled on their own. There's a webmaster tool that you can use to make that happen faster as well
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=64033
-
Yeah it's a 404 http://www.tester.co.uk/17th-edition-equipment/multifunction-testers/fluke-1651b-multifunction-installation-tester
with over 200,000 404's its a lot to go through and 301. For some reason they it got migrated they just pointed the old url to a new one replacing the root domain name without creating matching url's. Doh.
I was thinking about robot.txt filling them all?
-
A 404 should cause Google to de-index the content. Go to one of the bad URLs and view the headers to make sure that your webserver is returning a status 404 and not just a 404 "page".
As hard and time consuming as it might be, I would still pursue a 301 option. It's the cleanest way to resolve the issue. Just start nibbling at it and you can make a dent. Doing nothing just lets the problem grow.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Index Page Redirect to Home Page? Best Practices...
Hi, I am wondering what the best practice is when a site has an index page and a home page? I have two pages, listed below, and want to know if I should 301 redirect my "index" page to my standard home page. The home page is where I would like all traffic to fall on for our website. Additionally, I used the rel=canonical tag years ago on the index page to indicate that the home page is the main content. Home Page - https://www.1099pro.com/ (PA 45) Home Page Canonical: rel="canonical" href="https://www.1099pro.com/"/> Index Page - https://www.1099pro.com/index.asp (PA - 33) Index Page Canonical: rel="canonical" href="https://www.1099pro.com/"/> It seems to me that there is some extra juice that could be passed to my home page (which is the page that ranks highly for our major keywords) by 301 redirecting the index page. Is there any reason why I should not do that? Really appreciate any help - especially with extra explanations - for the simple minded like me ;)! -Michael
Web Design | | Stew2220 -
Moving pages to new domain
Hello, Our product pages are ranked #1 on google for our target keywords using our domain e.g. www.olddomain.com/cases/productxyz and sell about 20 products all ranked #1. We have a new company called www.newco.com/case/product1, 2, 3 etc. We use woocommerce e-commerce for both old and new sites. What is the best way to list our old co-products on our new site and move over the #1 rankings? Do we create new products (using our new nice design) in the newco.com woo commerce and then redirect old co links? do we copy and paste all that old content into the newco.com? Totally confused. Thank you!
Web Design | | Jamesmcd031 -
Lots of Listing Pages with Thin Content on Real Estate Web Site-Best to Set them to No-Index?
Greetings Moz Community: As a commercial real estate broker in Manhattan I run a web site with over 600 pages. Basically the pages are organized in the following categories: 1. Neighborhoods (Example:http://www.nyc-officespace-leader.com/neighborhoods/midtown-manhattan) 25 PAGES Low bounce rate 2. Types of Space (Example:http://www.nyc-officespace-leader.com/commercial-space/loft-space)
Web Design | | Kingalan1
15 PAGES Low bounce rate. 3. Blog (Example:http://www.nyc-officespace-leader.com/blog/how-long-does-leasing-process-take
30 PAGES Medium/high bounce rate 4. Services (Example:http://www.nyc-officespace-leader.com/brokerage-services/relocate-to-new-office-space) High bounce rate
3 PAGES 5. About Us (Example:http://www.nyc-officespace-leader.com/about-us/what-we-do
4 PAGES High bounce rate 6. Listings (Example:http://www.nyc-officespace-leader.com/listings/305-fifth-avenue-office-suite-1340sf)
300 PAGES High bounce rate (65%), thin content 7. Buildings (Example:http://www.nyc-officespace-leader.com/928-broadway
300 PAGES Very high bounce rate (exceeding 75%) Most of the listing pages do not have more than 100 words. My SEO firm is advising me to set them "No-Index, Follow". They believe the thin content could be hurting me. Is this an acceptable strategy? I am concerned that when Google detects 300 pages set to "No-Follow" they could interpret this as the site seeking to hide something and penalize us. Also, the building pages have a low click thru rate. Would it make sense to set them to "No-Follow" as well? Basically, would it increase authority in Google's eyes if we set pages that have thin content and/or low click thru rates to "No-Follow"? Any harm in doing this for about half the pages on the site? I might add that while I don't suffer from any manual penalty volume has gone down substantially in the last month. We upgraded the site in early June and somehow 175 pages were submitted to Google that should not have been indexed. A removal request has been made for those pages. Prior to that we were hit by Panda in April 2012 with search volume dropping from about 7,000 per month to 3,000 per month. Volume had increased back to 4,500 by April this year only to start tanking again. It was down to 3,600 in June. About 30 toxic links were removed in late April and a disavow file was submitted with Google in late April for removal of links from 80 toxic domains. Thanks in advance for your responses!! Alan0 -
Attachment Pages
i have hundreds/thousands of images on my site, but for some reason the images on this page - http://indigocarhire.co.uk/top-of-the-range-car-hire/ - are being flagged as attachment pages, meaning im getting errors for duplicate titles, missing metas ect why are these images and only these ones being flagged up, they have been added in exactly the same way as every other image on the site appreciate any advice Thanks
Web Design | | RGOnline0 -
Post vs Pages
Does Google make any distinction between a web page and a blog post? Assuming all else is equal...any reason why a page would rank higher than a post? And that includes a page in WordPress vs a WordPress blog post.
Web Design | | Pinlaser1 -
Will launching this site get my E-commerce site penalized?
Hello.. I am wondering if you guys think launching a site like this is a good or a bad idea. All of the links on it go directly to the exact corresponding page on the ecommerce site. Do you think Google will penalize my site for launching sites (i have many other domains that i will be setting up similar to this) like this? Thanks...
Web Design | | Prime850 -
Two sites in same industry and which shopping cart
Right. So I suspect I am going to sound paranoid here - but you'll all forgive me right?? I am sure I saw a reply to a question on the Q&A suggesting that it was a bad idea to have two sites in one industry as Google may see it as trying to get two bites of the SERP cherry... is this accurate? I have an existing asp.net site in the maternity wear industry here in Australia and am wanting to start another site to appeal to a different customer base... the market is quite broad. There will be a core list of products that are the same between the sites, but also some quite different products. Content, product descriptions and categorys will be different. I have another website that I bought with reasonable age and links in the industry that I was going to 301 to the new site to give it a kick in the juice. So, not wanting to deceive my customers in anyway, I was thinking I would call it a "division of" or "sister site to" the existing ecommerce site, with a single link back and forward between the two sites. Would there be anything wrong with this in googles eyes? Even with same contact details? They would be run on totally different platforms and hosted by totally different providers. Or would you keep them totally seperate and only have contact details in images? Or a step further and have totally different phone numbers etc? Then the shopping cart - I would love some suggestions on which opensourse cart to use, preferrably one that I can set up myself, and that has a good framework for seo. I want to use schema.org, authorship, seo friendly urls all of which I am having trouble getting out of the developer of my asp.net site.... I don't want the new site to be asp.net Thanks in advance!!
Web Design | | catfree0 -
Duplicate content on mobile sites
Hi Guys We are launching a mobile webshop later this year and have decided to use a subdomain for this. (m.domainname.xx). The content will be more or less identical with the one on the standard desktop site (domainname.xx), but im struggeling to find out if this will create dipplicate content between the mobile and desktop site. Does anyone have a solid answer for this one?
Web Design | | AndersDK0