404 page not found after site migration
-
Hi,
A question from our developer.
We have an issue in Google Webmaster Tools.
A few months ago we killed off one of our e-commerce sites and set up another to replace it. The new site uses different software on a different domain. I set up a mass 301 redirect that would redirect any URLs to the new domain, so domain-one.com/product would redirect to domain-two.com/product. As it turns out, the new site doesn’t use the same URLs for products as the old one did, so I deleted the mass 301 redirect.
We’re getting a lot of URLs showing up as 404 not found in Webmaster Tools. These URLs used to exist on the old site and were linked to from the old sitemap. Even URLs that have only recently started showing up as 404 are reported as linked from the old sitemap. The old sitemap no longer exists and has been returning a 404 error for some time now. Normally I would set up 301 redirects for each one and mark them as fixed, but there are almost a quarter of a million URLs returning 404 errors, and the number is rising.
I’m sure there are some genuine problems that need sorting out in that list, but I just can’t see them under the mass of errors for pages that have been redirected from the old site. Because of this, I’m reluctant to set up a robots file that disallows all of the 404 URLs.
The old site is no longer in the index. Searching Google for site:domain-one.com returns no results.
Ideally, I’d like anything that was linked from the old sitemap to be removed from webmaster tools and for Google to stop attempting to crawl those pages.
Thanks in advance.
-
I agree that the 301 redirect would be your best option, as it passes both users and bots along to the right page. You may need to get a developer in to write some regular expressions that parse the incoming request and automatically find the correct new URL. I have worked on sites with a large number of pages, and some sort of automation is the only way to go.
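To make the regular-expression idea concrete, here is a minimal sketch in Python. The URL patterns, the new path layout, and the names RULES, map_old_url, and respond are all made up for illustration; the real rules depend on how the old and new sites actually structure their URLs. Anything that matches no rule gets a 410 here, which is one possible choice, not the only one.

```python
import re
from typing import Optional, Tuple

# Hypothetical rewrite rules: (pattern for the old path, template for the new path).
# Both the patterns and the new URL layout are assumptions for illustration only.
RULES = [
    (re.compile(r"^/product/(?P<slug>[a-z0-9-]+)/?$"), "/shop/{slug}"),
    (re.compile(r"^/category/(?P<name>[a-z0-9-]+)/?$"), "/c/{name}"),
]

NEW_HOST = "http://domain-two.com"  # placeholder domain taken from the question


def map_old_url(path: str) -> Optional[str]:
    """Return the new absolute URL for an old path, or None if no rule matches."""
    for pattern, template in RULES:
        match = pattern.match(path)
        if match:
            return NEW_HOST + template.format(**match.groupdict())
    return None


def respond(path: str) -> Tuple[int, Optional[str]]:
    """Answer 301 with the mapped URL when a rule matches, otherwise 410 Gone."""
    target = map_old_url(path)
    if target is not None:
        return 301, target
    return 410, None
```

A call such as respond("/product/blue-widget") yields a 301 pointing at the mapped URL, while an unknown path falls through to 410. The same rule table could be ported to the web server's own rewrite configuration once the patterns are known.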
That said, if you simply want to kill the old URLs, you can show 404s or 410s. As you mention, you then end up with a bunch of 404 errors in GWT. I have been there too; it's damned if you do, damned if you don't. We had some tracking URLs from an old site, and even though we have been returning 410s on them for over a year, they still show up in GWT as errors.
We are trying a new solution for removing these URLs from the index without getting 404 errors. We return a 200 and serve a minimal HTML page with the meta robots noindex tag.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93710
"When we see the noindex meta tag on a page, Google will completely drop the page from our search results, even if other pages link to it. "
So, we allow Google to find the page, get a 200 (so no 404 errors), but then use the meta noindex tag to tell Google to remove it from the index and stop crawling the page.
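For anyone wondering what that looks like in practice, here is a rough Python sketch of the response described above. The function name noindex_response and the page wording are my own placeholders, and how you wire it into your server depends on your stack.

```python
from typing import Dict, Tuple

# Minimal HTML body carrying the meta robots noindex tag.
# The title and message text are placeholders.
NOINDEX_PAGE = (
    "<!DOCTYPE html>\n"
    "<html>\n"
    "<head>\n"
    '<meta name="robots" content="noindex">\n'
    "<title>Page removed</title>\n"
    "</head>\n"
    "<body><p>This page is no longer available.</p></body>\n"
    "</html>\n"
).encode("utf-8")


def noindex_response() -> Tuple[int, Dict[str, str], bytes]:
    """Build the status code, headers, and body for a retired URL."""
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        "Content-Length": str(len(NOINDEX_PAGE)),
    }
    # Status 200 (not 404/410), so the crawler fetches the page and reads the tag.
    return 200, headers, NOINDEX_PAGE
```

The important parts are the 200 status and the meta tag in the head; everything else on the page can be as small as you like.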
Remember, this is the "nuclear" option. You only want to do this to remove the pages from the Google index. Someone mentioned using GWT to remove URLs, but if I remember correctly, you only have so many pages you can do this with at a time.
If you list the files in robots.txt, Google will not spider them, but if you later remove them from robots.txt, Google will start trying to spider them again. I have seen Google come back a year later on URLs after I took them out of robots.txt. This is what happened to us, so we tried just showing the 410/404, but Google still keeps crawling. We recently moved to the 200/noindex meta option and it seems to be working.
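If you do experiment with robots.txt, you can check what a rule actually blocks before deploying it. The sketch below uses Python's standard urllib.robotparser; the /old-tracking/ prefix and the is_crawlable helper are hypothetical, and it assumes the retired URLs share a common path prefix so that one Disallow line covers them instead of listing each URL.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one prefix rule instead of listing every old URL.
ROBOTS_TXT = """\
User-agent: *
Disallow: /old-tracking/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(path: str, agent: str = "Googlebot") -> bool:
    """Return True if the given user agent may fetch the path under ROBOTS_TXT."""
    return parser.can_fetch(agent, path)
```

Keep in mind that a URL blocked this way is never fetched, so a crawler cannot see a noindex tag on it. The robots.txt approach and the 200/noindex approach do not combine on the same URL.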
Good luck!
-
You can, but the 404s should stop being crawled on their own. There's also a webmaster tool you can use to make that happen faster:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=64033
-
Yeah it's a 404 http://www.tester.co.uk/17th-edition-equipment/multifunction-testers/fluke-1651b-multifunction-installation-tester
With over 200,000 404s, it's a lot to go through and 301. For some reason, when it got migrated they just pointed each old URL to a new one by replacing the root domain name, without creating matching URLs. Doh.
I was thinking about listing them all in robots.txt?
-
A 404 should cause Google to de-index the content. Go to one of the bad URLs and view the headers to make sure that your webserver is returning a status 404 and not just a 404 "page".
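If you would rather script the header check than do it by hand, something like the following Python sketch works with only the standard library. The helper names fetch_status and describe are mine, and the wording of the labels is just a suggestion. Note that urlopen follows redirects, so for a redirected URL you get the status of the final destination.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the HTTP status code the server answers with."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as exceptions; the code is what we want.
        return error.code


def describe(status: int) -> str:
    """Give a rough label for what a status code means for a page that should be gone."""
    if status in (404, 410):
        return "real not-found status"
    if status == 200:
        return "possible soft 404: error page served with status 200"
    return "other status, check manually"
```

For example, describe(fetch_status(url)) on one of the bad URLs tells you whether the server is sending a true 404 or only a page that looks like one.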
As hard and time consuming as it might be, I would still pursue a 301 option. It's the cleanest way to resolve the issue. Just start nibbling at it and you can make a dent. Doing nothing just lets the problem grow.