What to do about old URLs that don't logically 301 redirect to the current site?
-
Mozzers,
I have changed my site url structure several times.
As a result, I now have a lot of old URLs that don't really logically redirect to anything in the current site.
I started out 404-ing them, but it seemed like Google was penalizing my crawl rate AND wasn't removing them from the index even after crawling them several times. There are way too many (>100k) to use the URL removal tool, even at a directory level.
So instead I took some advice and changed them to return 200, but with a "noindex" meta tag, and set them to render no content. I get fewer errors, but I now have a lot of pages that do this.
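Roughly, each of those pages now returns HTTP 200 with a body like this (a simplified sketch; the title is just a placeholder):

```html
<!DOCTYPE html>
<html>
  <head>
    <!-- asks search engines to drop this URL from the index -->
    <meta name="robots" content="noindex">
    <title>Page removed</title>
  </head>
  <!-- body intentionally left empty -->
  <body></body>
</html>
```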
Should I (a) just 404 them and wait for Google to remove them, (b) keep the 200 with noindex, or (c) is there something else I can do? A 410, maybe?
Thanks!
-
"So instead I took some advice and changed them to 200, but with a "noindex" meta tag and set them to not render any content. I get less errors but I now have a lot of pages that do this."
I would not recommend keeping it that way. You could mass redirect them to the sitemap page if they are passing PageRank and/or some traffic and there is no other logical place to point them.
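If you do go that route on an Apache server, a single pattern rule can handle the whole batch. A minimal sketch, assuming the retired URLs share a path prefix (/old-structure/ and /sitemap.html are made-up placeholders):

```apache
# Hypothetical mod_alias rule: permanently redirect everything under a
# retired URL path to the HTML sitemap page
RedirectMatch 301 ^/old-structure/ /sitemap.html
```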
404s are not really something that can hurt you, provided they are coming from external sources and you aren't linking to dead pages from within your own site. If you are, fix those internal links at the source.
-
I don't think 404 errors hurt your site. If you have that many pages, search engines are most likely crawling your site a lot anyway. Have you set the crawl frequency in your sitemap? On bigger sites that get frequent updates, we set the crawl frequency to daily rather than weekly.
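For reference, crawl frequency is hinted per URL in the sitemap protocol. A minimal sketch (the domain and path are placeholders, and note that changefreq is only a hint, which search engines are free to ignore):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/frequently-updated-page/</loc>
    <!-- a hint to crawlers, not a directive -->
    <changefreq>daily</changefreq>
  </url>
</urlset>
```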
If possible, see whether there are any top-level directories you can submit a URL removal request for. Hopefully this can speed up the process of getting the URLs removed. This can take Google a long time: after changing websites, we still had 404 errors six months later, even after submitting the URL removal request.
Another option is to have the pages return a 410 rather than a 404. A 410 tells the search engine the page is gone and will not be coming back. If you are using some form of cart system or CMS, there may be a way to apply the status code to a large number of pages at once, rather than trying to manually code 100k pages.
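For example, if the retired URLs share a recognizable path prefix and the site runs on Apache, one rule can serve the 410 for all of them. A sketch only, with /old-structure/ as a made-up placeholder:

```apache
# Hypothetical mod_alias rule: return 410 Gone for every URL under a
# retired path ("gone" takes no target URL)
RedirectMatch gone ^/old-structure/
```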
"410 Gone
The requested resource is no longer available at the server and no forwarding address is known. This condition is expected to be considered permanent. Clients with link editing capabilities SHOULD delete references to the Request-URI after user approval. If the server does not know–or has no facility to determine–whether or not the condition is permanent, the status code 404 (Not Found) should be used instead of 410 (Gone). This response is cacheable unless indicated otherwise."Worse case scenero, you could set them to no-index, or just leave them be. Even if they dont lead anywhere logically, they could still bring you traffic. Or redirect them to the closest thing that is on the site currently.
-
JC,
When you say you "started out 404-ing them" and it "seemed like Google was penalizing my crawl rate", etc.: I have not seen Google have any real algorithmic issue with 404s. If your site has 500K pages and 100K of them are 404'd, I do not think it would be a problem for Google per se. (You might have a searcher problem if these were pages that were bookmarked, had lots of links, etc.) My caution would be that if you have a lot of pages on the site with links that still go to the 404 pages, you could run into UX issues.
For me, I would go with the 404s. I think they will get removed over time.
Best
-
When necessary, redirect relevant pages to closely related URLs. Category pages are better than a general homepage.
If the page is no longer relevant, receives little traffic, and a better page does not exist, it's often perfectly okay to serve a 404 or 410 status code.
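Where a close match does exist, a one-to-one 301 per retired URL is cleanest. A hedged sketch for Apache, with invented paths for illustration:

```apache
# Hypothetical mod_alias rules: map individual retired URLs to their
# closest current category pages
Redirect 301 /old-red-widgets /widgets/red/
Redirect 301 /old-blue-widgets /widgets/blue/
```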
-
You could redirect them to something even remotely relevant, even if it's the homepage at the end of the day. Whatever you do is going to take time, and it's going to give you some sort of headache.
What would best suit a user who might land on an old link or somehow reach the page? That is the best way to frame the solution. A helpful custom 404 page or a sensible redirect tends to help here.
Best of luck though.