Are 17000+ Not Found (404) Pages OK?
-
Very soon, our website will undergo a rapid change that will result in us removing 95% or more of our old pages (right now, our site has around 18,000 pages indexed).
It's changing into something different (from B2C to B2B), so our site design, content, etc. will change.
Even our blog section would have more than 90% of the content removed.
What would the ideal scenario be?
- Remove all pages and let those links be 404 pages
- Remove all pages and 301 redirect them to the home page
- Remove all unwanted pages and 301 redirect them to a separate page explaining the change (although it wouldn't be that relevant since our audience has completely changed). I doubt this would be ideal, since at some point we'd need to remove that page as well and set up another redirect.
-
Mohit,
Tom's advice will help you determine which pages are worth redirecting and which should just go to a 404 page (which should be customized instead of the browser/host default, and should also return a 404 response code in the HTTP header!). My guess is that pages with links only from scraper sites aren't going to pass the tests laid out by Tom and thus would just go to a 404 page. However, any that have decent external links would fit the criteria and would be candidates for a 301 redirect.
-
Just to add a little to this great reply...
Here is how I would determine if it was worth my time to keep some of the old pages.
If the industry is the same but the end user is different, I would make EVERY attempt to keep those old pages. AuthorRank will matter in the future, and if you can contribute that information to a particular rel=publisher, then I think it will be totally worth the time.
If, however, the information has nothing to do with the industry, then I wouldn't even consider taking the time to figure all of this out. I would have a kick ass 404 page to help people find your new stuff though.
Remember too that when you 301 redirect you do in fact lose some "link juice" (I really hate that phrase). So if the incoming links are of little to no value, then a 301 will provide even less.
-
Hi Tom.. Thank you for your advice.
The thing is, we don't want to retain the users. They are not going to serve our cause anymore. (We used to spend thousands of dollars every month on server costs just to keep up with the load. Now we are cutting that down, so unwanted users are not something we want, as they would only increase the load.)
I'll surely follow your advice on OSE. The thing is, we have a lot of links to the pages from scraper sites. I am not sure they're worth keeping, though.
-
Hi there
17,000 is quite a lot. I would look at maybe redirecting some of the URLs and I would do this based on certain criteria.
First of all, it helps to have a complete list of your current URLs. Screaming Frog is a great tool for this and is free.
Once you have your URLs, go into your analytics data and see which pages are attracting users. Take a sample of about 2-3 months. If you're using Google Analytics, click on traffic sources -> sources -> all traffic on the left-hand side.
When the dashboard loads, next to the "Primary Dimension" click other, and from the drop down menu click traffic sources, then landing page.
Any page with more than 5 or 10 visitors could be one worth redirecting. If these are pages that visitors frequently use to get to your site, redirecting them helps avoid interrupting their user journey. A 404 might put them off and send them elsewhere.
Next, I'd look at what pages you might want to save to keep your SEO "strength". Put your URL into OpenSiteExplorer and then, once done, click on "top pages". We're interested in the "Inbound Links" column here. Export the file to a CSV, then sort the URL list in Excel by the Inbound Link total. You can filter out the pages with fewer links here, so for instance you could remove the pages with 3 inbound links or fewer. It's a general way of doing things and isn't foolproof, but you will be left with a list of pages that could be getting decent PageRank/link equity. Manually check those pages and their backlinks and, if you think they're acceptable, make sure you put in a 301 redirect.
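If you'd rather not do the sorting and filtering by hand in Excel, the same step can be sketched in a few lines of Python. This is only a rough illustration: the column names "URL" and "Inbound Links", the example.com addresses, and the `pages_worth_checking` helper are all assumptions made up for the example, so adjust them to match whatever your actual export contains.

```python
import csv
import io


def pages_worth_checking(csv_text, min_links=4, url_col="URL", links_col="Inbound Links"):
    """Return (url, inbound_links) pairs with at least min_links links, most-linked first.

    The column names are assumptions; rename them to match your export.
    """
    kept = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            url = row[url_col]
            # Strip thousands separators such as "1,045" before converting.
            links = int(row[links_col].replace(",", ""))
        except (KeyError, ValueError, AttributeError):
            continue  # skip rows with a missing or non-numeric link count
        if links >= min_links:
            kept.append((url, links))
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Made-up sample data standing in for a real export.
sample_export = """URL,Inbound Links
http://example.com/old-guide,12
http://example.com/old-post,2
http://example.com/old-tool,"1,045"
http://example.com/old-news,3
"""

for url, links in pages_worth_checking(sample_export):
    print(links, url)
```

The default `min_links=4` mirrors the "3 inbound links or fewer" cut-off above. The pages that survive the filter are only candidates; they still need the manual backlink check before you commit to a redirect.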
Anything that doesn't match either of these criteria I would leave for a 404. You may be left with a lot, but Google knows that 404s are a normal part of the web and won't penalise you for them. Check out this webmasters blog link.
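To make the decision rule concrete, here is a minimal sketch of how the two criteria (landing-page visits and inbound links) could be combined for each URL. The thresholds, the `decide` function, and the sample paths and numbers are illustrative assumptions, not fixed rules; pick cut-offs that suit your own data.

```python
def decide(visits, inbound_links, visit_threshold=10, link_threshold=3):
    """Return "301" if a page meets either criterion, otherwise "404".

    visits: landing-page visits over the sample period (e.g. 2-3 months).
    inbound_links: external links pointing at the page.
    """
    if visits > visit_threshold or inbound_links > link_threshold:
        return "301"
    return "404"


# Hypothetical pages with made-up numbers, purely for illustration.
pages = {
    "/popular-post": {"visits": 250, "inbound_links": 0},
    "/well-linked-guide": {"visits": 2, "inbound_links": 15},
    "/forgotten-page": {"visits": 1, "inbound_links": 1},
}

plan = {path: decide(stats["visits"], stats["inbound_links"]) for path, stats in pages.items()}

for path, action in plan.items():
    print(action, path)
```

A "301" here only marks a page as a candidate. Pages that qualify on links alone should still have their backlinks checked by hand first, since links from scraper sites are unlikely to be worth preserving.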
Hope this helps with your decision making!