Reducing pages with canonical & redirects
-
We have a site that has a ridiculous number of pages. Its a directory of service providers that is organized by city and sub-category of the vertical. Each provider is on the main city page, then when you click on a category, it will only show those folks who offer that subcategory of this service.
example:
- colorado/denver - main city page
- colorado/denver/subcat1 - subcategory page
There are 37 subcategories. So, 38 pages that essentially have the same content - minus a provider or two - for each city.
There are approx 40K locations in our database. So rough math puts us at 1.5 million results pages, with 97% of those pages being duplicate content!
This is clearly a problem. But many of these obscure pages do rank and get traffic. A fair amount when you aggregate all these pages together.
We are about to go through a redesign and want to consolidate pages so we can reduce the dupe content, get crawl budget allocated to more meaningful pages, etc.
Here's what I'm thinking we should do with this site, and I would love to have your input:
- Canonicalize
Before the redesign use the canonical tag on all the sub-category pages and push all the value from those pages (colorado/denver/subcat1, /subcat2, /subcat3... etc) to the main city page (colorado/denver/subcat1)
- 301 Redirect
On the new site (we're moving to a new CMS) we don't publish the duplicate sub-category pages and do 301 redirects from the sub-category URLs to the main city page urls.
We'd still have the sub-categories (keywords) on-page and use some Javascript filtering to narrow results.
We could cut to the chase and just do the redirects, but would like to use canonicalization as a proof of concept internally at my company that getting rid of these pages is a good thing, or at least wont have a negative impact on traffic. i.e. by the time we are ready to relaunch traffic and value has been transfered to the /state/city page
Trying to create the right plan and build my argument. Any feedback you have will help.
-
Hi! We're going through some of the older unanswered questions and seeing if people still have questions or if they've gone ahead and implemented something and have any lessons to share with us. Can you give an update, or mark your question as answered?
Thanks!
-
The best way is to make sure you're using the tag properly and that you have all your angles covered.
There is actually some good posts on SEOmoz about canonicalization, I'll try and find those for you.
-
awesome feedback! thanks david. would like to hear your thoughts on proper canonicalization when you have a moment. thanks again.
-
Your plan sounds good but here are a few things I'd like to add.
-
Make sure the dupe pages you're getting rid of are not the main traffic sources. If that is the case you'll want to redirect only a few at a time and slowly go around fixing that. You don't want to switch to new CMS, throw up redirects, and lose 85% of your traffic. Just make sure it's not your main traffic source.
-
Make sure you use the proper methods of canonicalization. Don't half-ass it.
-
On the new site, because you have a large and deep site, make sure you have a proper sitemap generated fresh all the time and that the proper weights are assigned and proper structuring. Less levels = better.
-
Watch your Webmaster Tools.
That is all I have, I think you'll be fine.
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Taken a canonical off a page to let it rank with new unique content - what more can I do?
A week ago, I took a canonical off of a page that was pointing to the homepage for a very big, generic search term for my brand as we felt that it could have been harming our rankings (as it wasn't a true canonical page). A week in and our rankings for the term have dropped 7 positions out of page 1 and the page we want to rank instead is nowhere to be seen. Do I hang fire? As such a big search term, it's affecting traffic, but I don't want to make any rash decisions. Here's a bit more info: For arguments sake, let's call the search term we're going after 'Boots', with the URL where the canonical was placed of /boots. The canonical went to the root domain as we sell, well... boots. At the time, the homepage was ranking for Boots on page 1 and we wanted to change this so that the Boots page ranked for that term... all logical right? We did the following: Took off mentions of Boots from meta on the homepage and made sure it was optimised for on the boots page. Took the canonical off of /boots. Used GSC to fetch & ask Google to recrawl "/boots". Resubmitted the sitemap. Do I hang fire on running back to the safety of ranking for boots on the homepage? Do I risk keyword cannibalisation by adding the search terms back to the homepage?
Intermediate & Advanced SEO | | Kelly_Edwards0 -
Soft 404 error for a big, longstanding 301-redirected page
Hi everyone, Years ago, we acquired a website that had essentially 2 prominent homepages - one was like example.com and the other like example.com/htm... They served the same purpose basically, and were both very powerful, like PR7 and often had double listings for important search phrases in Google. Both pages had amassed considerable powerful links to them. About 4 years ago, we decided to 301 redirect the example.com/htm page to our homepage to clean up the user experience on our site and also, we hoped, to make one even stronger page in serps, rather than two less strong pages. Suddenly, in the past couple weeks, this example.com/htm 301-ed page started appearing in our Google Search Console as a soft 404 error. We've never had a soft 404 error before now. I tried marking this as resolved, to see if the error would return or if it was just some kind of temporary blip. The error did return. So my questions are:
Intermediate & Advanced SEO | | Eric_R
1. Why would this be happening after all this time?
2. Is this soft 404 error a signal from Google that we are no longer getting any benefit from link juice funneled to our existing homepage through the example.com/htm 301 redirect? The example.com/htm page still has considerable (albeit old) links pointing to it across the web. We're trying to make sense of this soft 404 observation and any insight would be greatly appreciated. Thanks!
Eric0 -
Ecommerce Site - Duplicate product descriptions & SKU pages
Hi I have a couple of questions regarding the best way to optimise SKU pages on a large ecommerce site. At the moment we have 2 landing pages per product - one is the primary landing page with no SKU, the other includes the SKU in the URL so our sales people & customers can find it when using the search facility on the site. The SKU landing page has a canonical pointing to the primary page as they're duplicates. Is this the best way? Or is it better to have the one page with the SKU in the URL? Also, we have loads of products with the very similar product descriptions, I am working on trying to include a unique paragraph or few sentences on these to improve the content - how dangerous is the duplicate content within your own site? I know its best to have totally unique content, but it won't be possible on a site with thousands of products and a small team. At the moment I am trying to prioritise the products to update. Thank you 🙂
Intermediate & Advanced SEO | | BeckyKey0 -
Consistent Ranking Jumps Page 1 to Page 5 for months - help needed
Hi guys and gals, I have a really tricky client who I just can't seem to gain consistency with in their SERP results. The keywords are competitive but what the main issue I have is the big page jumps that happen pretty much on a weekly basis. We go up and down 40 positions and this behaviour has been going on for nearly 6 months.
Intermediate & Advanced SEO | | Jon_bangonline
I felt it would resolve itself in time but it has not. The website is a large ecommerce website. Their link profile is OK in regards to several high quality newspaper publication links, majority brand related anchor texts and the link building we have engaged in has all been very good i.e. content relevant / high quality places. See below for some potential causes I think could be the reason: The on page SEO is good however the way their ecommerce website is setup they have formed a substantial amount of duplicate title tags. So in my opinion this is a potential cause. The previous web developer set-up 301 redirects all to their home page for any 404 errors. I know best practice is to go to the most relevant pages, however could this be a potential issue? We had some server connectivity issues show up in webmasters tools but that was for 1 day about 4 months ago. Since then no issues. they have quite a few 'blocked URLs' in their robots.txt file, e.g. Disallow: /login, Disallow: /checkout/ but to me these seem normal and not a big issue. We have seen a decrease over the last 12 months in Webmasters Tools of 'total indexed web pages' from 5000 to 2000 which is quite an odd statistic. Summary So all in all I am a tad stumped. We have some duplicate content issues in title tags, perhaps not following best practice in the 301 redirects but other than that I dont see any major on page issues, unless I am missing something in the seriousness of what I have listed.
Finally we have also do a bit of a cull of poor quality links, requesting links to be removed and also submitting a 'disavow' of some really bad links. We do not have a manual penalty though. Thoughts, feedback or comments VERY welcome.0 -
I have removed over 2000+ pages but Google still says i have 3000+ pages indexed
Good Afternoon, I run a office equipment website called top4office.co.uk. My predecessor decided that he would make an exact copy of the content on our existing site top4office.com and place it on the top4office.co.uk domain which included over 2k of thin pages. Since coming in i have hired a copywriter who has rewritten all the important content and I have removed over 2k pages of thin pages. I have set up 301's and blocked the thin pages using robots.txt and then used Google's removal tool to remove the pages from the index which was successfully done. But, although they were removed and can now longer be found in Google, when i use site:top4office.co.uk i still have over 3k of indexed pages (Originally i had 3700). Does anyone have any ideas why this is happening and more importantly how i can fix it? Our ranking on this site is woeful in comparison to what it was in 2011. I have a deadline and was wondering how quickly, in your opinion, do you think all these changes will impact my SERPs rankings? Look forward to your responses!
Intermediate & Advanced SEO | | apogeecorp0 -
What happens when I redirect an entire site to an established page on another site?
Hi There, I have a website which is dedicated to selling ONE product (in different forms) or my main brand site. It is branded similarly, targets similar keywords, and gets some traffic which convert to leads. Additionally, the auxiliary site has a Google Rank 2 in its own right. I am thinking of consolidating this "auxillary" site to the specific product page on my main site. The reason I am considering doing this is to give a "boost" to the main product page on our main site which has many core keywords sitting with SERP ranking of between 11-20 (so not in first 10) Because this auxiliary site it gets traffic and leads in its own right, I don't want this to be to the detriment of my leads overall. Question is - if I 301 redirect the entire domain from my auxillary site to the equivalent product on my main site am I likely to see a large "boost" to that product page? (i.e. will I likely see my ranking rise from 11 - 20 significantly)
Intermediate & Advanced SEO | | love-seo-goodness0 -
Confusing 301 / Canonical Redirect Issue - Wizard Needed
I had two pages on my site with identical content. What I did was 301 redirect one page to the other. I also added canonical redirect code to the page that held the 301 code. Here is what I have: www.careersinmusic.com/music-colleges.aspx - this page was a duplicate and I needed it to resolve to:
Intermediate & Advanced SEO | | 4Buck
www.careersinmusic.com/music-schools.aspx Here is the code I used: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX music-colleges.aspx
<%@ Page Language="VB" AutoEventWireup="false" CodeFile="music-colleges.aspx.vb" Inherits="music_colleges" %>
http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
http://www.w3.org/1999/xhtml"> http://www.careersinmusic.com/music-schools.aspx"/> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
music-colleges.aspx.vb
Partial Class music_colleges
Inherits System.Web.UI.Page
Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
Response.Status = "301 Moved Permanently"
Response.AddHeader("Location", "http://www.careersinmusic.com/music-schools.aspx")
End Sub
End Class XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX The problem:
For some reason, when the search “music colleges” is done in Google, I am #7. When the term “music schools” is done, I am around 119. I MUST be getting a penalty for some reason, I just cannot figure the reason. When perform well for one term and terrible for the next? All I can come up with is a duplicate content penalty or something along those lines. Also, music-colleges.aspx seems to still be in Googles index, even though the above 301 happened months ago. Thoughts? site:www.careersinmusic.com/music-colleges.aspx Any insight into this would be GREATLY appreciated. Many Thanks!0 -
Does having multiple links to the same page influence the Link juice this page is able to pass
Say you have a page and it has 4 outgoing links to the same internal page. In the original Pagerank algo if these links were links to an page outside your own domain, this would mean that the linkjuice this page is able to pass would be devided by 4. The thing is i'm not sure if this is also the case when the outgoing link, is linking to a page on your own domain. I would say that outgoing links (whatever the destination) will use some of your link juice, so it would be better to have 1 outgoing link instead of 4 to the same destination, the the destination will profit more form that link. What are you're thoughts?
Intermediate & Advanced SEO | | TjeerdvZ0