Reducing pages with canonical & redirects
-
We have a site that has a ridiculous number of pages. It's a directory of service providers, organized by city and by sub-category of the vertical. Every provider is listed on the main city page; when you click on a category, the page shows only the providers who offer that sub-category of the service.
Example:
- colorado/denver - main city page
- colorado/denver/subcat1 - subcategory page
There are 37 subcategories. So, 38 pages that essentially have the same content - minus a provider or two - for each city.
There are approx 40K locations in our database. So rough math puts us at 1.5 million results pages, with 97% of those pages being duplicate content!
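To double-check the rough math, here is a quick sketch using only the figures quoted above (37 subcategories, about 40K locations). The variable names are just for illustration.

```python
# Rough page-count estimate using the figures quoted above.
SUBCATEGORIES = 37      # subcategory pages per city
LOCATIONS = 40_000      # approximate number of city locations

pages_per_city = SUBCATEGORIES + 1            # subcategory pages plus the main city page
total_pages = pages_per_city * LOCATIONS      # all results pages
duplicate_pages = SUBCATEGORIES * LOCATIONS   # pages that repeat the city page content
duplicate_share = duplicate_pages / total_pages

print(f"total pages: {total_pages:,}")
print(f"near-duplicate pages: {duplicate_pages:,} ({duplicate_share:.1%})")
```

That works out to 1,520,000 pages, of which 1,480,000 (about 97.4%) are the near-duplicate subcategory pages.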
This is clearly a problem. But many of these obscure pages do rank and get traffic, and it adds up to a fair amount when you aggregate all of them.
We are about to go through a redesign and want to consolidate pages so we can reduce the dupe content, get crawl budget allocated to more meaningful pages, etc.
Here's what I'm thinking we should do with this site, and I would love to have your input:
- Canonicalize
Before the redesign, use the canonical tag on all the sub-category pages (colorado/denver/subcat1, /subcat2, /subcat3, etc.) to push all the value from those pages to the main city page (colorado/denver).
- 301 Redirect
On the new site (we're moving to a new CMS), we wouldn't publish the duplicate sub-category pages at all, and would 301 redirect the sub-category URLs to the main city page URLs.
We'd still have the sub-categories (keywords) on-page and use some JavaScript filtering to narrow the results.
We could cut to the chase and just do the redirects, but I would like to use canonicalization as an internal proof of concept at my company that getting rid of these pages is a good thing, or at least won't have a negative impact on traffic. That is, by the time we are ready to relaunch, traffic and value will already have been transferred to the /state/city page.
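To make the two phases concrete, here is a minimal sketch of the URL mapping I have in mind, assuming every results URL has the shape /state/city or /state/city/subcategory. The domain and the function names are made up for illustration; they are not from our current CMS or the new one.

```python
from html import escape

BASE_URL = "https://www.example.com"  # hypothetical domain, not our real one


def city_path(path: str) -> str:
    """Return the /state/city path for a city or subcategory path."""
    parts = [p for p in path.strip("/").split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"expected at least state/city, got {path!r}")
    return "/" + "/".join(parts[:2])


def canonical_tag(path: str) -> str:
    """Phase 1: canonical link element for the <head> of a results page."""
    href = escape(BASE_URL + city_path(path), quote=True)
    return f'<link rel="canonical" href="{href}">'


def redirect_rule(path: str):
    """Phase 2: (source, target, status) for the new site, or None for city pages."""
    source = "/" + path.strip("/")
    target = city_path(path)
    return None if source == target else (source, target, 301)


print(canonical_tag("colorado/denver/subcat1"))
print(redirect_rule("/colorado/denver/subcat1"))
print(redirect_rule("/colorado/denver"))
```

The point of sharing `city_path` between both phases is that the canonical target used before the redesign and the 301 target used after it are guaranteed to be the same URL.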
Trying to create the right plan and build my argument. Any feedback you have will help.
-
Hi! We're going through some of the older unanswered questions and seeing if people still have questions or if they've gone ahead and implemented something and have any lessons to share with us. Can you give an update, or mark your question as answered?
Thanks!
-
The best way is to make sure you're using the tag properly and that you have all your angles covered.
There are actually some good posts on SEOmoz about canonicalization; I'll try to find those for you.
-
Awesome feedback! Thanks, David. I would like to hear your thoughts on proper canonicalization when you have a moment. Thanks again.
-
Your plan sounds good but here are a few things I'd like to add.
-
Make sure the dupe pages you're getting rid of are not your main traffic sources. If they are, you'll want to redirect only a few at a time and work through the rest slowly. You don't want to switch to the new CMS, throw up redirects, and lose 85% of your traffic.
-
Make sure you use the proper methods of canonicalization. Don't half-ass it.
-
On the new site, because it is large and deep, make sure the sitemap is regenerated regularly so it stays fresh, that the proper weights are assigned, and that the site is structured properly. Fewer levels = better.
-
Watch your Webmaster Tools.
That is all I have, I think you'll be fine.
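For the first point, a rough way to check how much traffic the dupe pages carry is sketched below. It assumes you can export landing-page visits as a path-to-visits mapping from your analytics; the function name and the sample numbers are made up for illustration only.

```python
def subcategory_traffic_share(visits_by_path):
    """Fraction of visits landing on paths deeper than /state/city."""
    total = sum(visits_by_path.values())
    if total == 0:
        return 0.0
    deep = sum(
        visits
        for path, visits in visits_by_path.items()
        if len([p for p in path.strip("/").split("/") if p]) > 2
    )
    return deep / total


# Made-up sample numbers, not real analytics data.
sample = {
    "/colorado/denver": 600,
    "/colorado/denver/subcat1": 250,
    "/colorado/denver/subcat2": 150,
}
share = subcategory_traffic_share(sample)
print(f"{share:.0%} of landing-page visits go to subcategory URLs")
```

If that share comes out high for the real data, that is the signal to roll the redirects out in small batches rather than all at once.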