Category Pages - Canonical, Robots.txt, Changing Page Attributes
-
A site has category pages like this: www.domain.com/category.html, www.domain.com/category-page2.html, etc.
This is producing duplicate meta descriptions (page titles have page numbers in them so they are not duplicate). Below are the options that we've been thinking about:
a. Keep meta descriptions the same except for adding a page number (this would keep internal juice flowing to products that are listed on subsequent pages). All pages have unique product listings.
b. Use canonical tags on subsequent pages and point them back to the main category page.
c. Block subsequent pages with robots.txt.
d. ?
Options b and c will orphan or french fry some of our product pages.
Any help on this would be much appreciated. Thank you.
-
I see. I think the concern is with duplicate content though, right?
-
Either way, it will be tough to go that route and still get indexed. It's a pagination issue that everyone would like a solution to, but there just isn't one. It won't hurt you to do this, but it ultimately won't get all those pages indexed like you want.
-
Disagree. I think you are missing out big time here: category pages are the bread and butter for eCommerce sites. Search engines have confirmed that these pages are of high value for users, and they give you a chance to have optimized static content on a page that also shows product results. All the major e-retailers rely heavily on these pages (Amazon, eBay, Zappos, etc.).
-
Sorry, I don't think I clarified. The page titles and meta descriptions would be unique; however, they would be almost the same except for saying "Page [x]" somewhere within them.
-
Option A doesn't do anything for you. I think the search engines flag duplicated title tags, even with different products on the page.
-
Thanks for the comprehensive response, Ryan; really great info here!
Would option A be out of the question in your mind because the page attributes would be too similar, even though there is unique content on all the subsequent category pages? I know this method isn't typical; however, it would be the most efficient way to address this.
Note: A big downside to this is also the fact that we will have multiple pages targeting the same keyword, however, since internally and externally, the main category pages are getting more link love, would it still hurt to have all those subsequent pages getting indexed?
-
Ahh... the ultimate IA question that still doesn't have a clear answer from the search engines. There was a ton of talk about this at the recent SMX Advanced in Seattle (as there is at almost every one). I will try to summarize the common sentiment that I gathered from other pros. I will not claim that this is the correct way, but for now this is what I heard a bunch of people agree on:
- Use "noindex, follow" on every paginated page except page 1, so the pagination links still get followed.
- Do not block/handle it with robots.txt (in your case, you really can't, since you have no identifying parameters in your URL).
- If you had pagination parameters in the URL, you could also manage those in Google and Bing Webmaster Tools by telling the search engines to ignore those parameters.
- Canonical to page 1 was a strategy that some retailers were using, and others want to try. Google reps tried to say this is not the way to do it, but others claim success from it.
- If you have a "View All" link that displays all the products in a longer form on a single page, canonical to that page (if it's reasonable).
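To make the first and last options above a bit more concrete, here is a rough Python sketch of how the head tags for each paginated page might be chosen. It is only an illustration, not anything the search engines have endorsed: the function names, the http://www.domain.com base URL, and the choice to emit nothing extra on page 1 are all assumptions. The URL pattern (category.html, category-page2.html) is the one from the question.

```python
from html import escape
from typing import List, Optional

BASE_URL = "http://www.domain.com"  # placeholder domain, as in the question


def page_url(category: str, page: int) -> str:
    """Build a URL with the question's pattern: category.html, category-page2.html, ..."""
    suffix = "" if page == 1 else f"-page{page}"
    return f"{BASE_URL}/{category}{suffix}.html"


def head_tags(category: str, page: int, view_all_url: Optional[str] = None) -> List[str]:
    """Return the extra <head> tags for one paginated category page (hypothetical helper)."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    if view_all_url is not None:
        # "View All" option: every paginated page points its canonical at the single long page.
        return [f'<link rel="canonical" href="{escape(view_all_url, quote=True)}">']
    if page == 1:
        # Page 1 stays indexable; this sketch adds nothing extra to it.
        return []
    # Pages 2+: keep them out of the index but let crawlers follow the product links.
    return ['<meta name="robots" content="noindex, follow">']


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(page_url("category", n), head_tags("category", n))
```

Passing a view_all_url switches every page to the canonical option instead, so the two strategies are not mixed on the same page.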
Notes: Depending on how your results/pages are generated, you will need to remember that they probably aren't passing "juice". Links in dynamic content usually are not "flow-through" links from an SEO perspective (and sometimes are not even crawled).
The better approach to not orphaning your product pages is finding ways to link to them from other sources besides the results pages. For larger sites, it's a hassle, but that's a challenge we all face.
Here are some SEO tips for attacking the "orphan" issue:
- If you have product feeds, create a "deal" or "price change" feed. Create a Twitter account that people can follow to get these new deals or price changes on products. Push your feed into tweets, and these will link to your product pages, creating an in-link for search engines to follow.
- You can do the same with blogs or Facebook, but not on a mass scale. Post something a bit more useful for users, like "Top 10 deals of the week" linking to 10 products, or "Favorites for gifts" or something. Over time, you can keep track of which products you recommend, and make sure you eventually hit all your products. Again, the point is creating at least 1 inbound link for search engines to follow.
- Create a static internal "product index page" (this is not your sitemap page, FYI) where, either by category or some other structure, you make a static link to every product page you have on the site. With some extra effort, developers can have these links inserted and updated dynamically, which avoids having to update the page manually.
- Create an XML sitemap index. Instead of clumping everything into 1 XML sitemap for your site, try creating a sitemap index with your product pages in their own sitemap. This may help with indexing those pages.
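As a rough illustration of the sitemap index tip, here is a minimal Python sketch that builds a product sitemap and an index file pointing at it, using the element names from the sitemaps.org protocol. The function names, file names (sitemap-products.xml, sitemap-categories.xml), and product URL are made up for the example; you would feed in your real product URLs.

```python
import xml.etree.ElementTree as ET
from typing import Iterable

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_HEADER = '<?xml version="1.0" encoding="UTF-8"?>\n'


def build_urlset(page_urls: Iterable[str]) -> str:
    """Return one sitemap file listing the given pages (e.g. only product pages)."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in page_urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return XML_HEADER + ET.tostring(root, encoding="unicode")


def build_sitemap_index(sitemap_urls: Iterable[str]) -> str:
    """Return a sitemap index that points at each individual sitemap file."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(root, "sitemap"), "loc").text = url
    return XML_HEADER + ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical URLs: product pages get their own sitemap, separate from categories.
    print(build_urlset(["http://www.domain.com/blue-widget.html"]))
    print(build_sitemap_index([
        "http://www.domain.com/sitemap-categories.xml",
        "http://www.domain.com/sitemap-products.xml",
    ]))
```

Keeping products in their own file also makes it easier to see in Webmaster Tools how many of those specific URLs got indexed.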
Hope that helps? Anyone else want to chime in?
-
I think that, generally speaking, you want to block search engines from indexing your category pages (use your sitemap and robots.txt to do this). I could be totally wrong here, but that is how I set up my sites.