Canonical vs. Nofollow for Duplicate Products
-
I am in the process of migrating a site from Volusion to BigCommerce, and BigCommerce has a limitation: there is no built-in way to display one product in two different ways.
Here is the situation. One of our manufacturers will not allow us to display products to customers who are not logged in. We have convinced them to let us display the products with no prices. We then created an Exclusive Contractor section that allows logged-in users to see prices and purchase the products online. Originally we were going to simply direct users to call to make purchases, as our competitors do, but because a large share of our buyers purchase online, we wanted to work around the limitation and keep online purchasing available.
Since these products will have duplicates with no pricing, I was thinking canonical tags would be best practice. However, everything will be behind a login wall with a message directing people to log in. Since this will undoubtedly create a high bounce rate, I feel I need to nofollow those links. This is a rather large site of over 5,000 pages, so the 250 nofollowed URLs most likely won't have a large impact on the site's overall performance. Or so I hope, anyway. My gut tells me that if these products are going to be technically hidden from the searcher, they should also be hidden from the engines.
Does disallowing these URLs seem like a better approach than simply using canonical tags? Any thoughts or suggestions would be really helpful!
-
They are different things used for different reasons. By using robots.txt to block any page behind a login, you won't have to worry about search engines trying to access that information at all. You should also have self-referencing canonical tags on all pages, especially product pages and landing pages.
-
I didn't think the engines could see that information. So if I understand you correctly, you are saying that blocking the URLs in the robots.txt file is better than using a canonical tag, right?
-
If your duplicate pages are behind a log-in, you will be fine, as search engines cannot see content behind it. You should also block your logged-in pages using your robots.txt.
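Putting the two answers together, here is a minimal, hypothetical sketch of the robots.txt rule (the /exclusive-contractor/ path is made up for illustration), using Python's standard-library robots.txt parser to sanity-check the rule before deploying it:

```python
from urllib import robotparser

# Hypothetical robots.txt for the scenario above: the login-gated
# duplicate product pages live under /exclusive-contractor/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /exclusive-contractor/
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# Public, price-less copy of a product: crawlable.
print(rp.can_fetch("*", "https://example.com/products/widget/"))              # True
# Gated duplicate with pricing and checkout: blocked.
print(rp.can_fetch("*", "https://example.com/exclusive-contractor/widget/"))  # False
```

The public copies would then each carry a self-referencing canonical tag, per the answer above.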
Related Questions
-
Duplicate pages and Canonicals
Hi all, Our website has more than 30 pages which are duplicates, so canonicals have been deployed so that only 10 of these pages show up. Does having more of these pages impact rankings? Thanks
Intermediate & Advanced SEO | vtmoz
-
Schema for Product Categories
We have an e-commerce site and have started to implement schema markup. I've looked around quite a bit but could not find any schemas for product categories. Would there be any schemas to add besides an image, description, and the occasional PDF?
Intermediate & Advanced SEO | Mike.Bean
-
Unpaid Followed Links & Canonical Links from Syndicated Content
I have a user of our syndicated content linking to our detailed source content. The content is being used across a set of related sites and is driving good-quality traffic. The issue is how they link and what it looks like: we have tens of thousands of new links showing up from more than a dozen domains and hundreds of subdomains, but all coming from the same IP, and the growth rate is exponential. The implementation was supposed to include canonical tags so Google could properly identify the owner and not have duplicate syndicated content potentially outranking the source. The canonical links are missing, and the links to us are followed. While the links are not paid for, it looks bad to me. I have asked the vendor to nofollow the links and implement the agreed-upon canonical tag. We have no warnings from Google, but I want to head that off and do the right thing. Is this the right approach? What would you do while waiting on the site owner to make the fixes, to reduce the possibility of Penguin/Google concerns? Blair
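The agreed-upon implementation described here could look roughly like this on the partner's syndicated copy (the URLs are placeholders, not the actual sites involved):

```html
<!-- In the <head> of the syndicated copy: a cross-domain canonical
     pointing at the original source article (placeholder URL). -->
<link rel="canonical" href="https://www.source-site.com/original-article/" />

<!-- Until the canonical is in place, in-body links back to the
     source could carry rel="nofollow", as requested of the vendor: -->
<a href="https://www.source-site.com/original-article/" rel="nofollow">Read the full article</a>
```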
Intermediate & Advanced SEO | BlairKuhnen
-
Avoiding Duplicate Content with Used Car Listings Database: Robots.txt vs Noindex vs Hash URLs (Help!)
Hi Guys, We have developed a plugin that allows us to display used vehicle listings from a centralized, third-party database. The functionality works similarly to autotrader.com or cargurus.com, and there are two primary components:
1. Vehicle Listings Pages: the page where the user applies various filters to narrow the vehicle listings and find the vehicle they want.
2. Vehicle Details Pages: the page where the user actually views the details about a given vehicle. It is served up via Ajax, in a dialog box on the Vehicle Listings Pages. Example functionality: http://screencast.com/t/kArKm4tBo
We do want the Vehicle Listings pages (#1) indexed and ranking. These pages have additional content besides the vehicle listings themselves, those results are randomized or sliced/diced in different and unique ways, and they're updated twice per day.
We do not want to index #2, the Vehicle Details pages, as these pages appear and disappear all the time based on dealer inventory, and don't have much value in the SERPs. Additionally, other sites such as autotrader.com, Yahoo Autos, and others draw from this same database, so we're worried about duplicate content. For instance, entering a snippet of dealer-provided content for one specific listing that Google indexed yielded 8,200+ results: Example Google query.
We did not originally think that Google would even be able to index these pages, as they are served up via Ajax. However, it seems we were wrong, as Google has already begun indexing them. Not only is duplicate content an issue, but these pages are not meant for visitors to navigate to directly! If a user were to navigate to the URL directly from the SERPs, they would see a page that isn't styled right.
Now we have to determine the right solution to keep these pages out of the index: robots.txt, noindex meta tags, or hash (#) internal links.
Robots.txt advantages:
- Super easy to implement
- Conserves crawl budget for large sites
- Ensures the crawler doesn't get stuck. After all, if our website only has 500 pages that we really want indexed and ranked, and vehicle details pages constitute another 1,000,000,000 pages, it doesn't seem to make sense to make Googlebot crawl all of those pages.
Robots.txt disadvantages:
- Doesn't prevent pages from being indexed, as we've seen, probably because there are internal links to these pages. We could nofollow these internal links, thereby minimizing indexation, but this would lead to 10-25 nofollowed internal links on each Vehicle Listings page (will Google think we're PageRank sculpting?)
Noindex advantages:
- Does prevent Vehicle Details pages from being indexed
- Allows ALL pages to be crawled (advantage?)
Noindex disadvantages:
- Difficult to implement: Vehicle Details pages are served via Ajax, so they have no <head> of their own. The solution would have to involve the X-Robots-Tag HTTP header and Apache, sending a noindex directive based on querystring variables, similar to this Stack Overflow solution. This means the plugin functionality is no longer self-contained, and some hosts may not allow these types of Apache rewrites (as I understand it)
- Forces (or rather allows) Googlebot to crawl hundreds of thousands of noindexed pages. I say "force" because of the crawl budget required. The crawler could get stuck/lost in so many pages, and may not like crawling a site with 1,000,000,000 pages, 99.9% of which are noindexed.
- Cannot be used in conjunction with robots.txt. After all, the crawler never reads the noindex meta tag if it is blocked by robots.txt
Hash (#) URL advantages:
- By using hash (#) URLs for links from Vehicle Listings pages to Vehicle Details pages (such as "Contact Seller" buttons), coupled with JavaScript, the crawler won't be able to follow/crawl these links
- Best of both worlds: crawl budget isn't overtaxed by thousands of noindexed pages, and the internal links that got robots.txt-disallowed pages indexed are gone
- Accomplishes the same thing as nofollowing these links, but without looking like PageRank sculpting (?)
- Does not require complex Apache stuff
Hash (#) URL disadvantages:
- Is Google suspicious of sites with (some) internal links structured like this, since they can't crawl/follow them?
Initially, we implemented robots.txt, the "sledgehammer solution." We figured that we'd have a happier crawler this way, as it wouldn't have to crawl zillions of partially duplicate Vehicle Details pages, and we wanted it to be like these pages didn't even exist. However, Google seems to be indexing many of these pages anyway, probably based on internal links pointing to them. We could nofollow the links pointing to these pages, but we don't want it to look like we're PageRank sculpting or something like that.
If we implement noindex on these pages (and doing so is a difficult task in itself), then we will be certain these pages aren't indexed. However, to do so we will have to remove the robots.txt disallow in order to let the crawler read the noindex tag on these pages. Intuitively, it doesn't make sense to me to make Googlebot crawl zillions of Vehicle Details pages, all of which are noindexed; it could easily get stuck/lost, it seems like a waste of resources, and in some shadowy way it feels bad for SEO.
My developers are pushing for the third solution: using the hash URLs. This works on all hosts and keeps all functionality in the plugin self-contained (unlike noindex), and conserves crawl budget while keeping Vehicle Details pages out of the index (unlike robots.txt). But I don't want Google to slap us 6-12 months from now because it doesn't like links like these ().
Any thoughts or advice you guys have would be hugely appreciated, as I've been going in circles, circles, circles on this for a couple of days now. Also, I can provide a test site URL if you'd like to see the functionality in action.
Intermediate & Advanced SEO | browndoginteractive
-
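The X-Robots-Tag idea discussed in the question above could be sketched roughly like this in Apache 2.4+ configuration with mod_headers enabled (the querystring parameter name is made up for illustration; the real Ajax parameter would differ):

```apache
# Hypothetical .htaccess sketch: attach a noindex header to any
# response whose querystring carries the Ajax vehicle-details
# parameter, so the page itself needs no <head> changes.
<If "%{QUERY_STRING} =~ /vehicle_id=/">
    Header always set X-Robots-Tag "noindex"
</If>
```

As the question notes, this only works if the pages are not also blocked in robots.txt, since a blocked page's headers are never fetched.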
Which index page should I canonical to?
Hello! I'm doing a routine clean-up of my code and had a question about the canonical tag. I have never put any thought into which index path is the best one to use:
http://www.example.com
http://www.example.com/
http://www.example.com/index.php
Could someone shed some light on this for me? Does it make a difference? Thanks! Ryan
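A common convention for questions like this is to canonicalize to the trailing-slash root, since browsers and servers treat the first two forms as the same request anyway. A minimal sketch (example.com is the question's own placeholder domain):

```html
<!-- In the <head> of the homepage, whichever path serves it: -->
<link rel="canonical" href="http://www.example.com/" />
```

A frequent companion step is a server-side 301 redirect from /index.php to /, so only one URL is ever linked or indexed.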
Intermediate & Advanced SEO | Ryan_Phillips
-
Removed Duplicate Domains, What Should I Expect?
Hi All, So I have been at my current company for 5 months now. I quickly realized that they previously bought multiple domains. The domains do make sense (they are mostly our products, etc.). However, they did not just redirect to our main website; instead, each was a direct copy of it. They had it set up so that when we made a change to our main website, www.mainwebsite.com, the same exact change went to www.productwebsite.com. Basically, we had about 7 of the SAME EXACT websites on different root domains. So I explained to them the problem with having duplicate content on the web and how we were basically self-cannibalizing our online efforts. This problem is fixed now, and I am just wondering if anybody has seen the results of this before? To tell you the truth, we already do pretty well SEO-wise; I am just wondering if this will make it even better? I am assuming that it will also take a little while to take effect? Thanks! Pat
Intermediate & Advanced SEO | PatBausemer
-
Directory vs. Article Directory
Which got hit harder in the Penguin update? I was looking at SEER Interactive's backlink profile (the SEO company that didn't rank for its main keyword phrases) and noticed a pretty big trend that might explain why it didn't rank for its domain name: SEER appeared in a majority of the anchor text, much of it coming from directories. I'm guessing they were affected because they matched the exact-match-domain link-profile rule. I'm not an expert programmer, but if I were playing "Google programmer" I would think the algorithm update went something like: if ((exact match domain) && (certain % anchor text == domain) && (certain % of anchor text == partial domain + services/company)) { tank the rankings }. So back to the question: do you think this update had a lot to do with directories, article directories, or neither? Are article directories still a legitimate way to get links? (not Ezine)
Intermediate & Advanced SEO | imageworks-261290
-
Is there a way to stop my product pages with the "show all" category/attribute from duplicating content?
If there were fewer pages with the "show all" attribute, it would be a simple fix: just add the canonical URL tag. But seeing that there are about 1,000 of them, I was wondering whether there is a broader fix I could apply.
Intermediate & Advanced SEO | cscoville