Duplicate content: using the robots meta tag in conjunction with the canonical tag?
-
We have a WordPress instance on an Apache subdomain (let's say it's blog.website.com) alongside our main website, which is built in Angular. The tech team is using Akamai to do URL rewrites so that the blog posts appear under the main domain (website.com/more-keywords/here). However, due to the way they configured the WordPress install, they can't do a wildcard redirect under htaccess to force all the subdomain URLs to appear as subdirectories, so as you might have guessed, we're dealing with duplicate content issues.
They could in theory do manual 301s for each blog post, but that's laborious and a real hassle given our IT structure (we're a financial services firm, so lots of bureaucracy and regulation). In addition, due to internal limitations (they seem mostly political in nature), a robots.txt file is out of the question.
I'm thinking the next best alternative is the combined use of the robots meta tag (no index, follow) alongside the canonical tag to try to point the bot to the subdirectory URLs. I don't think this would be unethical use of either feature, but I'm trying to figure out if the two would conflict in some way? Or maybe there's a better approach with which we're unfamiliar or that we haven't considered?
-
Hey there,
Try to avoid using both canonical and noindex, it is not advised (source: https://www.seroundtable.com/noindex-canonical-google-18274.html).
If you are using canonical on the subdomain it should be more than enough to deindex the subdomain version (if in any way it gets indexed) and resolve the duplicate content issue.
Greetings, Keszi
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Canonical Tags for Legacy Duplicate Content
I've got a lot of duplicate pages, especially products, and some are new but most have been like this for a long time; up to several years. Does it makes sense to use a canonical tag pointing to one master page for each product. Each page is slightly different with a different feature and includes maybe a sentence or two that is unique but everything else is the same.
Technical SEO | | AmberHanson0 -
ViewState and Duplicate Content
Our site keeps getting duplicated content flagged as an issue... however, the pages being grouped together have very little in common on-page. One area which does seem to recur across them is the ViewState. There's a minimum of 150 lines across the ones we've investigated. Could this be causing the reports?
Technical SEO | | RobLev0 -
How different does content need to be to avoid a duplicate content penalty?
I'm implementing landing pages that are optimized for specific keywords. Some of them are substantially the same as another page (perhaps 10-15 words different). Are the landing pages likely to be identified by search engines as duplicate content? How different do two pages need to be to avoid the duplicate penalty?
Technical SEO | | WayneBlankenbeckler0 -
Is using a customer quote on multiple pages duplicate content?
Is there any risk with placing the same customer quote (3-4 sentences) on multiple pages on your site?
Technical SEO | | Charlessipe0 -
Tags causing Duplicate page content?
I was looking through the 'Duplicate Page Content' and Too Many On-Page Link' errors and they all seem to be linked to the 'Tags' on my blog pages. Is this really a problem and if so how should I be using tags properly to get the best SEO rewards?
Technical SEO | | zapprabbit1 -
Duplicate Meta Description in GWMT
We've just discovered that there are multiple duplicate URLs indexed for a site that we're working on. It seems that when new versions of the site was developed in the last couple of years, there were new page names and URL structures that were used. All of these seem to be showing up as Duplicate Meta Descriptions in Google's WMT, which is not surprising as they are basically the same page with the same content that are just sitting on different page names/URLs. This is an example of the situation, where URL 5 is the current version. Note: all the others are still live and resolve, although they are not linked to from the current site. URL 1: www.example.com/blue-tshirts.html (Version 1 - January 2010) URL 2: www.example.com/blue-t-shirts.html (Version 2 - July 2010) URL 3: www.example.com/blue_t_shirts.html (Version 3 - November 2010) URL 4: www.example.com/buy/blue_tshirts.html (Version 4 - January 2011) URL 5: www.example.com/buy/bluetshirts.html (Version 5 - April 2011) Presumably, this is a clear case of duplicate content. QUESTION: In order to solve it, shall we 301 all of the previous URLs to the current one - ie. Redirect URLs 1-4 to URL 5? Or, should some of them be NoIndexed? To complicate matters, there is Pagination on most of them. For example: URL 1: www.example.com/blue-tshirts.html (Version 1 - January 2010) URL 1a: www.example.com/page-1/blue-tshirts.html URL 1b: www.example.com/page-2/blue-tshirts.html URL 1c: www.example.com/page-3/blue-tshirts.html URL 4: www.example.com/buy/blue_tshirts.html URL 4a: www.example.com/buy/page-1/blue_tshirts.html URL 4b: www.example.com/buy/page-2/blue_tshirts.html URL 4c: www.example.com/buy/page-3/blue_tshirts.html URL 5: www.example.com/buy/bluetshirts.html URL 5a: www.example.com/buy/page-1/bluetshirts.html URL 5b: www.example.com/buy/page-2/bluetshirts.html URL 5c: www.example.com/buy/page-3/bluetshirts.html Since URL 5 is the current site, we are going to 'NoIndex, Follow' URLs 5a, 5b and 5c, which is what we understand to be the correct thing to do for paginated pages. QUESTION: What shall we do with URLs 1a, 1b and 1c? Should we apply the same "No Index, Follow" OR should they be 301'd to their respective counterparts in 5a, 5b and 5c? QUESTION: In the same way, since URL 4 is the version just before the current live Version 5, does it make a different on whether the paginated pages (ie 4a, 4b and 4c) should be No Indexed or 301'd? Thanks in advance for all responses and suggestions, it's greatly appreciated.
Technical SEO | | orangechew0 -
Duplicate Content
Hello All, my first web crawl has come back with a duplicate content warning for www.simodal.com and www.simodal.com/index.htm slightly mystified! thanks paul
Technical SEO | | simodal0 -
Duplicate content question with PDF
Hi, I manage a property listing website which was recently revamped, but which has some on-site optimization weaknesses and issues. For each property listing like http://www.selectcaribbean.com/property/147.html there is an equivalent PDF version spidered by google. The page looks like this http://www.selectcaribbean.com/pdf1.php?pid=147 my question is: Can this create a duplicate content penalty? If yes, should I ban these pages from being spidered by google in the robots.txt or should I make these link nofollow?
Technical SEO | | multilang0