Duplicate Content
-
The crawl shows a lot of duplicate content on my site. Most of the URLs it's showing are categories and tags (WordPress).
So what does this mean exactly? Are the categories too much like other categories? And what is the best way to go about fixing this?
Thanks
-
Greg
Thanks so much for helping out! If you don't mind, I'm just going to correct a few finer details so people don't get confused.
"Essentially the tags display the exact content as the original URL so the pages are identical but the URL is different."
It's totally true that this happens, but this is not what causes the duplicate content error in the crawl report. The errors are usually from sub-pages of any given tag archive having the same title tag.
"Remove the tags"
By this I'm sure you just mean noindex tags. You don't need to remove them from the site altogether, just remove them from the index.
"If you want the Tags and Categories for user experience, Install Yoast SEO plugin which allows you to insert a canonical URL on the duplicate category pages."
You should leave categories indexed and noindex tags. Yoast does canonicals no matter what. You don't need to think about them, and they are not what handles duplicate category pages.
Everything else stated is more or less OK, but I just don't want people to be confused.
Thanks again!
-Dan
-
Justin
Sorry to hear of your trouble with making the new settings. For one, my guide on SEOmoz about setting up WordPress for SEO should be helpful. I'd recommend familiarizing yourself with that.
In these cases, the "duplicate content" is usually not the page itself but rather just the title tags.
To see why, imagine you have tag archives like this:
- mydomain.com/tag/pink-elephants/
- mydomain.com/tag/pink-elephants/page/2/
- mydomain.com/tag/pink-elephants/page/3/
Usually the title tags respectively end up being the same:
- Pink Elephants | My Domain
- Pink Elephants | My Domain <-- title tag for page 2
- Pink Elephants | My Domain <-- title tag for page 3
The same thing happens for every single tag "subpage".
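To make that concrete, here is a rough Python sketch of how repeated title tags can be spotted in a list of crawled pages. This is only an illustration, not how the crawl report itself works: the `find_duplicate_titles` helper, the hard-coded page list, and the extra category URL are all made up for the example.

```python
from collections import defaultdict


def find_duplicate_titles(pages):
    """Group URLs by title tag and keep only titles used by more than one URL."""
    urls_by_title = defaultdict(list)
    for url, title in pages:
        # Normalise whitespace and case so trivially different titles still match.
        urls_by_title[title.strip().lower()].append(url)
    return {title: urls for title, urls in urls_by_title.items() if len(urls) > 1}


# Hypothetical crawl data: (URL, title tag) pairs.
pages = [
    ("mydomain.com/tag/pink-elephants/", "Pink Elephants | My Domain"),
    ("mydomain.com/tag/pink-elephants/page/2/", "Pink Elephants | My Domain"),
    ("mydomain.com/tag/pink-elephants/page/3/", "Pink Elephants | My Domain"),
    ("mydomain.com/category/animals/", "Animals | My Domain"),  # assumed example
]

duplicates = find_duplicate_titles(pages)
for title, urls in duplicates.items():
    print(f"{title!r} is shared by {len(urls)} URLs")
```

All three tag archive URLs fall into one group because they share a title, while the category page with its own title is not flagged.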
Normally, the protocol would be to:
- Noindex subpages
- Noindex tags
- Noindex dated archives
- Disable author archives (single author blog only)
- Index categories
You can still link to tag pages and use tags within the site all you want, but you just don't want to index them.
These are just default settings. It's impossible to know exactly what you should be doing without seeing your site, but I hope all of that gets you in the right direction!
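As a rough illustration of those defaults, the logic boils down to something like the Python sketch below. It is not Yoast's actual code: the `robots_meta` function and the archive type names are hypothetical, and pairing noindex with "follow" is an assumption based on still wanting links to tag pages to work. Disabling author archives is a separate setting and is not covered here.

```python
# Archive types that stay out of the index by default (names are hypothetical).
NOINDEX_ARCHIVES = {"tag", "date"}


def robots_meta(archive_type, page_number=1):
    """Return the robots meta tag for an archive page under the default settings."""
    # Subpages (page 2, 3, ...) of any archive are noindexed.
    if page_number > 1 or archive_type in NOINDEX_ARCHIVES:
        # Assumption: links are still followed, the page is just not indexed.
        return '<meta name="robots" content="noindex, follow">'
    # Categories (first page) stay indexed.
    return '<meta name="robots" content="index, follow">'


print(robots_meta("category"))
print(robots_meta("category", page_number=2))
print(robots_meta("tag"))
```

In practice the plugin settings handle this for you. The sketch only shows which pages end up with which directive.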
-Dan
-
You should only no-follow your tags and archives and not your categories...
In the plugin settings, under permalinks, there is an option "Strip the category base (usually /category/) from the category URL." This will just stop the duplicate pages from appearing. Blocking the categories must have caused the drop.
Greg
-
Changed to Yoast. I ticked no follow on archives, categories, and tags. One hour later, website went from #7 to page four.
-
Well, the duplicate content alone is causing issues. Google does not like duplicate pages at all.
If you select which are your primary pages and tell Google to ignore the rest, it can only help your ranking.
With the Yoast SEO plugin, all you need to do is set tags to no-follow and no-index, and also strip the category base from the URL (it redirects automatically as well).
Greg
-
Thanks for the reply. Would this affect ranking, or can it be left alone?
-
WordPress does this when you use tags.
Essentially the tags display the exact content as the original URL so the pages are identical but the URL is different.
Two options that I can think of:
1.) Remove the tags and strip the category segment in the URL and stop using them in future. This will require redirects from the duplicate URLs to the main article (this will take planning and a lot of time, and is quite complicated).
2.) If you want the Tags and Categories for user experience, Install Yoast SEO plugin which allows you to insert a canonical URL on the duplicate category pages. This tells Google where the original page can be found. Tags are only there for user experience, so you can set these to no-follow and no-index.
Greg