Magento Dublicate Content (Noindex and Rel"canonical")
-
Hi All,
Just looking for some advice regarding my website on magento.
We by mistake didnt enable canonical tags and noindex tags so had a big problem with dublicate content from filter pages but also have URLs to Cats as Yes so this didnt help with not having canonical tags enabled.
We now have everything enabled for a few weeks now but dont see much drop in indexed pages in google. (currently 27k and we have only 5k products)
My question basically is how do we speed up noindexation of dublicate content and also would you change URL to cats as No so google just now sees the url to products? (my concerns with this is would leaving it to Yes help because it will hopefully read the canonical tags on products now)
Thank you in advance
Michael
-
Hi Carson
Thank you for replying and the indepth answers.
I did read somewhere that dublicate content on your own website isnt too bad but im glad you have helped me clear things up.
So would you change cat urls to no or leave them to yes for now till google can see all the canoical tags on products?
Thanks
Mike
-
I think there's an underlying assumption here that duplicate content will harm your site, and that's not necessarily true. There's no "duplicate content penalty" - it's more than a filter. Google is better than most at recognizing this, especially with common CMS like Magento and WP. Google attempts to look at the links going to both pages and understand their authority together.
Duplicate content is more of an issue if you're pulling content that others are using as well, e.g. on product descriptions provided by manufacturers and other types of content. Google won't "penalize" you, but they will sometimes filter your site out in favor of the most authoritative site with that content. It's also an issue (mostly for Panda) if you're creating keyword pages that contain duplicate of even very-similar content just to rank for a bunch of very similar keywords.
So my first bit of advice is, "don't obsess over intra-site duplicate content."
That said, it's best to reduce and avoid duplicate content 1) for less-sophisticated search engine, 2) for the sake of your own analytics data integrity and simplicity, 3) just in case Google doesn't get it (very rare).
Set the categories up however you think is best for the user (generally just the product name without categories), double-check the canonical URLs, and wait for Google to catch up on the canonical and noindex. It can take many months depending on your site's authority, but it's unlikely to move the needle either way. Keep in mind that Google may keep pages in the index even if they are honoring the canonical tag - they'll just show the canonical version but keep both indexed. That's working as intended - don't worry about that
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Is there an percentage of duplicate content required before you should use a canonical tag?
Is there a percentage (approximate or exact) of duplicate content you should have before you use a canonical tag? Similarly how does Google handle canonical tags if the pages aren’t 100% duplicate? I've added some background and an example below; Nike Trainer model 1 – has an overview page that also links to a sub-page about cushioning, one about Gore-Tex and one about breathability. Nike Trainer model 2,3,4,5 – have an overview page that also links to sub-pages page about cushioning , Gore-Tex and breathability. In each of the sub-pages the URL is a child of the parent so a distinct page from each other e.g. /nike-trainer/model-1/gore-tex /nike-trainer/model-2/gore-tex. There is some differences in material composition, some different images and of course the product name is referred multiple times. This makes the page in the region of 80% unique.
Technical SEO | | punchseo0 -
"Equity sculpting" with internal nofollow links
I’ve been trying a couple of new site auditor services this week and they have both flagged the fact that I have some nofollow links to internal pages. I see this subject has popped up from time to time in this community. I also found a 2013 Matt Cutts video on the subject: https://searchenginewatch.com/sew/news/2298312/matt-cutts-you-dont-have-to-nofollow-internal-links At a couple of SEO conferences I’ve attended this year, I was advised that nofollow on internal links can be useful so as not to squander link juice on secondary (but necessary) pages. I suspect many websites have a lot of internal links in their footers and are sharing the love with pages which don’t really need to be boosted. These pages can still be indexed but not given a helping hand to rank by strong pages. This “equity sculpting” (I made that up) seems to make sense to me, but am I missing something? Examples of these secondary pages include login pages, site maps (human readable), policies – arguably even the general contact page. Thoughts? Regards,
Technical SEO | | Warren_Vick
Warren1 -
Does my "spam" site affect my other sites on the same IP?
I have a link directory called Liberty Resource Directory. It's the main site on my dedicated IP, all my other sites are Addon domains on top of it. While exploring the new MOZ spam ranking I saw that LRD (Liberty Resource Directory) has a spam score of 9/17 and that Google penalizes 71% of sites with a similar score. Fair enough, thin content, bunch of follow links (there's over 2,000 links by now), no problem. That site isn't for Google, it's for me. Question, does that site (and linking to my own sites on it) negatively affect my other sites on the same IP? If so, by how much? Does a simple noindex fix that potential issues? Bonus: How does one go about going through hundreds of pages with thousands of links, built with raw, plain text HTML to change things to nofollow? =/
Technical SEO | | eglove0 -
Rel Canonical Crawl Notices
Hello, Within the Moz report from the crawl of my site, it shows that I had 89 Rel Canonical notices. I noticed that all the pages on my site have a rel canonical tag back to the same page the tag is on. Specific example from my site is as follows: http://www.automation-intl.com/resistance-welding-equipment has a Rel Canonical tag <link rel="<a class="attribute-value">canonical</a>" href="http://www.automation-intl.com/resistance-welding-equipment" />. Is this self reference harmless and if so why does it create a notice in the crawl? Thanks in advance.
Technical SEO | | TopFloor0 -
Rel = prev next AND canonical?
I have product category pages that correctly have the prev next but the moz crawl is giving me duplicate content errors. I would not think I also need to have canonical - but do I ?
Technical SEO | | JohnBerger0 -
How unique does a page need to be to avoid "duplicate content" issues?
We sell products that can be very similar to one another. Product Example: Power Drill A and Power Drill A1 With these two hypothetical products, the only real difference from the two pages would be a slight change in the URL and a slight modification in the H1/Title tag. Are these 2 slight modifications significant enough to avoid a "duplicate content" flagging? Please advise, and thanks in advance!
Technical SEO | | WhiteCap0 -
TLD - ".com.br" X ".com" which to use?
Hello I'm starting an SEO work on a site that has the domain "www.dominiodocliente.com" and "www.dominiodocliente.com.br." The problem is that the domain name. ".com" already has a low rank for keywords chosen as the domain "Com.br" has no rank. On the other hand, the domain ". Com" has 224 results in google as the domain "Com.br" has 1970 results. My question is: Which domain should I focus on SEO work? Tks
Technical SEO | | eder.machado0 -
What to do about "blocked by meta-robots"?
The crawl report tells me "Notices are interesting facts about your pages we found while crawling". One of these interesting facts is that my blog archives are "blocked by meta robots". Articles are not blocked, just the archives. What is a "meta" robot? I think its just normal (since the article need only be crawled once) but want a second opinion. Should I care about this?
Technical SEO | | GPN0