Duplicate pages, overly dynamic URLs and long URLs in Magento
-
Hi there,
I’ve just completed the first crawl of my Magento site and SEOmoz has picked up thousands of duplicate pages, overly dynamic URLs and long URLs. They come from the sort function, which appends variables to the URL when sorting products (e.g. www.example.com?dir=asc&order=duration).
I’m not particularly concerned that this will affect our rankings, as Google has stated that they are familiar with the structure of popular CMSs and Magento is pretty popular.
However, it completely dominates my crawl diagnostics, so I can’t see if there are any real underlying issues.
Does anyone know a way of preventing this?
Cheers,
Al.
-
You should use the Yoast Robots extension to fix almost all the duplicate content.
http://www.magentocommerce.com/magento-connect/yoast-metarobots.html
When using Magento Connect 2.0: http://connect20.magentocommerce.com/community/Yoast_MetaRobots
For Magento Connect 1.0, use: magento-community/Yoast_MetaRobots
Also enable canonical URLs. You can find these settings in the admin panel:
System - Configuration - Catalog - Canonical links for categories
System - Configuration - Catalog - Canonical links for products
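As a rough illustration of what a canonical link does for sorted listings, here is a minimal Python sketch. It is not Magento's own code (Magento is PHP and handles this through the settings above); the function names and the parameter list (`dir`, `order`, `limit`, taken from this thread) are assumptions for illustration only.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of sort/display parameters; adjust to whatever your store appends.
SORT_PARAMS = {"dir", "order", "limit"}


def canonical_url(url: str) -> str:
    """Return the URL with sort/display parameters removed."""
    parts = urlsplit(url)
    # Keep every query parameter that is not a sort/display parameter.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in SORT_PARAMS
    ]
    # Drop the fragment as well; it is irrelevant for canonicalisation.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> tag for the page's <head>."""
    href = html.escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'
```

With this sketch, every sorted variant of a category page points back to the same unsorted URL, which is the behaviour the canonical settings are meant to give you.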
-
I'm actually a fan of selectively (programmatically) NOINDEX'ing like that. I find that the GWT parameter blocking doesn't always scale well. I'm running into a lot of clients trying to use it on 100s or 1000s (or millions, actually) of pages and Google is mostly ignoring it. Very frustrating.
We're working on features to let you ignore certain warnings/notices if you feel they don't apply, but I do believe in being proactive about indexation issues. I think they matter a lot more than they used to, especially post-Panda.
I would double-check to see if there's a Magento plug-in to help, as this could be a common problem. Unfortunately, we don't have any Magento experts on staff. I'll leave this open as a discussion question, in case any members have specific expertise.
-
Is it worth trying to tackle this programmatically, e.g. if the URL includes dir=, limit= or order=, then include a noindex meta tag on that page?
It’s easy to exclude these parameters in Google Webmaster Tools, but again I’d really like to reduce the number of errors reported by SEOmoz, as currently I have 10,000 errors due to duplicate content!
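To make the idea concrete, here is a minimal sketch of that check in Python. It is only an illustration of the logic; on a real Magento store it would have to be written in PHP in the theme's head template or an extension. The function name is hypothetical, and using "noindex, follow" (rather than "noindex, nofollow") is my assumption.

```python
from urllib.parse import parse_qs, urlsplit

# Parameters named in the question above; extend as needed.
NOINDEX_PARAMS = {"dir", "limit", "order"}


def robots_meta_tag(url: str) -> str:
    """Return a robots meta tag: noindex when the URL carries a sort parameter."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if any(param in query for param in NOINDEX_PARAMS):
        # Assumption: keep "follow" so links on the sorted page are still crawled.
        return '<meta name="robots" content="noindex, follow" />'
    return '<meta name="robots" content="index, follow" />'
```

For the example URL from the first post, `robots_meta_tag("www.example.com?dir=asc&order=duration")` returns the noindex variant, while a plain category URL keeps the default.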
-
Hey Harald, Thanks for your response. I came across that article whilst googling the issue, but it doesn't specifically deal with the duplicate URLs being crawled and included in SEOmoz reports. As I say, I'm not too worried about any negative impact here, as I've implemented canonical URLs and I have a sitemap. However, it ruins my SEOmoz crawl diagnostic report by creating thousands of errors. Cheers, Al.
-
Hi Almenzies, As you mentioned, SEOmoz reports that thousands of your pages have duplicate content issues. The article linked below explains how to solve them:
Solving the Duplicate Content Issues in Magento.
I hope this resolves your query.