How to remove Duplicate content due to url parameters from SEOMoz Crawl Diagnostics
-
Hello all
I'm currently getting back over 8000 crawl errors for duplicate content pages . Its a joomla site with virtuemart and 95% of the errors are for parameters in the url that the customer can use to filter products.
Google is handling them fine under webmaster tools parameters but its pretty hard to find the other duplicate content issues in SEOMoz with all of these in the way.
All of the problem parameters start with
?product_type_
Should i try and use the robot.txt to stop them from being crawled and if so what would be the best way to include them in the robot.txt
Any help greatly appreciated.
-
Hi Tom
It took a while but I got there in the end. I was using joomla 1.5 and I downloaded a component called "tag meta" which allows you to insert tags including the canonical tag on specific urls or more importantly urls which begin in a certain way. Now how you use it depends on how your sef urls are set up or what sef component you are using but you can put a canonical tag on every url in a section that has view-all-products in it.
So in one of my examples I put a canonical tag pointing to /maternity-tops.html (my main category page for that section) on every url that began with /maternity-tops/view-all-products
I hope this if of help to you. It takes a bit of playing around with but it worked for me. The component also has fairly good documentation.
Regards
Damien
-
Damien,
Are you able to explain how you were able to do this within virtuemart?
Thanks
Tom
-
So leave the 5 pages of dresses as they are because they are all original but have the canonical tag on all of the filter parameters pointing to Page 1 of dresses.
Thank you for your help Alan
-
It should be on all versions of the page, all pointing to the one version.
Search engines will then see all as one page
-
Hi Alan
Thanks for getting back to me so fast. I'm slightly confused on this so an example might help One of the pages is http://www.funkybumpmaternity.com/Maternity-Dresses.html.
There are 5 pages of dresses with options on the left allowing you to narrow that down by color, brand, occasion and style. Every time you select an option on combination of options on the left for example red it will generate a page with only red dresses and a url of http://www.funkybumpmaternity.com/Maternity-Dresses/View-all-products.html?product_type_1_Colour[0]=Red&product_type_1_Colour_comp=find_in_set_any&product_type_id=1
The options available are huge which I believe is why i'm getting so many duplicate content content issues on SEOMoz pro. Google is handling the parameters fine.
How should I implement the canonical tag? Should I have a tag on all filter pages referencing page 1 of the dresses? Should pages 2-5 have the tag on them? If so would this mean that the dresses on these pages would not be indexed?
-
This sounds more like a case for a canonical tag,
dont exculed with robots.txt this is akin to cutting off your arm, because you have a spliter in your finger.
When you exclude use robots, link juce passing though links to these pages is lost.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate Content
Hello, I'm managing a site which shows as having duplicate page issues (in the crawl analyser) for 3 pages. Basically the site is offering 3 different options of the same product so depending on which size you select, you are directed to the relevant page. These 3 pages are basically identical apart from a slight difference in copy regarding the size (small, medium, large) Is this likely to be a big issue regarding SEO, and what would the moz community suggest re this? Thank you!
Moz Pro | | wearehappymedia0 -
Crawl Diagnostics saids a page is linking but I can't find the link on the page.
Hi I have just got my first Crawl Diagnostics report and I have a questions. It saids that this page: http://goo.gl/8py9wj links to http://goo.gl/Uc7qKq which is a 404. I can't recognize the URL on the page which is a 404 and when searching in the code I can't find the %7Blink%7D in the URL which gives the problems. I hope you can help me to understand what triggers it 🙂
Moz Pro | | SebastianThode0 -
Does SEOmoz have a Keyword Research tool similar to, say, the Google AdWords tool or the WebCEO Keyword Research Tool? And where might that be? (Sorry, I'm very new to SEOmoz Pro.)
I'm looking for an SEOmoz version of the classic WebCEO Keyword Research that would give you effective suggestions based on a keyword inquiry. I've made the switch from WebCEO, but I'm trying to find something similar to that Keyword Research tool. Am I going to just need to use the Google AdWords tool for this function or does SEOmoz have it's own version?
Moz Pro | | SmokewagonKen0 -
Crawl Stats Have Dissapeared
Hi SEOmoz I received an email today that another scan has been performed but when I log into my account all the tracking details have disappeared? States Pages crawled N/A. Can someone please help? Temporary problem? Website www.vintageheirloom.com Thanks
Moz Pro | | well-its-1-louder0 -
Where does the crawler find the urls?
The SEO Moz crawler has found a number of 500 error pages, and 404s etc which is very useful 🙂 however some of the urls are weird/broken formats we don't recognise and nobody remembers ever using - not weird enough to imply hacking, but something broken in the CMS Is there anyway to find out where the crawler found these urls? I can patch up and redirect the end result as best I can but I would prefer to fix plug the leak thanks 🙂
Moz Pro | | Fammy1 -
I have a Rel Canonical "notice" in my Crawl Diagnostics report. I'm presuming that means that the spider has detected a rel canonical tag and it is working as opposed to warning about an issue, is this correct?
I know this seems like a really dumb question but the site I'm working on is a BigCommerce one and I've been concerned about canonicalisation issues prior to receiving this report (I'm a SEOmoz pro newbie also!) and I just want to be clear I am reading this notice correctly. I presume this means that the site crawl has detected the rel canonical tag on these pages and it is working correctly. Is this correct?? Any input is much appreciated. Thanks
Moz Pro | | seanpearse0 -
How long does the seomoz crawl take?
It's been doing it's thing for over 48 hours now and Ive got less than 350 pages... is this norma? It's NOT the first crawl.
Moz Pro | | borderbound0 -
We were unable to grade that page. We received a response code of 301\. URL content not parseable
I am using seomoz webapp tool for my SEO on my site. I have run into this issue. Please see the attached file as it has the screen scrape of the error. I am running an on page scan from seomoz for the following url: http://www.racquetsource.com/squash-racquets-s/95.htm When I run the scan I receive the following error: We were unable to grade that page. We received a response code of 301. URL content not parseable. This page had worked previously. I have tried to verify my 301 redirects and am unable to resolve this error. I can perform other on page scans and they work fine. Is this a known problem with this tool? I have verified ensuring I don't have it defined. Any help would be appreciated.
Moz Pro | | GeoffBatterham0