Drupal infinite URL depth? SEOMOZ treating as duplicate content
-
I'm monitoring a subdirectory of my site with SEOmoz, and the results are catastrophic. The crawler is finding seemingly infinite duplicate content, e.g. www.example.co.uk/product/samples/product/product/product and so on...
The website is running on Drupal. Do you have any ideas on how I can solve this?
-
I'm having this same issue with a new Drupal site. Does anyone know the underlying cause and how to fix it?
Would any relative path cause this?
Thanks.
-
Can you list the modules you're running? What e-Commerce module are you running?
-
I'm not a Drupal expert, but it sounds like you may have some kind of relative path that's getting perpetuated. Robots.txt could help as a patch, but I'd definitely want to solve the crawl problem, as this could spin out into other problems.
Have you tried a desktop crawler, like Xenu or Screaming Frog? Sorry, it's tough to diagnose without seeing the actual site, but it's almost got to be a relative path that's causing "/product" to keep being added to links.
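To illustrate the relative-path theory, here is a minimal Python sketch of how a crawler resolves links. It assumes (hypothetically, since the actual markup isn't visible) that the page contains a link written as href="product/" with no leading slash, and that the site answers the deeper URLs with a page instead of a 404. The function name crawl_chain is made up for this example.

```python
from urllib.parse import urljoin


def crawl_chain(start_url, relative_href, steps):
    """Follow the same href repeatedly, resolving it the way a crawler would."""
    urls = [start_url]
    for _ in range(steps):
        # Each new URL is resolved against the page the link was found on.
        urls.append(urljoin(urls[-1], relative_href))
    return urls


# Hypothetical relative link: the path grows by one segment per hop.
chain = crawl_chain("http://www.example.co.uk/product/samples/", "product/", 3)
for url in chain:
    print(url)

# The same link written root-relative always resolves to one URL.
fixed = crawl_chain("http://www.example.co.uk/product/samples/", "/product/", 3)
print(fixed[-1])
```

If that is what's happening, changing the link to a root-relative or absolute URL should stop the path from growing, and returning a 404 for the bogus deeper paths would keep crawlers from following them.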
-
Yes, anything deeper would also be blocked, since a Disallow rule matches every URL path that starts with it.
-
Thanks Scott, this is really helpful.
Out of interest, would disallowing '/product/samples/product' automatically stop the bots from indexing all the pages underneath this too, such as '/product/samples/product/product/product/'?
-
Try adding something like this to your robots.txt file:
User-agent: rogerbot
Disallow: /product/samples/product/
Disallow: /product/samples2/product1/
Disallow: /product/samples3/product4/
etc...
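If you want to check the rules before relying on them, Python's standard urllib.robotparser can be used for a quick local test. This is only a sketch: it uses the first rule from above, the host www.example.co.uk from the original question, and a helper named is_allowed that is invented for this example. Python's parser treats Disallow as a simple path prefix, which is the basic behaviour described in this thread, but it may not match every crawler's handling exactly.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the suggestion above (first Disallow line only).
ROBOTS_TXT = """\
User-agent: rogerbot
Disallow: /product/samples/product/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="rogerbot"):
    """Return True if `agent` may fetch `path` on the example host."""
    return parser.can_fetch(agent, "http://www.example.co.uk" + path)


# Deeper paths share the disallowed prefix, so they are blocked too.
print(is_allowed("/product/samples/product/product/product/"))

# The parent directory itself does not start with the rule, so it stays crawlable.
print(is_allowed("/product/samples/"))

# The block only names rogerbot; other crawlers are unaffected by it.
print(is_allowed("/product/samples/product/product/", agent="googlebot"))
```

Note that the rule only applies to rogerbot, so this hides the problem from the Moz crawl report without changing what other search engines crawl.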