Drupal infinite URL depth? SEOMOZ treating as duplicate content
-
I'm monitoring a subdirectory of my site on SEOMOZ but with catastrophic results. It's finding infinite duplicate content e.g.www.example.co.uk/product/samples/product/product/productand so on...
The website is running on Drupal. Do you have any ideas on how I can solve this?
-
I'm having this same issue with a new drupal site. Does anyone know the underlying cause and how to fix it.
Would any relative path cause this?
Thanks.
-
Can you list the modules you're running? What e-Commerce module are you running?
-
I'm not a Drupal expert, but it sounds like you may have some kind of relative path that's getting perpetuated. Robots.txt could help as a patch, but I'd definitely want to solve the crawl problem, as this could spin out into other problems.
Have you tried a desktop crawler, like Xenu or Screaming Frog? Sorry, it's tough to diagnose without seeing the actual site, but it's almost got to be a relative path that's causing "/product" to keep being added to links.
-
Yes, anything deeper would also be blocked.
-
Thanks Scott, this is really helpful.
Out of interest, would disallowing '/product/samples/product' automatically stop the bots from indexing all the pages underneath this, too such as '/product/samples/product/product/product/'?
-
Try adding something like this to your robots.txt file:
User-agent: rogerbot
Disallow: /product/samples/product/
Disallow: /product/samples2/product1/
Disallow: /product/samples3/product4/etc...
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to avoid duplicate content
Hi, I have a website which is ranking on page 1: www.oldname.com/landing-page But because of legal reason i had to change the name.
Technical SEO | | mikehenze
So i moved the landing page to a different domain.
And 301'ed this landing page to the new domain (and removed all products). www.newname.com/landing-page All the meta data, titles, products are still the same. www.oldname.com/landing-page is still on the same position
And www.newname.com/landing-page was on page 1 for 1 day and is now on page 4. What did i do wrong and how can I fix this?
Maybe remove www.oldname.com/landing-page from Google with Google Webmaster Central or not allow crawling of this page with .htaccess ?0 -
WordPress Duplicate Content Caused By Categories
Hello, We have a wordpress blog that has around 250 categories. Due to our platform we have a hierarchy structure for 3 separate stores. For example iPhone > Apps > Books. Placing a blog post in the books category automatically places it into iPhone and iPhone/Apps category, causing 3 instances of any blog post in this category. Is this an issue? I have seen 2 schools of thought on categories, 1 index follow and 2 noindex follow. I know some of our categories get indexed, but with so many, maybe it is better to noindex them. We also considered reducing our categories to 10 to 12 and use tags to provide the indexed site navigation as follows: Reviews (category) iPhone Book App, iPhone App Store (tags) but this seems a little redundant? Anyone want to take this on? thank you Mike
Technical SEO | | crazymikesapps10 -
Would Google Call These Pages Duplicate Content?
Our Web store, http://www.audiobooksonline.com/index.html, has struggled with duplicate content issues for some time. One aspect of duplicate content is a page like this: http://www.audiobooksonline.com/out-of-publication-audio-books-book-audiobook-audiobooks.html. When an audio book title goes out-of-publication we keep the page at our store and display a http://www.audiobooksonline.com/out-of-publication-audio-books-book-audiobook-audiobooks.html whenever a visitor attempts to visit a specific title that is OOP. There are several thousand OOP pages. Would Google consider these OOP pages duplicate content?
Technical SEO | | lbohen0 -
How to solve Parameter Issue causing Duplicate Content
Hi everyone, My site home page comes up in SERP with following url www.sitename/?referer=indiagrid My question is:- Should I disallow using robots.txt.? or 301 redirect to the home page Other issue is i have few dynamic generated URL's for a form http://www.www.sitename/career-form.php?position=SEO Executive I am using parameter "position" in URL Parameter in GWT. But still my pages are indexed that is leading to duplicate page content. Please help me out.
Technical SEO | | himanshu3019890 -
SEOMOZ and non-duplicate duplicate content
Hi all, Looking through the lovely SEOMOZ report, by far its biggest complaint is that of perceived duplicate content. Its hard to avoid given the nature of eCommerce sites that oestensibly list products in a consistent framework. Most advice about duplicate content is about canonicalisation, but thats not really relevant when you have two different products being perceived as the same. Thing is, I might have ignored it but google ignores about 40% of our site map for I suspect the same reason. Basically I dont want us to appear "Spammy". Actually we do go to a lot of time to photograph and put a little flavour text for each product (in progress). I guess my question is, that given over 700 products, why 300ish of them would be considered duplicates and the remaning not? Here is a URL and one of its "duplicates" according to the SEOMOZ report: http://www.1010direct.com/DGV-DD1165-970-53/details.aspx
Technical SEO | | fretts
http://www.1010direct.com/TDV-019-GOLD-50/details.aspx Thanks for any help people0 -
Duplicate Content Vs No Content
Hello! A question that has been throw around a lot at our company has been "Is duplicate content better than no content?". We operate a range of online flash game sites, most of which pull their games from a feed, which includes the game description. We have unique content written on the home page of the website, but aside from that, the game descriptions are the only text content on the website. We have been hit by both Panda and Penguin, and are in the process of trying to recover from both. In this effort we are trying to decide whether to remove or keep the game descriptions. I figured the best way to settle the issue would be to ask here. I understand the best solution would be to replace the descriptions with unique content, however, that is a massive task when you've got thousands of games. So if you have to choose between duplicate or no content, which is better for SEO? Thanks!
Technical SEO | | Ryan_Phillips0 -
How can something be duplicate content of itself?
Just got the new crawl report, and I have a recurring issue that comes back around every month or so, which is that a bunch of pages are reported as duplicate content for themselves. Literally the same URL: http://awesomewidgetworld.com/promotions.shtml is reporting that http://awesomewidgetworld.com/promotions.shtml is both a duplicate title, and duplicate content. Well, I would hope so! It's the same URL! Is this a crawl error? Is it a site error? Has anyone seen this before? Do I need to give more information? P.S. awesomewidgetworld is not the actual site name.
Technical SEO | | BetAmerica0 -
Duplicate content issue
Hi everyone, I have an issue determining what type of duplicate content I have. www.example.com/index.php?mact=Calendar,m57663,default,1&m57663return_id=116&m57663detailpage=&m57663year=2011&m57663month=6&m57663day=19&m57663display=list&m57663return_link=1&m57663detail=1&m57663lang=en_GB&m57663returnid=116&page=116 Since I am not an coding expert, to me it looks like it is a URL parameter duplicate content. Is it? At the same time "return_id" would makes me think it is a session id duplicate content. I am confused about how to determine different types of duplicate content, even by reading articles on Seomoz about it: http://www.seomoz.org/learn-seo/duplicate-content. Could someone help me on how to recognize different types of duplicate content? Thank you!
Technical SEO | | Ideas-Money-Art0