Duplicate Page content | What to do?
-
Hello Guys,
I have some duplicate pages detected by Moz. Most of the URLs come from the user registration process, so they all look like this:
www.exemple.com/user/login?destination=node/125%23comment-form
What should I do? Add this to robots.txt? If so, how? What's the command to add in Google Webmaster Tools?
Thanks in advance!
Pedro Pereira
-
Hi Carly,
It needs to be done to each of the pages. In most cases, this is just a minor change to a single page template. Someone might tell you that you can add an entry to robots.txt to solve the problem, but that won't remove them from the index.
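As a rough illustration of the "single page template" change, here is a minimal Python sketch. It assumes the member pages share a URL prefix such as /me/members/ (taken from the example URLs posted in this thread); the function name and the prefix tuple are hypothetical and not tied to any particular CMS.

```python
# Hypothetical list of path prefixes whose pages should be kept out of the index.
NOINDEX_PREFIXES = ("/me/members/",)


def head_tags(path: str) -> str:
    """Return the extra <head> markup the shared template should emit for a path."""
    tags = []
    if path.startswith(NOINDEX_PREFIXES):
        # One rule in the template covers every page under the prefix.
        tags.append('<meta name="robots" content="noindex">')
    return "\n".join(tags)


if __name__ == "__main__":
    print(head_tags("/me/members/8003"))
```

Because the rule lives in the template, it applies to every matching page without touching them one by one.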
Looking at the links you provided, I'm not convinced you should deindex them all, as these are member profile pages which might have some value in terms of driving organic traffic and having unique content on them. That said, I'm not party to how your site works, so this is just an observation.
Hope that helps,
George
-
Hi George,
I am having a similar issue with my site, and was looking for a quick clarification.
We have thousands of "member" pages that were created as part of registration, and they are appearing as duplicate content. When you say add noindex and a canonical, is this something that needs to be done on every individual page, or is there something that can be applied to the thousands of pages at once?
Here are a couple of examples of what the pages look like:
http://loyalty360.org/me/members/8003
http://loyalty360.org/me/members/4641
Thank you!
-
1. If you add just noindex, Google will crawl the page, drop it from the index but it will also crawl the links on that page and potentially index them too. It basically passes equity to links on the page.
2. If you add nofollow, noindex, Google will crawl the page, drop it from the index but it will not crawl the links on that page. So no equity will be passed to them. As already established, Google may still put these links in the index, but it will display the standard "blocked" message for the page description.
If the links are internal, there's no harm in them being followed unless you're opening up the crawl to expose tons of duplicate content that isn't canonicalised.
noindex is often used with nofollow, but sometimes this is simply due to a misunderstanding of what impact they each have.
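To check which of the two directives a page actually carries, a small Python sketch like the following can help. It only reads a meta robots tag from an HTML string using the standard library; the class and function names are made up for illustration, and it does not look at HTTP headers or robots.txt.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex, nofollow".
            self.directives |= {d.strip().lower() for d in content.split(",") if d.strip()}


def robots_directives(html: str) -> set:
    """Return the set of meta robots directives present in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = '<head><meta name="robots" content="noindex, nofollow"></head>'
    found = robots_directives(page)
    print("noindex" in found, "nofollow" in found)
```

A page that should be dropped from the index but still have its links followed would show noindex without nofollow.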
George
-
Hello,
Thanks for your response. I have learned more, which is great.
My question is: should I add only noindex to that page, or noindex, nofollow?
Thanks!
-
Yes, it's the worst possible scenario: they basically get trapped in the SERPs. Google won't crawl them again until you allow crawling. You then need to set noindex (to remove them from the SERPs), and after that switch to nofollow, noindex to keep them out of the SERPs and to stop Google following any links on them.
Configuring URL parameters is, again, just a directive regarding the crawl and, to the best of my knowledge, doesn't affect indexing status.
In my experience, noindex is bulletproof, but nofollow and robots.txt are very often misunderstood and can lead to a lot of problems as a result. Some SEOs think they can be clever in crafting the flow of PageRank through a site. The unsurprising reality is that Google just does what it wants.
George
-
Hi George,
Thanks for this, it's very interesting... the URLs do appear in search results but their descriptions are blocked(!)
Did you try configuring URL parameters in WMT as a solution?
-
Hi Rafal,
The key part of that statement is "we might still find and index information about disallowed URLs...". If you read the next sentence it says: "As a result, the URL address and, potentially, other publicly available information such as anchor text in links to the site can still appear in Google search results".
If you look at moz.com/robots.txt you'll see an entry for:
Disallow: /pages/search_results*
But if you search this on Google:
site:moz.com/pages/search_results
You'll find there are 20 results in the index.
I used to agree with you, until I found out the hard way that if Google finds a link, regardless of whether the URL is blocked in robots.txt or not, it can put it in the index, and it will remain there until you remove the nofollow restriction and noindex it, or remove it from the index using Webmaster Tools.
George
-
George,
I went to check with Google to make sure I am correct and I am!
"While Google won't crawl or index the content blocked by robots.txt, we might still find and index information about disallowed URLs from other places on the web." Source: https://support.google.com/webmasters/answer/6062608?hl=en
Yes, he can fix these problems on page, but disallowing it in robots.txt will work fine too!
-
Just adding this to robots.txt will not stop the pages being indexed:
Disallow: /*login?
It just means Google won't crawl those URLs, and so won't follow the links on them.
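To see which URLs a wildcard rule like the one above would cover, here is a simplified Python sketch of Google-style pattern matching ("*" matches any run of characters, a trailing "$" anchors the end, otherwise the rule is a prefix match). It is an approximation written for illustration, not a full robots.txt parser, and the function name is hypothetical.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a Disallow pattern covers the given path plus query string."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and let each "*" stand for any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, which gives prefix-match behaviour.
    return re.match(regex, path) is not None


if __name__ == "__main__":
    print(rule_matches("/*login?", "/user/login?destination=node/125%23comment-form"))
    print(rule_matches("/*login?", "/user/login"))
```

Note that a match only tells you the URL is blocked from crawling; it says nothing about whether the URL is in the index.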
I would do one of the following:
1. Add noindex to the page. PR will still be passed to the page but they will no longer appear in SERPs.
2. Add a canonical on the page to: "www.exemple.com/user/login"
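For option 2, a minimal Python sketch of deriving the canonical URL by dropping the query string might look like this. It assumes the pages are served over http:// (the URL in the question has no scheme) and that the query string never changes the page content; both function names are hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Strip the query string and fragment so all variants share one canonical URL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> tag to place in the page's <head>."""
    return '<link rel="canonical" href="{}">'.format(escape(canonical_url(url), quote=True))


if __name__ == "__main__":
    # Assumed scheme: the original example URL was written without one.
    print(canonical_link_tag(
        "http://www.exemple.com/user/login?destination=node/125%23comment-form"
    ))
```

Every login URL with a different destination parameter would then point at the same canonical page.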
You're never going to try and get these pages to rank, so although it's worth fixing I wouldn't lose too much sleep on the impact of having duplicate content on registration pages (unless there are hundreds of them!).
Regards,
George
-
In GWT: Crawl => URL Parameters => Configure URL Parameters => Add Parameter
Make sure you know what you are doing as it's easy to mess up and have BIG issues.
-
Add this line to your robots.txt to prevent Google from indexing these pages:
Disallow: /*login?