150 Duplicate page error
-
I am told that I have 150 pages with duplicate page content. It seems to be caused by the login link on each of my pages. Is this an error? Is it something I have to change?
Thanks
Login/Register at
http://irishdancingdress.com/wp-login.php?redirect_to=http%3A%2F%2Firishdancingdress.com%2Fdress
-
This one's a bit weird - your main "Login" link is fine - this is happening down in the comments section (under "Leave a Reply") - that login link tags the source page, so that you can return to the post.
In this case, I think I'd actually nofollow that link, and it's probably fine to block the URL in robots.txt. This is where things get really situational, as normally I'd advise against that - see my recent post:
http://www.seomoz.org/blog/logic-meet-google-crawling-to-deindex
In your situation, though, Google only seems to be indexing 2 of those URLs currently, so you can probably cut this off before it becomes a problem. Our crawler is being a bit more aggressive in this situation (and, honestly, these links could pose a problem long-term).
If you had a ton of these pages indexed, I'd agree with Slava and recommend rel-canonical, because robots.txt is pretty ineffective for de-indexing (plus, nofollow causes the problem described in my post).
Sorry, I'm making this clear as mud. I think a nofollow and blocking are fine here, because basically the problem hasn't happened yet - you're trying to prevent future problems. You could also monitor for these URLs in Google's index for a few weeks, using this command:
site:irishdancingdress.com/wp-login.php
...if that number stays low (it's currently 2), then you're good to go.
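If you do go the robots.txt route, here is a rough way to sanity-check that a rule really covers the comment-section login URL before you rely on it. This is only a sketch using Python's standard-library parser: the rule text and the helper name `is_blocked` are my own illustration, not anything from the site. As far as I know, that parser does plain prefix matching and does not understand `*` wildcards in paths, so the sketch uses the prefix form `Disallow: /wp-login.php`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; adjust to match your real robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-login.php
"""


def is_blocked(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given robots.txt text disallows `url` for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


login_url = (
    "http://irishdancingdress.com/wp-login.php"
    "?redirect_to=http%3A%2F%2Firishdancingdress.com%2Fdress"
)
post_url = "http://irishdancingdress.com/dress"

print(is_blocked(ROBOTS_TXT, login_url))  # login link with redirect_to
print(is_blocked(ROBOTS_TXT, post_url))   # normal post URL
```

The first check should report the login URL as blocked and the second should leave the post itself crawlable. Real crawlers may interpret rules a little differently from this parser, so treat it as a quick check rather than proof.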
-
Keith, I think the only way to stop Roger and Google from indexing those pages is to put them in the robots.txt file.
I made some rules global, but Roger seemed to ignore those, so I gave him his own section.
Just modify these to suit your setup:
User-agent: *
Disallow: /tag/*
Disallow: /wp-login.php*

User-agent: rogerbot
Disallow: /tag/*
Disallow: /wp-login.php*
-
Rel canonical may not be what you need here.
The first question to ask yourself: does the login page need to be indexed by search engines? If the answer is no, block it in your robots.txt, then add rel="nofollow" to your login links.
If you do have a reason for your login page to be indexed, then you'll need to use the rel=canonical link tag to point to the base URL of the page, without the query string. Based on your URL, I would assume it is "http://irishdancingdress.com/wp-login.php"
Hope that helps
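To make the canonical option concrete, here is a minimal sketch of how the base URL could be derived from one of the redirect_to login links and turned into a canonical tag. The function name `canonical_link` is hypothetical, and this only builds the tag text; how you get it into the page head depends on your WordPress setup.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link(url: str) -> str:
    """Drop the query string and fragment, then build a rel=canonical tag."""
    parts = urlsplit(url)
    base = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return f'<link rel="canonical" href="{escape(base, quote=True)}">'


login_url = (
    "http://irishdancingdress.com/wp-login.php"
    "?redirect_to=http%3A%2F%2Firishdancingdress.com%2Fdress"
)
print(canonical_link(login_url))
# prints: <link rel="canonical" href="http://irishdancingdress.com/wp-login.php">
```

Every redirect_to variant of the login URL would end up pointing at the same canonical address, which is the point of the tag.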
-
Do you use the rel=canonical link tag? I think using it would solve your problem.