Strange duplicate content issue
-
Hi there,
SEOmoz crawler has identified a set of duplicate content that we are struggling to resolve.
For example, the crawler picked up that this page www. creative - choices.co.uk/industry-insight/article/Advice-for-a-freelance-career is a duplicate of this page www. creative - choices.co.uk/develop-your-career/article/Advice-for-a-freelance-career.
The latter page's content is the original and can be found in the CMS admin area whilst the former page is the duplicate and has no entry in the CMS. So we don't know where to begin if the "duplicate" page doesn't exist in the CMS.
The crawler states that this page www. creative-choices.co.uk/industry-insight/inside/creative-writing is the referrer page. Looking at it, only the original page's link is showing on the referrer page, so how did the crawler get to the duplicate page?
-
it could be any one out of the following 3 scenarios.
1: The page in question was moved at some point and since the CMS still accepts the old URL, when google re-visits the old URL it still finds it. So in this scenario it will find both the old URL and the new URL and index both.
2: google hasn't revisited the page for a long while but it is still in it's index, even though it would get a 301 by the CMS when it visits the page. Can be easily fixed by going to webmaster tools and ask it to remove it from the index.
3: there are still links to the old URL either on site or off site and since the CMS doesn't 301 the oid page it will index it again with a new URL.
4:the page still exists in the CMS because of some strange setting or equivalent in the CMS.
as mentioned before the easy fix is to use a robots.txt and deny access to the page and ask google to remove it from it's index. the better fix is to find the problem in the CMS and solve it. a midway fix could be to 301 it in the .htaccess or equvilent on an ISS server.
hope it helped
-
Thanks René,
I updated my earlier reply with a question that i think you missed.
The list isn't growing, which is a good thing but how is it possible for the crawler to pick up the duplicate page urls when the the referrer page has the correct urls?
-
I have come across this sort of issue a gazillion times + infinite.. almost all of our clients seem to have dub cont problems of one kind or another
but often it is different things that is the problem. But I'm afraid that I can't point you in the right direction, since I have no experience with your CMS. To be able to do that I would need to have access to the site itself. (since I don't know the CMS.) My advice would be to get a developer on the issue or to grab hold of the support for the CMS (if any.)
-
Hi René,
Thanks for your reply and suggestions. It could well be CMS remembering old urls as this list isn't growing. But is the crawler able to pickup the old urls when the referrer page has the correct urls?
We are on Expression Engine. Have you come across this sort of issue before?
-
Well it kinda have to be in the CMS, since it has 2 different paths.. But you could fix it by going to the .htaccess (if you have access and redirect it to the right URL and make a robots.txt and disallow access to the page.
if the page has been moved to a new location theres a good chance that the CMS is setup to remember the old URL and show the page. This is indeed a problem, but a potential problem with the CMS.
Go to webmaster tools and ask them to delete the dublicate from thier index.
You specific problem could originate from a ton of different problems and it is kinda har to fix without direct access to everything. What CMS is it your using?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Content Issues: Duplicate Content
Hi there
Technical SEO | | Kingagogomarketing
Moz flagged the following content issues, the page has duplicate content and missing canonical tags.
What is the best solution to do? Industrial Flooring » IRL Group Ltd
https://irlgroup.co.uk/industrial-flooring/ Industrial Flooring » IRL Group Ltd
https://irlgroup.co.uk/index.php/industrial-flooring Industrial Flooring » IRL Group Ltd
https://irlgroup.co.uk/index.php/industrial-flooring/0 -
Joomla: content accesible through all kinds of other links >> duplicate content?!
When i did a site: search on Google i've noticed all kind of URL's on my site were indexed, while i didn't add them to the Joomla navigation (or they were not linked anywhere on the site). Some examples: www.domain.com/1-articlename >> that way ALL articles are publicly visible, even if they are not linked to a menu-item... If by accident such a link get's shared it will be indexed in google, you can have 2 links with same content... www.domain.com/2-uncategorised >> same with categories, automatically these overview pages are visible to people who know this URL. On it you see all the articles that belong to that category. www.domain.com/component/content >> this gives an overview of all the categories inside your Joomla CMS I think most will agree this is not good for your site's SEO? But how can this be solved? Is this some kind of setting within Joomla? Anyone who dealt with these problems already?
Technical SEO | | conversal0 -
Duplicate Content for Multiple Instances of the Same Product?
Hi again! We're set to launch a new inventory-based site for a chain of car dealers with various locations across the midwest. Here's our issue: The different branches have overlap in the products that they sell, and each branch is adamant that their inventory comes up uniquely in site search. We don't want the site to get penalized for duplicate content; however, we don't want to implement a link rel=canonical because each product should carry the same weight in search. We've talked about having a basic URL for these product descriptions, and each instance of the inventory would be canonicalized to this main product, but it doesn't really make sense for the site structure to do this. Do you have any tips on how to ensure that these products (same description, new product from manufacturer) won't be penalized as duplicate content?
Technical SEO | | newwhy0 -
Duplicate content
I have two page, where the second makes a duplicate content from the first Example:www.mysite.com/mypagewww.mysite.com/mysecondpageIf i insert still making duplicate content?Best regards,Wendel
Technical SEO | | peopleinteractive0 -
Problem with duplicate content
Hi, My problem is this: SEOmoz tells me I have duplicate content because it is picking up my index page in three different ways: http://www.web-writer-articles.co.uk http://www.web-writer-articles.co.uk/ and http://www.web-writer-articles.co.uk/index.php Can someone give me some advice as to how I can deal with this issue? thank you for your time, louandel15
Technical SEO | | louandel150 -
API for testing duplicate content
Does anyone know a service or API or php lib to compare two (or more) pages and to return their similiarity (Level-3-Shingles). API would be greatly prefered.
Technical SEO | | Sebes0 -
Duplicate Page Content and Titles
A few weeks ago my error count went up for Duplicate Page Content and Titles. 4 errors in all. A week later the errors were gone... But now they are back. I made changes to the Webconfig over a month ago but nothing since. SEOmoz is telling me the duplicate content is this http://www.antiquebanknotes.com/ and http://www.antiquebanknotes.com Thanks for any advise! This is the relevant web.config. <rewrite><rules><rule name="CanonicalHostNameRule1"><match url="(.*)"><conditions><add input="{HTTP_HOST}" pattern="^www.antiquebanknotes.com$" negate="true"></add></conditions>
Technical SEO | | Banknotes
<action type="Redirect" url="<a href=" http:="" www.antiquebanknotes.com="" {r:1"="">http://www.antiquebanknotes.com/{R:1}" />
</action></match></rule>
<rule name="Default Page" enabled="true" stopprocessing="true"><match url="^default.aspx$"><conditions logicalgrouping="MatchAll"><add input="{REQUEST_METHOD}" pattern="GET"></add></conditions>
<action type="Redirect" url="/"></action></match></rule></rules></rewrite>0 -
How can i resolve Duplicate Page Content?
Hello, I have created one campaign over SEOmoz tools for my website AutoDreams.it i have found 159 duplicate page content. My problem is that this web site is about car adsso it is easy to create pages with duplicate content and also Car ads are placed byregistered users. How can i resolve this problem? Regards Francesco
Technical SEO | | francesco870