Duplicate Content Issue
-
Very strange issue I noticed today. In my SEOMoz Campaigns I noticed thousands of Warnings and Errors!
I noticed that any page on my website ending in .php can be duplicated by adding anything you want to the end of the url, which seems to be causing these issues.
Ex:
Normal URL - www.example.com/testing.php
Duplicate URL - www.example.com/testing.php/helloworld
The duplicate URL displays the page without the images, but all the text and information is present, duplicating the Normal page.
I also found that many of my PDFs seem to be duplicated, buried in directory after directory that I never created.
Ex: www.example.com/catalog/pdfs/testing.pdf/pdfs/another.pdf/pdfs/more.pdfs/pdfs/ ...
when the pdfs are only located in a pdfs directory!
I am very confused about how to fix this problem. Maybe with some sort of redirect?
-
Hi Hfranz,
I took a look at your campaign and was unable to duplicate the errors, which leads me to believe it was a blip in the crawling. I'm speculating here, but my suspicion is that there was a miscommunication between your web server and the SEOmoz crawler, the end result being the crawler got sent on a wrong crawl path, and your server kept delivering the bad URLs.
Hopefully, things will return to normal after the next crawl. If not, feel free to contact the help team at help@seomoz.org.
You can also double-check with Google Webmaster Tools to see if there has been an increase in crawl errors or HTML suggestions.
Finally, your web server should be configured to return a 404 for URLs that don't exist, such as www.example.com/testing.php/helloworld. Unfortunately, I can't tell you exactly how to do this, and you may need to find a developer with expertise in this area, but my guess is that very little damage, if any, has been done.
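Before changing the server, it can help to see how many crawled URLs actually carry extra path segments after a script name. Here is a minimal Python sketch for that. The function name `has_trailing_path_info` and the sample list are made up for illustration, and it assumes the affected scripts all end in .php. If the server happens to be Apache (the thread does not say), the `AcceptPathInfo Off` directive is the setting that makes such requests return a 404.

```python
from urllib.parse import urlsplit


def has_trailing_path_info(url: str, script_ext: str = ".php") -> bool:
    """Return True if the URL path continues past a script file name.

    Hypothetical helper: it only looks for "<ext>/" inside the path,
    so the query string is ignored.
    """
    # urlsplit only recognises the host when the URL contains "//",
    # so add it for bare "www.example.com/..." style URLs.
    if "//" not in url:
        url = "//" + url
    path = urlsplit(url).path.lower()
    return (script_ext + "/") in path


# Sample URLs taken from the question; a real list would come from the crawl report.
crawl_report = [
    "www.example.com/testing.php",
    "www.example.com/testing.php/helloworld",
]
suspect = [u for u in crawl_report if has_trailing_path_info(u)]
print(suspect)
```

Every URL this flags is one the server should answer with a 404 (or redirect) instead of serving the page again.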
-
To me it seems this is more an issue with the PHP script.
- Is helloworld the testing.php?
The PDF issue could be a referencing issue within your script, even if the PDFs are loaded into the correct folder. Check the upload function, especially where you give the document its URI!
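One plausible mechanism for both symptoms, offered as a guess since the thread does not confirm it: the pages use relative links, and a crawler resolves them against whatever URL it is currently on. The Python sketch below uses the standard `urljoin` to show this. The image path `images/logo.png` and the relative href `pdfs/another.pdf` are assumptions made up for the example.

```python
from urllib.parse import urljoin

normal = "http://www.example.com/testing.php"
duplicate = "http://www.example.com/testing.php/helloworld"
image = "images/logo.png"  # hypothetical relative image path

# From the normal URL the image resolves to the real location.
normal_image = urljoin(normal, image)
# From the duplicate URL it resolves underneath testing.php instead.
duplicate_image = urljoin(duplicate, image)

# A relative PDF link followed from a URL with a trailing slash
# nests one level deeper instead of staying in /catalog/pdfs/.
nested_pdf = urljoin(
    "http://www.example.com/catalog/pdfs/testing.pdf/",
    "pdfs/another.pdf",  # hypothetical relative href
)

print(normal_image)
print(duplicate_image)
print(nested_pdf)
```

If this is what is happening, switching the links to root-relative paths (starting with "/") or absolute URLs would stop the nesting, because they resolve the same way from any page.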
-
Read this
All you really need is a canonical tag (rel="canonical") on each page.
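As a rough illustration of the canonical approach, here is a Python sketch that strips everything after the script name and builds the tag. The helper names `canonical_url` and `canonical_link_tag` are hypothetical, and it assumes the canonical version of a page is simply the URL up to and including ".php". On the real site the tag would be emitted by the PHP templates; this only shows the logic.

```python
from html import escape


def canonical_url(url: str, script_ext: str = ".php") -> str:
    """Cut the URL off right after the script file name.

    Hypothetical helper: if nothing follows the script name, the URL
    (including any query string) is returned unchanged.
    """
    idx = url.lower().find(script_ext + "/")
    if idx == -1:
        return url
    # Everything after the extension is dropped, including any query string.
    return url[: idx + len(script_ext)]


def canonical_link_tag(url: str) -> str:
    """Build the <link> element for the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'


print(canonical_link_tag("http://www.example.com/testing.php/helloworld"))
```

With this in place, the duplicate URL and the normal URL both point search engines at the same canonical address, although the server would still serve the duplicate page to visitors.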
Related Questions
-
Duplicate Page Content Issue
Hello, I recently solved a www / no-www duplicate issue for my website, but now I am in trouble with duplicate content again. This time something is happening that I cannot understand: in the Crawl Issues Report, I received Duplicate Page Content for http://yourappliancerepairla.com (DA 19) and http://yourappliancerepairla.com/index.html (DA 1). Could you please help me figure out what is happening here? By default, index.html is being loaded, but this is the only index.html I have in the folder. And it looks like the crawler sees two different pages with different DA... What should I do to handle this issue?
Technical SEO | | kirupa0 -
Duplicate content through product variants
Hi, Before you shout at me for not searching - I did and there are indeed lots of threads and articles on this problem. I therefore realise that this problem is not exactly new or unique. The situation: I am dealing with a website that has 1 to N (n being between 1 and 6 so far) variants of a product. There are no dropdown for variants. This is not technically possible short of a complete redesign which is not on the table right now. The product variants are also not linked to each other but share about 99% of content (obvious problem here). In the "search all" they show up individually. Each product-variant is a different page, unconnected in backend as well as frontend. The system is quite limited in what can be added and entered - I may have some opportunity to influence on smaller things such as enabling canonicals. In my opinion, the optimal choice would be to retain one page for each product, the base variant, and then add dropdowns to select extras/other variants. As that is not possible, I feel that the best solution is to canonicalise all versions to one version (either base variant or best-selling product?) and to offer customers a list at each product giving him a direct path to the other variants of the product. I'd be thankful for opinions, advice or showing completely new approaches I have not even thought of! Kind Regards, Nico
Technical SEO | | netzkern_AG0 -
174 Duplicate Content Errors
How do I go about fixing these errors? They are all related to my tags. Thank you in advance for any help! Lisa
Technical SEO | | lisarein0 -
Duplicate content /index.php/ issues
I'm having some duplicate content issues with Google. I've already got my .htaccess file working just fine as far as I can tell. Rewriting works great, and by using the site you'd never end up on a page with /index.php. However I do notice that on ANY page of the site you could add /index.php and get the same page i.e.: www.mysite.com/category/article and www.mysite.com/index.php/category/article Would both return the same page. How can I 301 or something similar all /index.php pages to the non index.php version? I have no desire for any page on my site to have index.php in it, there is no use to it. Having quite the hard time figuring this out. Again this is basically just for the robots, the URL's the users see are perfect, never had an issue with that. Just SEOMOZ reporting duplicate content and I've verified that to be true.
Technical SEO | | b18turboef1 -
Question about duplicate content in crawl reports
Okay, this one's a doozie: My crawl report is listing all of these as separate URLs with identical duplicate content issues, even though they are all the home page and the one that is http://www.ccisolutions.com (the preferred URL) has a canonical tag of rel= http://www.ccisolutions.com: http://www.ccisolutions.com http://ccisolutions.com http://www.ccisolutions.com/StoreFront/IAFDispatcher?iafAction=showMain I will add that OSE is recognizing that there is a 301-redirect on http://ccisolutions.com, but the duplicate content report doesn't seem to recognize the redirect. Also, every single one of our 404-error pages (we have set up a custom 404 page) is being identified as having duplicate content. The duplicate content on all of them is identical. Where do I even begin sorting this out? Any suggestions on how/why this is happening? Thanks!
Technical SEO | | danatanseo1 -
Duplicate Page Content Report
In the Crawl Diagnostics Summary, I have 2000 duplicate page content errors. When I click a link, my WordPress site returns "page not found", I see the page is not indexed by Google, and I could not find the issue in Google Webmaster. So where do these links come from?
Technical SEO | | smallwebsite0 -
Duplicate Homepage issue
SEOMOZ says my site has two homepages: www.mysite.com www.mysite.com/ When you go to "www.mysite.com/" the URL changes to "www.mysite.com" Why is this happening and what can I do about it?
Technical SEO | | LucasF0 -
Help removing duplicate content from the index?
Last week, after a significant drop in traffic, I noticed a subdomain in the index with duplicate content. The main site and subdomain can be found below. http://mobile17.com http://232315.mobile17.com/ I've 301'd everything on the subdomain to the appropriate location on the main site. Problem is, site: searches show me that if the subdomain content is being deindexed, it's happening really slowly. Traffic is still down about 50% in the last week or so... what's the best way to tackle this issue moving forward?
Technical SEO | | ccorlando0