Duplicate Content Issue

hfranz

Very strange issue I noticed today. In my SEOMoz Campaigns I noticed thousands of Warnings and Errors!

I noticed that any page on my website ending in .php can be duplicated by adding anything you want to the end of the url, which seems to be causing these issues.

Ex:

Normal URL - www.example.com/testing.php

Duplicate URL - www.example.com/testing.php/helloworld

The duplicate URL displays the page without the images, but all the text and information is present, duplicating the Normal page.

I Also found that many of my PDFs seemed to be getting duplicated burried in directories after directories, which I never ever put in place.

Ex: www.example.com/catalog/pdfs/testing.pdf/pdfs/another.pdf/pdfs/more.pdfs/pdfs/ ...

when the pdfs are only located in a pdfs directory!

I am very confused on how to fix this problem. Maybe with some sort of redirect?

Cyrus-Shepard

Hi Hfranz,

I took a look at your campaign and was unable to duplicate the errors, which leads me to believe it was a blip in the crawling. I'm speculating here, but my suspicion is that there was a miscommunication between your web server and the SEOmoz crawler, the end result being the crawler got sent on a wrong crawl path, and your server kept delivering the bad URLs.

Hopefully, things will return to normal after the next crawl. If not, feel free to contact the help team at help@seomoz.org.

You can also double check with Google Webmaster Tools, to see if there has been an increase in crawl errors or html suggestions.

Finally, your web server should be configured to deliver a 404 for these URLs that don't exist (www.example.com/testing.php/helloworld) Unfortunately, I can't tell you exactly how to do this, and you may need to find a developer with expertise in this area - but my guess is very little, or no, damage has been done.

jim_cetin

to me it seems that there is more an issue with the php script.

is helloworld the testing.php?

the pdf issue could be a referencing issue within your script, even if the pdf's are loaded into the correct folder. check the upload function, especially where you give the doc the URI!

SEODinosaur

Read this

http://www.seomoz.org/blog/canonical-url-tag-the-most-important-advancement-in-seo-practices-since-sitemaps

All you really need is a canonical code.

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Duplicate Content Issue

Got a burning SEO question?

Browse Questions

Explore more categories

Related Questions

Canonical Tags for Legacy Duplicate Content

How to handle one section of duplicate content

Best Way to Handle Near-Duplicate Content?

Duplicate content and rel canonicals?

Avoiding Cannibalism and Duplication with content

Duplicate Content

Worpress Tags Duplicate Content

Duplicate content and http and https