Duplicate Content Issue
-
Very strange issue I noticed today. In my SEOMoz Campaigns I noticed thousands of Warnings and Errors!
I noticed that any page on my website ending in .php can be duplicated by adding anything you want to the end of the url, which seems to be causing these issues.
Ex:
Normal URL - www.example.com/testing.php
Duplicate URL - www.example.com/testing.php/helloworld
The duplicate URL displays the page without the images, but all the text and information is present, duplicating the Normal page.
I also found that many of my PDFs seem to be duplicated, buried in directory after directory, none of which I ever created.
Ex: www.example.com/catalog/pdfs/testing.pdf/pdfs/another.pdf/pdfs/more.pdfs/pdfs/ ...
The PDFs are actually located only in a single pdfs directory!
I am very confused about how to fix this problem. Maybe with some sort of redirect?
-
Hi Hfranz,
I took a look at your campaign and was unable to duplicate the errors, which leads me to believe it was a blip in the crawling. I'm speculating here, but my suspicion is that there was a miscommunication between your web server and the SEOmoz crawler, the end result being the crawler got sent on a wrong crawl path, and your server kept delivering the bad URLs.
Hopefully, things will return to normal after the next crawl. If not, feel free to contact the help team at help@seomoz.org.
You can also double-check with Google Webmaster Tools to see if there has been an increase in crawl errors or HTML suggestions.
Finally, your web server should be configured to deliver a 404 for URLs that don't exist, such as www.example.com/testing.php/helloworld. Unfortunately, I can't tell you exactly how to do this, and you may need to find a developer with expertise in this area, but my guess is that very little damage, if any, has been done.
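To make the intended behaviour concrete, here is a minimal Python sketch of the check involved: anything that follows the .php script name is treated as extra path info, and such a request should get a 404 instead of the page. The function names are hypothetical and this only illustrates the logic; it is not a server configuration. If the site happens to run on Apache (an assumption, the thread doesn't say), setting the core AcceptPathInfo directive to Off is the usual way to get this result.

```python
from urllib.parse import urlsplit


def split_path_info(url, extension=".php"):
    """Split a URL path into (script_path, extra).

    `extra` is whatever trails the script name, e.g. "/helloworld".
    Hypothetical helper for illustration only.
    """
    path = urlsplit(url).path
    idx = path.find(extension + "/")
    if idx == -1:
        # Nothing follows the script name.
        return path, ""
    end = idx + len(extension)
    return path[:end], path[end:]


def expected_status(url):
    """Status the server should ideally return for this URL."""
    _, extra = split_path_info(url)
    return 404 if extra else 200


print(expected_status("http://www.example.com/testing.php"))             # 200
print(expected_status("http://www.example.com/testing.php/helloworld"))  # 404
```

Running a list of crawled URLs through a check like this would show how many of the reported duplicates follow the trailing-path pattern.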
-
To me it seems that the issue lies more with the PHP script.
- is helloworld the testing.php?
The PDF issue could be a referencing issue within your script, even if the PDFs are uploaded to the correct folder. Check the upload function, especially where you assign the document its URI!
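One plausible mechanism, assuming the pages use relative links (the thread doesn't confirm this), is that a crawler resolves each relative link against whatever URL it is currently on. The short Python sketch below uses the standard library's urljoin to show how that would produce both the missing images and the ever-deeper PDF paths. The link targets "images/logo.png" and "pdfs/another.pdf" are hypothetical examples.

```python
from urllib.parse import urljoin


def resolve(page_url, link):
    """Resolve a relative link the way a browser or crawler would."""
    return urljoin(page_url, link)


# On the duplicate URL, a relative image path points somewhere that
# does not exist, which would explain the page rendering without images.
print(resolve("http://www.example.com/testing.php/helloworld", "images/logo.png"))
# http://www.example.com/testing.php/images/logo.png

# If a PDF URL is requested with a trailing slash, a relative link on
# the response gets appended below it, nesting one level deeper each time.
print(resolve("http://www.example.com/catalog/pdfs/testing.pdf/", "pdfs/another.pdf"))
# http://www.example.com/catalog/pdfs/testing.pdf/pdfs/another.pdf
```

If this is what is happening, switching to root-relative links (starting with "/") or absolute links would stop the paths from compounding.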
-
Read this
All you really need is a canonical tag (rel="canonical") pointing at the normal URL.
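As a rough illustration of that idea, the Python sketch below builds the link element that a page could emit in its head so that every trailing-path variant points back to the normal URL. The function name is hypothetical, and dropping the query string is an assumption made for simplicity; a real site may need to keep some parameters.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link(url, extension=".php"):
    """Return a rel=canonical element for the URL without trailing path info.

    Hypothetical helper: strips anything after the script name and
    (by assumption) discards the query string and fragment.
    """
    parts = urlsplit(url)
    path = parts.path
    idx = path.find(extension + "/")
    if idx != -1:
        path = path[: idx + len(extension)]
    clean = urlunsplit((parts.scheme, parts.netloc, path, "", ""))
    return '<link rel="canonical" href="%s" />' % escape(clean, quote=True)


print(canonical_link("http://www.example.com/testing.php/helloworld"))
# <link rel="canonical" href="http://www.example.com/testing.php" />
```

The same element would be emitted on the normal URL, so both versions declare the same preferred address.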