Duplicate content issue
-
Hi everyone,
I have an issue determining what type of duplicate content I have.
www.example.com/index.php?mact=Calendar,m57663,default,1&m57663return_id=116&m57663detailpage=&m57663year=2011&m57663month=6&m57663day=19&m57663display=list&m57663return_link=1&m57663detail=1&m57663lang=en_GB&m57663returnid=116&page=116
Since I am not a coding expert, it looks to me like URL parameter duplicate content. Is it?
At the same time, "return_id" makes me think it is session ID duplicate content. I am confused about how to tell different types of duplicate content apart, even after reading the SEOmoz articles about it: http://www.seomoz.org/learn-seo/duplicate-content.
Could someone help me recognize the different types of duplicate content?
Thank you!
-
Thank you guys for being so helpful!!:)
-
Hello Jeff, I would like to say first that lots of sites have duplicate content problems. For the most part, this is not a huge issue. When search engines find duplicate content, they choose one of the pages to list in the index and ignore the others. This assumes, of course, that the duplicate content is not so bad that the search engine would want to ban you. That can happen if a review of your situation leads them to believe that you are deliberately trying to rank multiple times for the same search terms.
Here is an article that explains how to fix duplicate content problems:
http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world
-
Let me try.
1. The answer to your first question is that it only matters if you're trying to figure out how to handle it programmatically. In that case you might have to ask the developer whether this is being done by a session ID. To me it looks more like a URL parameter, but without a live example I wouldn't know. Could you provide the website in question? If not, try visiting the website once, clear your cache, then visit again and see if the number after "return_id" changes. If it changes, that is a session ID. If it stays the same, have a friend visit the website in the same manner and compare the number they get with yours. If theirs is different, there's a good chance that this is a session ID.
Either way, "return_id" is technically a URL parameter, even if a session ID is what sets its value.
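The comparison described above can be done by eye, but here is a minimal Python sketch of it, assuming you have copied the full URL from two separate visits. The function name `differing_params` and the shortened sample URLs are made up for illustration; only the standard library is used.

```python
from urllib.parse import urlsplit, parse_qs


def differing_params(url_a: str, url_b: str) -> dict:
    """Return the query parameters whose values differ between two URLs."""
    # keep_blank_values=True keeps empty parameters such as "m57663detailpage="
    params_a = parse_qs(urlsplit(url_a).query, keep_blank_values=True)
    params_b = parse_qs(urlsplit(url_b).query, keep_blank_values=True)
    diffs = {}
    for name in sorted(params_a.keys() | params_b.keys()):
        value_a = params_a.get(name, [])
        value_b = params_b.get(name, [])
        if value_a != value_b:
            diffs[name] = (value_a, value_b)
    return diffs


# Hypothetical URLs copied from two separate visits to the same page
first_visit = "http://www.example.com/index.php?mact=Calendar,m57663,default,1&m57663return_id=116&page=116"
second_visit = "http://www.example.com/index.php?mact=Calendar,m57663,default,1&m57663return_id=116&page=116"

print(differing_params(first_visit, second_visit))  # prints {}
```

An empty result means nothing changed between the two visits, which points to ordinary URL parameters. A parameter that shows up in the result with a different value on every visit is the one to suspect as a session ID.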
2. Your second question is still a bit vague, so let me check that I understand it. Are you asking how to treat the duplicate content once you know what is causing it? If so, follow these rules.
If the content changes significantly when the session ID or parameter is present, then it is not duplicate content. If the content does not change, do the following:
- Make sure to use rel canonical pointing to the root URL. In your example that would be: www.example.com/index.php?mact=Calendar
- Set the URL parameters in Google's and Bing's webmaster tools so that the parameter is treated correctly.
- When the parameter or session ID is present, add the noindex, follow robots meta tag. This allows the bots to crawl through and pass on link juice in the event that someone links to your parameter versions.
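As a rough illustration of the canonical and noindex rules above, here is a Python sketch that builds the two head tags for a given URL. It assumes that `mact` is the only parameter that selects content and that only the module name before the first comma matters, which mirrors the example URL above but may not match how the site really works. `CONTENT_PARAMS`, `canonical_url` and `head_tags` are hypothetical names.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: only these parameters select distinct content; adjust per site.
CONTENT_PARAMS = {"mact"}


def canonical_url(url: str) -> str:
    """Drop every query parameter that is not listed in CONTENT_PARAMS."""
    parts = urlsplit(url)
    kept = []
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        if name in CONTENT_PARAMS:
            # Assumption: keep only the module name, as in "mact=Calendar"
            kept.append((name, value.split(",")[0]))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def head_tags(url: str) -> list:
    """Return the tags to place in <head> for the requested URL."""
    canonical = canonical_url(url)
    tags = [f'<link rel="canonical" href="{escape(canonical)}">']
    if url != canonical:
        # A parameter version of the page: keep it out of the index but let bots follow links
        tags.append('<meta name="robots" content="noindex, follow">')
    return tags


requested = "http://www.example.com/index.php?mact=Calendar,m57663,default,1&m57663return_id=116&page=116"
for tag in head_tags(requested):
    print(tag)
```

The site in question runs on PHP, so this is only meant to show the logic. The same two decisions (which parameters to keep, and whether the requested URL differs from the canonical one) would be made in the page template.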
I think you have a larger issue, which is that your website's code uses index.php to generate all of the pages; in the example, that is the calendar. This is a common mistake that programmers make, since they work to do things as quickly and efficiently as possible. It's far easier to keep all of the code in one file than to create several different dynamic files that work with each other.
If you don't have the ability to break this down and generate separate pages, you might be able to use URL rewrites to make browsers and bots think the URLs are actually different.
-
Thank you for your answers, but I guess I didn't formulate my question properly.
My 1st question was: What kind of duplicate content is it?
- session id
- or url parameter
My second question is: How do you differentiate them? What do you look at to tell whether duplicate content is caused by a session ID or by a URL parameter?
-
You can determine whether you have duplicate content in several ways. Search Google for site:example.com and see how many pages Google knows about on your website. Also, when you are on a page with this crazy URL, open the source code and see whether the page has a rel="canonical" tag. In your case that would be the best way to signal to robots that this is the same page as your index.php page.
Also, you can try Xenu, a good and fast program for checking your site for duplicates.
Hope it helps. You can share your website so we can take a look.
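If you would rather not read through the page source by hand, here is a small Python sketch of the rel="canonical" check, using only the standard library. `CanonicalFinder`, `find_canonical` and the sample HTML are made up for illustration; in practice you would paste in the source of your own page.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated values, so split before checking
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html_source: str):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html_source)
    return finder.canonical


# Hypothetical page source; replace with the source of the page you are checking
sample = """
<html><head>
<title>Calendar</title>
<link rel="canonical" href="http://www.example.com/index.php">
</head><body></body></html>
"""

print(find_canonical(sample))
```

A result of `None` means the page declares no canonical URL, so every parameter version of it is left for the search engine to sort out on its own.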
-
Hi Jeff,
index.php is the same as index.php?something=something&anotherthing=somethingelse
Each page should have a different URL, like index.php and page.php, instead of always using index.php.