"Duplicate" Page Titles and Content
-
Hi All,
This is a rather lengthy one, so please bear with me!
SEOmoz has recently crawled 10,000 webpages from my site, FrenchEntree, and has returned 8,000 errors of duplicate page content. The main reason I have so many is the directories I have on the site.
The site is broken down into 2 levels of hierarchy: "Weblets" and "Articles". A weblet is a landing page, and articles are created within these weblets. Weblets can hold any number of articles, from 0 to 1,000,000 (in theory), and an article must be assigned to a weblet in order for it to work. Here's roughly how it looks in URL form: http://www.mysite.com/[weblet]/[articleID]/
Now, our directory results pages are weblets with standard content in the left and right hand columns, but the information in the middle column is pulled in from our directory database following a user query. This happens by adding the query string to the end of the URL. We have 3 main directory databases, but perhaps around 100 weblets promoting various 'canned' queries that users may want to navigate straight into. However, any one of the 100 directory-promoting weblets could return any query from the parent directory database, given the correct query string. The problem with this method (as pointed out by the 8,000 errors) is that each possible permutation of a search is considered to be its own URL, and therefore its own page.
The example I will use is the first alphabetically. "Activity Holidays in France":
http://www.frenchentree.com/activity-holidays-france/ - This link shows you a results weblet without the query at the end, and therefore only displays the left and right hand columns as populated.
http://www.frenchentree.com/activity-holidays-france/home.asp?CategoryFilter= - This link shows you the same weblet with an 'open' query on the end, i.e. display all results from this database. Listings are displayed in the middle.
There are around 500 different URL permutations for this weblet alone when you take into account the various categories and cities a user may want to search in.
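To get a feel for how those permutations pile up, here is a minimal Python sketch that groups a list of crawled URLs by weblet (the first path segment) and counts the distinct URLs under each. The function name `group_by_weblet` and the short `crawled` list are made up for illustration; the three URLs are the ones quoted in this thread, and a real crawl export would be much longer.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_weblet(urls):
    """Group URLs by (host, first path segment), keeping every distinct URL.

    Hypothetical helper: it treats the first path segment as the weblet name,
    which matches the /[weblet]/... layout described above.
    """
    groups = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        segments = [s for s in parts.path.split("/") if s]
        weblet = segments[0] if segments else ""
        groups[(parts.netloc, weblet)].add(url)
    return groups


# Illustrative sample only: URLs taken from this thread, not a real crawl export.
crawled = [
    "http://www.frenchentree.com/activity-holidays-france/",
    "http://www.frenchentree.com/activity-holidays-france/home.asp?CategoryFilter=",
    "http://www.frenchentree.com/activity-holidays-france/home.asp"
    "?order=Sort1&option=&CategoryFilter=&webname=activity-holidays-france&pagenumber=2",
]

groups = group_by_weblet(crawled)
for (host, weblet), members in sorted(groups.items()):
    print(f"{host}/{weblet}/: {len(members)} distinct URLs")
```

A crawler sees each of those strings as a separate page, even though they all hang off the same weblet, which is roughly how one weblet can account for hundreds of "duplicate" entries.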
What I'd like to do is to prevent SEOmoz (and therefore search engines) from counting each individual query permutation as a unique page, without harming the visibility that the directory results receive in SERPs. We often appear in the top 5 for quite competitive keywords and we'd like it to stay that way. I also wouldn't want some sort of robots exclusion or canonical classification to leave the search engines displaying (and therefore directing the user through to) only an empty weblet.
Does anyone have any advice on how best to remove the "duplication" problem, whilst keeping the search visibility? All advice welcome.
Thanks
Matt
-
Thanks for the swift response, Gianluca. I think I understand the problem you have pointed out, but I'm rather surprised that it has been set up in such a way... Or that it would have more of an adverse effect than multiple URLs with the same standard content. I'm willing to change that to see if it fixes the problem though.
Please take all of the time you need... It is a very large site which has been pieced together, bit-by-bit, over many years!
Matt
-
In addition to Gianluca's response there: on the pages that you tag with "noindex,follow" (i.e. the duplicates), add a canonical tag pointing at the original page.
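As a rough sketch of what that could look like in the page head, the Python below builds the two tags for a duplicate page. The helper name `duplicate_page_head_tags` is hypothetical, and treating the 'open' query URL as the "original" page is only an assumption for the example; which URL should count as the original is for the site owner to decide.

```python
from html import escape


def duplicate_page_head_tags(original_url):
    """Return the <head> tags for a duplicate page (hypothetical helper).

    The robots tag asks engines not to index the page but still follow its
    links; the canonical link points at the page chosen as the original.
    """
    robots = '<meta name="robots" content="noindex,follow">'
    canonical = (
        f'<link rel="canonical" href="{escape(original_url, quote=True)}">'
    )
    return [robots, canonical]


# Assumption for illustration: the 'open' query page is the original.
tags = duplicate_page_head_tags(
    "http://www.frenchentree.com/activity-holidays-france/home.asp?CategoryFilter="
)
for tag in tags:
    print(tag)
```

The `escape` call is there so that any `&` in a longer query string is written as `&amp;` inside the attribute.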
-
I think your duplicate content problem is also due to the pagination that your category (or non-category) search results have. Checking the second URL you gave, http://www.frenchentree.com/activity-holidays-france/home.asp?order=Sort1&option=&CategoryFilter=&webname=activity-holidays-france&webname=activity-holidays-france&pagenumber=1, and its "second" page, http://www.frenchentree.com/activity-holidays-france/home.asp?order=Sort1&option=&CategoryFilter=&webname=activity-holidays-france&pagenumber=2, I noticed that you have the meta robots in the head... therefore the bots see and index all of this paginated content, which is substantially a duplicate of page 1. I suggest you start adding the noindex,follow meta robots tag to these pages. As for the other duplication issues... give me time, as your site is not so easy.
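A minimal sketch of that pagination rule, in Python for illustration only (the site itself runs on ASP). It assumes the `pagenumber` query parameter seen in the URLs above is what marks a paginated page, and that only pages after the first should get the tag; the function name `robots_meta_for` is made up.

```python
from urllib.parse import parse_qs, urlsplit

NOINDEX_FOLLOW = '<meta name="robots" content="noindex,follow">'


def robots_meta_for(url):
    """Return the noindex,follow tag for paginated pages beyond page 1.

    Hypothetical helper. Assumes 'pagenumber' is the pagination parameter and
    that a missing or non-numeric value means the first page.
    """
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    page = query.get("pagenumber", ["1"])[-1]
    if page.isdigit() and int(page) > 1:
        return NOINDEX_FOLLOW
    return None


page_two = (
    "http://www.frenchentree.com/activity-holidays-france/home.asp"
    "?order=Sort1&option=&CategoryFilter=&webname=activity-holidays-france&pagenumber=2"
)
open_query = "http://www.frenchentree.com/activity-holidays-france/home.asp?CategoryFilter="

print(robots_meta_for(page_two))
print(robots_meta_for(open_query))
```

Leaving page 1 untagged is a choice made for this sketch so that the main results page stays indexable; the same check could be extended to other parameters if they turn out to produce duplicates too.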