"Duplicate" Page Titles and Content
-
Hi All,
This is a rather lengthy one, so please bear with me!
SEOmoz has recently crawled 10,000 webpages from my site, FrenchEntree, and has returned 8,000 errors of duplicate page content. The main reason I have so many is because of the directories I have on site.
The site is broken down into 2 levels of hierarchy: "Weblets" and "Articles". A weblet is a landing page, and articles are created within these weblets. A weblet can hold any number of articles, from 0 to 1,000,000 (in theory), and an article must be assigned to a weblet in order for it to work. Here's roughly how it looks in URL form: http://www.mysite.com/[weblet]/[articleID]/
Now, our directory results pages are weblets with standard content in the left and right hand columns, but the information in the middle column is pulled in from our directory database following a user query. This happens by adding the query string to the end of the URL. We have 3 main directory databases, but perhaps around 100 weblets promoting various 'canned' queries that users may want to navigate straight into. However, any one of the 100 directory-promoting weblets could return any query from the parent directory database, given the correct query string. The problem with this method (as pointed out by the 8,000 errors) is that each possible permutation of a search is considered to be its own URL, and therefore its own page.
The example I will use is the first alphabetically. "Activity Holidays in France":
http://www.frenchentree.com/activity-holidays-france/ - This link shows you a results weblet without the query at the end, and therefore only displays the left and right hand columns as populated.
http://www.frenchentree.com/activity-holidays-france/home.asp?CategoryFilter= - This link shows you the same weblet with an 'open' query on the end, i.e. it displays all results from this database. Listings are displayed in the middle.
There are around 500 different URL permutations for this weblet alone when you take into account the various categories and cities a user may want to search in.
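To make the permutation problem concrete, here is a rough Python sketch of how query-string variants of the same results page could be grouped under one normalized URL. It only removes repeated parameters and sorts the rest, so it is an illustration of why crawlers see many URLs for one page rather than a fix. The function name `normalize_query_url` is hypothetical, and the assumption that parameter order and repeated keys do not change what the server returns would need checking against the real site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_query_url(url: str) -> str:
    """Return the URL with repeated query keys removed and keys sorted.

    Assumption (not confirmed by the site): parameter order and repeated
    keys do not affect which results the page shows.
    """
    parts = urlsplit(url)
    params = {}
    # keep_blank_values=True keeps empty parameters such as "CategoryFilter=",
    # because on this site an empty filter still triggers the listing.
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        params.setdefault(key, value)  # keep the first occurrence of each key
    query = urlencode(sorted(params.items()))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    variant = (
        "http://www.frenchentree.com/activity-holidays-france/home.asp"
        "?order=Sort1&option=&CategoryFilter="
        "&webname=activity-holidays-france"
        "&webname=activity-holidays-france&pagenumber=1"
    )
    print(normalize_query_url(variant))
```

Running a crawl export through something like this would show how many of the reported URLs are really the same query written differently, as opposed to genuinely different category or city searches.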
What I'd like to do is to prevent SEOmoz (and therefore search engines) from counting each individual query permutation as a unique page, without harming the visibility that the directory results receive in SERPs. We often appear in the top 5 for quite competitive keywords and we'd like it to stay that way. I also wouldn't want some sort of robot exclusion or canonical classification to leave the search engine results displaying (and therefore directing the user through to) only an empty weblet.
Does anyone have any advice on how best to remove the "duplication" problem, whilst keeping the search visibility? All advice welcome.
Thanks
Matt
-
Thanks for the swift response, Gianluca. I think I understand the problem you have pointed out, but I'm rather surprised that it has been set up in such a way... Or that it would have more of an adverse effect than multiple URLs with the same standard content. I'm willing to change that to see if it fixes the problem though.
Please take all of the time you need... It is a very large site which has been pieced together, bit-by-bit, over many years!
Matt
-
In addition to Gianluca's response there: on the pages that you tag with "noindex,follow" (i.e. the duplicates), also add a canonical tag pointing at the original page.
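As a minimal sketch of what that combination could look like in the page head, here is a small Python helper that builds the two tags for a duplicate page. The function name `duplicate_page_head_tags` is hypothetical, and which URL counts as the "original" is an assumption you would have to decide per weblet; the example below simply uses the 'open' query URL from the question.

```python
from html import escape
from typing import List


def duplicate_page_head_tags(original_url: str) -> List[str]:
    """Build the head tags for a page treated as a duplicate.

    original_url is whichever page you decide is the "original"
    (an assumption made per weblet, not something the thread specifies).
    """
    # Escape the URL so characters such as "&" are valid inside the attribute.
    href = escape(original_url, quote=True)
    return [
        '<meta name="robots" content="noindex,follow">',
        f'<link rel="canonical" href="{href}">',
    ]


if __name__ == "__main__":
    original = (
        "http://www.frenchentree.com/activity-holidays-france/"
        "home.asp?CategoryFilter="
    )
    for tag in duplicate_page_head_tags(original):
        print(tag)
```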
-
I think your duplicate content problem is also due to the pagination of your category (or uncategorised) search results. Checking the second URL you gave, http://www.frenchentree.com/activity-holidays-france/home.asp?order=Sort1&option=&CategoryFilter=&webname=activity-holidays-france&webname=activity-holidays-france&pagenumber=1, and its "second" page, http://www.frenchentree.com/activity-holidays-france/home.asp?order=Sort1&option=&CategoryFilter=&webname=activity-holidays-france&pagenumber=2, I noticed that you have the meta robots in the head... therefore the bots see and index all of this paginated content, which is substantially a duplicate of page 1. I suggest you start by adding the noindex,follow meta robots to these pages. About the other duplication issues... give me time, as your site is not so easy.
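A rough Python sketch of that pagination rule, assuming the `pagenumber` parameter seen in the URLs above is what identifies a paginated page. The function name `robots_meta_for` is hypothetical, and leaving page 1 as "index,follow" is my assumption (so that the first results page can keep ranking), not something stated above.

```python
from urllib.parse import parse_qs, urlsplit


def robots_meta_for(url: str) -> str:
    """Pick the meta robots tag for a directory results URL.

    Assumption: pages with pagenumber > 1 are the paginated near-duplicates;
    page 1, or a URL with no pagenumber at all, stays indexable.
    """
    query = parse_qs(urlsplit(url).query)
    raw_page = query.get("pagenumber", ["1"])[0]
    try:
        page = int(raw_page)
    except ValueError:
        page = 1  # treat an unreadable page number as the first page
    content = "noindex,follow" if page > 1 else "index,follow"
    return f'<meta name="robots" content="{content}">'


if __name__ == "__main__":
    base = (
        "http://www.frenchentree.com/activity-holidays-france/home.asp"
        "?order=Sort1&option=&CategoryFilter="
        "&webname=activity-holidays-france&pagenumber="
    )
    print(robots_meta_for(base + "1"))
    print(robots_meta_for(base + "2"))
```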