"Duplicate" Page Titles and Content
-
Hi All,
This is a rather lengthy one, so please bear with me!
SEOmoz has recently crawled 10,000 webpages from my site, FrenchEntree, and has returned 8,000 errors of duplicate page content. The main reason I have so many is because of the directories I have on site.
The site is broken down into 2 levels of hierachy. "Weblets" and "Articles". A weblet is a landing page, and articles are created within these weblets. Weblets can hold any number of articles - 0 - 1,000,000 (in theory) and an article must be assigned to a weblet in order for it to work. Here's how it roughly looks in URL form - http://www.mysite.com/[weblet]/[articleID]/
Now; our directory results pages are weblets with standard content in the left and right hand columns, but the information in the middle column is pulled in from our directory database following a user query. This happens by adding the query string to the end of the URL. We have 3 main directory databases, but perhaps around 100 weblets promoting various 'canned' queries that users may want to navigate straight into. However, any one of the 100 directory promoting weblets could return any query from the parent directory database with the correct query string. The problem with this method (as pointed out by the 8,000 errors) is that each possible permutation of search is considered to be it's own URL, and therefore, it's own page.
The example I will use is the first alphabetically. "Activity Holidays in France":
http://www.frenchentree.com/activity-holidays-france/ - This link shows you a results weblet without the query at the end, and therefore only displays the left and right hand columns as populated.
http://www.frenchentree.com/activity-holidays-france/home.asp?CategoryFilter= - This link shows you the same weblet with the an 'open' query on the end. I.e. display all results from this database. Listings are displayed in the middle.
There are around 500 different URL permutations for this weblet alone when you take into account the various categories and cities a user may want to search in.
What I'd like to do is to prevent SEOmoz (and therefore search engines) from counting each individual query permutation as a unique page, without harming the visibility that the directory results received in SERPs. We often appear in the top 5 for quite competitive keywords and we'd like it to stay that way. I also wouldn't want the search engine results to only display (and therefore direct the user through to) an empty weblet by some sort of robot exclusion or canonical classification.
Does anyone have any advice on how best to remove the "duplication" problem, whilst keeping the search visibility? All advice welcome.
Thanks
Matt
-
Thanks for the swift response, Gianluca. I think I understand the problem you have pointed out, but I'm rather surprised that it has been set up in such a way... Or that that would have more of an adverse affect than multiple URLs with the same standard content. I'm willing to change that to see if it fixes the problem though.
Please take all of the time you need... It is a very large site which has been pieced together, bit-by-bit, over many years!
Matt
-
In addition to Gianluca's response there, the pages that you tag with "noindex,follow" (i.e. the duplicates) add a canonical tag pointing at the original page.
-
I think your problem of duplicated content is also due the pagination your categories (or no categories search result) have. Checking the second url you gave http://www.frenchentree.com/activity-holidays-france/home.asp?order=Sort1&option=&CategoryFilter=&webname=activity-holidays-france&webname=activity-holidays-france&pagenumber=1 and it "second" page http://www.frenchentree.com/activity-holidays-france/home.asp?order=Sort1&option=&CategoryFilter=&webname=activity-holidays-france&pagenumber=2 I noticed that you have the meta robots in the head... therefore the bots see and index all these paginated content, that is a substantial duplicate of page 1. I suggest you to start adding the noindex,follow meta robots in these pages. About other duplication issues... give me time, as your site is not so easy
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why is "Noindex" better than a "Canonical" for Pagination?
"Noindex" is a suggested pagination technique here: http://searchengineland.com/the-latest-greatest-on-seo-pagination-114284, and everyone seems to agree that you shouldn't canonicalize all pages in a series to the first page, but I'd love if someone can explain why "noindex" is better than a canonical?
Intermediate & Advanced SEO | | nicole.healthline0 -
My home page is not found by the "Grade a Page" tool
My home page as well as several important pages are not found by the Grade a Page tool. With our full https address I got this http://screencast.com/t/s1gESMlGwpa With just the www address I got this http://screencast.com/t/BMRHy36Ih https://www.joomlashack.com
Intermediate & Advanced SEO | | etabush
https://www.joomlashack.com/joomla-templates We recently lost a lot of positions for our most important keyword: Joomla Templates Please help us figure this out. Whats screwy with our site?0 -
Moving some content to a new domain - best practices to avoid duplicate content?
Hi We are setting up a new domain to focus on a specific product and want to use some of the content from the original domain on the new site and remove it from the original. The content is appropriate for the new domain and will be irrelevant for the original domain and we want to avoid creating completely new content. There will be a link between the two domains. What is the best practice for this to avoid duplicate content and a potential Panda penalty?
Intermediate & Advanced SEO | | Citybase0 -
Duplicate content on the same page--is this an issue?
We are transitioning to responsive design and some of our pages will not scale properly, so we were thinking of adding the same content twice to the same URL (one would be simple text -- for mobile and the other would include the images, etc for the desktop version), and content would change based on size of the screen. I'm not looking for another technical solution (I know google specifies that you can dynamically serve different content based on user agent)--I am wondering if any one knows if having the same exact content appear twice on the same URL will cause a problem with SEO (any historical tests or experience would be great). Thank you in advance.
Intermediate & Advanced SEO | | nicole.healthline0 -
Duplicate Content on Press Release?
Hi, We recently held a charity night in store. And had a few local celebs turn up etc... We created a press release to send out to various media outlets, within the press release were hyperlinks to our site and links on certain keywords to specific brands on our site. My question is, should we be sending a different press release to each outlet to stop the duplicate content thing, or is sending the same release out to everyone ok? We will be sending approx 20 of these out, some going online and some not. So far had one local paper website, a massive football website and a local magazine site. All pretty much same content and a few pics. Any help, hints or tips on how to go about this if I am going to be sending out to a load of other sites/blogs? Cheers
Intermediate & Advanced SEO | | YNWA0 -
Duplicate content for images
On SEOmoz I am getting duplicate errors for my onsite report. Unfortunately it does not specify what that content is... We are getting these errors for our photo gallery and i am assuming that the reason is some of the photos are listed in multiple categories. Can this be the problem? what else can it be? how can we resolve these issues?
Intermediate & Advanced SEO | | SEODinosaur0 -
Where does "Pages Similar" link text come from?
When I type in a competitor name (in this case "buycostumes") Google shows several related websites in it's "Pages Similar to..." section at the bottom of the page:
Intermediate & Advanced SEO | | costumeMy question, can anyone tell me where the text comes from that Google uses as the link. Our competitors have nice branded links and our is just a keyword. I can find nothing on-page that Google is using so it must be coming from someplace off-page, but where?
0 -
Does a Single Instance of rel="nofollow" cause all instances on a page to be nofollowed?
I attended the Bruce Clay training at SMX Advanced Seattle, and he mentioned link pruning/sculpting (here's an SEOMoz article about it - http://www.seomoz.org/blog/google-says-yes-you-can-still-sculpt-pagerank-no-you-cant-do-it-with-nofollow) Now during his presentation he mentioned that if you have one page with multiple links leading to another page, and one of those links is nofollowed, it could cause all links to be nofollowed. Example: Page A has 4 links to Page B: 1:followed, 2:followed, 3:nofollowed, 4:followed The presence of a single nofollow tag would override the 3 followed links and none of them would pass link juice. Has anyone else encountered this problem, and Is there any evidence to support this? I'm thinking this would make a great experiment.
Intermediate & Advanced SEO | | brycebertola0