Penalized by duplicate content?
-
Hello,
I am in a very weird position. I manage a website (an EMD), part of which creates pages dynamically. The former webmaster who built this system thought it would help with SEO, but I doubt it!
The thing is that the site now has about 1,500 pages that look like duplicates, but are they really duplicates? Each page has a unique URL, but the content is pretty much the same: one image and a different title of 5-8 words.
There is more: none of these pages are accessible to users, only to crawlers! This URL machine is part of a PHP-made photo gallery whose purpose I never understood.
The site overall is not performing very well in the SERPs, especially after Penguin. Judging by its link profile, Domain Authority, construction (OK, apart from that crazy photo gallery) and content, it has never reached the position it should have, even in the past.
The majority of these mysterious pages, and mostly their images, are cached by Google, and some of them rank in top positions for some queries (the ones that match the short title on the page), but the numbers are poor: 10-15 clicks per month.
Are these pages considered duplicates, even though they are cached, and is it safe for the site to remove all 1,500 at once?
The SEOmoz tools have flagged some of them as duplicates, but not the majority!
Can these pages affect how search engines see the whole site? (It has dropped in Google and has disappeared from Yahoo and Bing!)
Do I also have to tell Google about the removal?
I have not seen anything like this before, so any comment would be helpful!
Thank you!
-
Mat,
There was a massive production of pages in mid-October 2011, and there was a drop in traffic around November. There was a Panda update then.
The problem is that in this particular niche the site always sees a small drop in October, November and December, so it is not so easy to judge!
-
Hard to say without knowing the detail of what is on the pages. However, to me it sounds like a perfect set-up for a site to be hit by the Panda updates. This is exactly what Panda was built for!
It could be worth checking your traffic levels against the dates on this page to get a good idea of which changes have already affected your site: http://www.seomoz.org/google-algorithm-change (I like to put those dates in as events in Google Analytics). However, even if you haven't been hit by this yet, I'd suggest you are risking it.
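If you would rather do that comparison on exported numbers than by eye, here is a rough sketch in Python. It assumes you have daily visit counts keyed by date (for example from an analytics export); the function name `change_around` and the 14-day window are my own choices for illustration, not anything Google or Moz provides.

```python
from datetime import date, timedelta

def change_around(daily_visits, update_day, window=14):
    """Percent change in mean daily visits after update_day versus before it.

    daily_visits: dict mapping datetime.date -> visit count
                  (hypothetical input, e.g. built from an analytics export).
    Compares the `window` days starting at update_day with the `window`
    days immediately before it. Returns None if either side has no data
    or the earlier mean is zero.
    Example call: change_around(visits, date(2011, 11, 1))  # date is illustrative
    """
    before = [daily_visits[d] for d in
              (update_day - timedelta(days=i) for i in range(1, window + 1))
              if d in daily_visits]
    after = [daily_visits[d] for d in
             (update_day + timedelta(days=i) for i in range(window))
             if d in daily_visits]
    if not before or not after:
        return None
    mean_before = sum(before) / len(before)
    mean_after = sum(after) / len(after)
    if mean_before == 0:
        return None
    return (mean_after - mean_before) * 100.0 / mean_before
```

Because your niche dips every autumn anyway, it may help to run the same function for the same calendar day one year earlier and compare the two percentages. A drop that is clearly larger than the usual seasonal one is a stronger hint that an update was involved, though it proves nothing on its own.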
If you have a lot of "thin content" pages, this can affect the whole site. Generated pages are probably the quickest way to run into such problems.
You don't need to inform Google that you have removed them. Just remove the pages and make sure each removed URL either returns a 404 error or does a 301 redirect to the most logical (not thin) page.
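With 1,500 URLs it is worth verifying this with a script after the removal. The following is only a minimal sketch using the Python standard library: `classify_removal` and `fetch_status` are names I made up, and the labels they return are just for illustration. It sends a HEAD request without following redirects, so you see the status code a crawler would get first.

```python
import http.client
from urllib.parse import urlsplit

def classify_removal(status, location=None):
    """Judge the response of a URL that is supposed to be removed."""
    if status == 404:
        return "gone"            # page removed, as intended
    if status == 301 and location:
        return "redirected"      # permanent redirect to another page
    if status == 200:
        return "still-live"      # the thin page is still being served
    return "check-manually"      # e.g. temporary redirects or server errors

def fetch_status(url, timeout=10):
    """Return (status, Location header) for url without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

You would loop over your list of old gallery URLs, call `fetch_status` on each, and pass the result to `classify_removal`. Anything reported as "still-live" or "check-manually" deserves a second look, and for the redirected ones it is worth confirming that the target really is a useful, non-thin page.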