XML feeds with regard to duplicate content
-
Hi everyone,
I hope you can help. I run a property portal in Spain and am looking for an answer to an issue we are having.
We are in the process of importing an XML feed into our site, containing 10,000+ property listings in our niche.
Although this is great for our customers, I am aware this content will be duplicated on other sites, as our clients advertise across a range of portals.
My question is: are there any measures I can take to safeguard our site from penalisation by Google?
Manually writing 10,000+ descriptions for properties is out of the question, sadly.
I really hope somebody can help.
Thanks
Steve
-
Yeah, I understand. I think I will take your advice and noindex the pages. It is not worth the risk, really.
Thanks for your help
-
I understand exactly what you are saying, but from experience I can tell you that trying to get them indexed would lead to a lot of grief. If those listings are all over the net, then your website ceases to offer any value to Google, and hence it will simply demote you.
I understand it is a tricky situation, and it is hard to let go of the idea that these pages should be indexed so you can get rankings for long-tail keywords. I guess it is really a question of how much you want to gamble.
-
Hi Mash,
Thanks for your reply. It's such a difficult issue, because those pages create deep content that allows us to rank for long-tail phrases.
For example, "3 bedroom apartment for sale in Barcelona". So yeah, it's a tough one, and most of the content on our website would be those property pages, so noindexing them would reduce the amount of content we offer.
What a pickle. Any more help you can provide is appreciated.
Thanks
-
Hi Steve,
There is no easy way to prevent a duplicate content penalty with such a large number of pages being pushed into the site, short of noindexing those pages.
Perhaps a "noindex, follow" robots meta tag placed into the head of all these pages would help. I assume you would not want these indexed anyway, because that is where the problem will rear its ugly head!
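For reference, that tag is a one-liner in each page's head; a minimal sketch, not specific to any platform:

```html
<!-- Tells search engines not to index this page, while still
     crawling it and following (passing value through) its links. -->
<meta name="robots" content="noindex, follow">
```
-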
Related Questions
-
Third-party http links in the page source: Social engineering content warning from Google
Hi, we have received a "Social engineering content" warning from Google, and one of our important pages, along with its internal pages, has been flagged as "Deceptive site ahead". We wonder what the reason is, as Google didn't point to the specific part of the page that triggered the flag. We don't employ any such content on the page, and the content has been the same for many months. As our site is hosted on WordPress, we used a WordPress plugin for this page's layout, which injected 2 http (non-https) links into our page code. Could this be the reason? Any ideas? Thanks
White Hat / Black Hat SEO | vtmoz
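If the two plugin-injected http links above are indeed the trigger, the usual first step is to reference the same resources over https. A hypothetical before/after sketch (the plugin URL is invented for illustration):

```html
<!-- Before: insecure reference injected by the layout plugin (URL is illustrative) -->
<script src="http://cdn.example-plugin.com/layout.js"></script>

<!-- After: the same resource requested over https -->
<script src="https://cdn.example-plugin.com/layout.js"></script>
```
-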
Question Regarding Keyword
Hi all, I am currently working on a travel site. Would, for example, "Boutique Hotels in New York | Luxury Hotels in New York" be considered keyword stuffing?
White Hat / Black Hat SEO | appnyc
-
Thin Content Pages: Does Adding More Content Really Help?
Hello all, I have a website that was hit hard by Panda back in November 2012, and ever since, the traffic continues to decline week by week. The site doesn't have any major Moz errors (aside from too many on-page links). The site has about 2,700 articles and the text-to-HTML ratio is about 14.38%, so clearly we need more text in our articles, and we need to relax a little on the number of pictures/links we add. We have increased the text-to-HTML ratio for all of our new articles, but I was wondering how beneficial it would be to go back and add more text content to the 2,700 old articles we have just sitting there. Would this really be worth the time and investment? Could this help reverse the drastic decline in traffic and maybe even help it grow?
White Hat / Black Hat SEO | WebServiceConsulting.com
-
How does Google decide what content is "similar" or "duplicate"?
Hello all, I have a massive duplicate content issue at the moment with a load of old employer detail pages on my site. We have 18,000 pages that look like this: http://www.eteach.com/Employer.aspx?EmpNo=26626 http://www.eteach.com/Employer.aspx?EmpNo=36986 and Google is classing all of these pages as similar content, which may result in a bunch of them being de-indexed. Now, although they all look rubbish, some of them are ranking on search engines, and looking at the traffic on a couple of these, it's clear that people who find these pages want more information about the school (because everyone seems to click on the local information tab on the page). So I don't want to just get rid of all these pages; I want to add content to them. But my question is... if I were to make up, say, 5 templates of generic content, with different fields being replaced by the school's name, location, and headteacher's name so that they vary from other pages, would this be enough for Google to realise that they are not similar pages and stop classing them as duplicates? e.g. [School name] is a busy and dynamic school led by [headteacher's name], who achieves excellence every year with Ofsted. Located in [location], [school name] offers a wide range of experiences both in the classroom and through extra-curricular activities; we encourage all of our pupils to "Aim Higher". We value all our teachers and support staff and work hard to keep [school name]'s reputation to the highest standards. Something like that... Anyone know if Google would slap me if I did that across 18,000 pages (with 4 other templates to choose from)?
White Hat / Black Hat SEO | Eteach_Marketing
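As a concrete sketch of the rotating-template idea described above, with bracketed tokens standing in for database fields (all names are illustrative):

```html
<!-- Template variant 1 of 5; each [token] would be filled from the
     school's database record when the page is generated. -->
<div class="employer-summary">
  <h2>[School name]</h2>
  <p>[School name] is a busy and dynamic school led by [headteacher's name].
     Located in [location], it offers a wide range of experiences both in the
     classroom and through extra-curricular activities.</p>
</div>
```
-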
Merging four sites into one... Best way to combine content?
First of all, thank you in advance for taking the time to look at this. The law firm I work for once took a "more is better" approach and had multiple websites with keyword-rich domains. We are a family law firm, but we have a specific site for "Arizona child custody", as one example. We have four sites. All four of our sites rank well, although I don't know why. Only one site is in my control; the other three are managed by FindLaw. I have no idea why the FindLaw sites do well, other than being in the FindLaw directory. They have terrible, spammy page titles, and using Copyscape, I realize that most of the content FindLaw provides for its attorneys consists of "spun articles." So I have a major task and I don't know how to begin. First of all, since all four sites rank well for all of the desired phrases, will combining all of that power into one site rocket us to stardom? The sites all rank very well now, even though they are all technically terrible. Literally. I would hope that if I redirect the child custody site (as one example) to the child custody overview page on the final merged site, we would still maintain our current SERP position for "arizona child custody lawyer." I have strongly encouraged my boss to merge our sites for many reasons, one of those being that it's playing havoc with our local places. On the other hand, if I take down the child custody site, redirect it, and we lose that ranking, I might be out of a job. Finally, that brings me to my last question. As I mentioned, the child custody site is "done" very poorly. Should I keep the spun content and redirect each and every page to a duplicate on our "final" domain, or should I redirect each page to a better article? This is the part I fear the most. I am considering subdomains, like redirecting the child custody site to childcustody.ourdomain.com. I know, for a fact, that will work flawlessly; I've done it many times for other clients that have multiple domains. However, we have seven areas of practice and we don't have seven nice sites, so child custody would be the only practice area with its own subdomain. Also, I wouldn't really be doing anything then, would I? We all know 301 redirects work. What I want is to harness all of this individual power into one mega-site. Between the four sites, I have 800 pages of content. I need to formulate a plan of action now and then begin acting on it. I don't want to make the decision alone. Anybody care to chime in? Thank you in advance for your help. I really appreciate the time it took you to read this.
White Hat / Black Hat SEO | SDSLaw
-
Is showing pre-loaded content cloaking?
Hi everyone, another quick question. We have a number of different resources available for our users that load dynamically as the user scrolls down the page (like Facebook's Timeline), with the aim of improving page load time. Would it be considered cloaking if we had Googlebot index a version of the page with all available content that would load for the user if he/she scrolled to the bottom?
White Hat / Black Hat SEO | CuriosityMedia
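For reference, one way to avoid serving Googlebot its own version at all is to include the full content in the initial HTML and merely defer its display. A rough sketch, with invented class names:

```html
<!-- Every visitor and crawler receives the same markup; JavaScript
     simply reveals the .deferred items as the user scrolls. -->
<ul id="resources">
  <li class="resource">First resource, visible immediately</li>
  <li class="resource deferred">Second resource, shown on scroll</li>
  <li class="resource deferred">Third resource, shown on scroll</li>
</ul>
```
-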
User comments with page content or as a separate page?
With the latest Google updates both cracking down on useless pages and concentrating on high-quality content, would it be beneficial to include user-posted comments on the same page as the content, or on a separate page? A separate page with enough comments on it might be worth it, especially as extra pages add extra PageRank, but would it be better to include them with the original article/post? Your ideas and suggestions are greatly appreciated.
White Hat / Black Hat SEO | Peter264
-
My attempt to reduce duplicate content got me slapped with a doorway page penalty. Halp!
On Friday, 4/29, we noticed that we suddenly lost all rankings for all of our keywords, including searches like "bbq guys". This indicated to us that we were being penalized for something. We immediately went through the list of things that had changed, and the most obvious was that we were migrating domains. On Thursday, we turned off one of our older sites, http://www.thegrillstoreandmore.com/, and 301 redirected each page on it to the same page on bbqguys.com. Our intent was to eliminate duplicate content issues. When we realized that something bad was happening, we immediately turned off the redirects and put thegrillstoreandmore.com back online. This did not unpenalize bbqguys. We've been looking for the cause for two days and had not been able to find what we did wrong, at least not until tonight. I just logged back in to Webmaster Tools to do some more digging, and I saw that I had a new message: "Google Webmaster Tools notice of detected doorway pages on http://www.bbqguys.com/". It is my understanding that doorway pages are pages jammed with keywords and links and devoid of any real content. We don't do those pages. The message does link me to Google's definition of doorway pages, but it does not give me a list of pages on my site that it does not like. If I could see even one or two pages, I could probably figure out what I am doing wrong. I find this most shocking, since we go out of our way not to do anything spammy or sneaky. Since we try hard not to do anything that is even grey hat, I have no idea what could possibly have triggered this message and the penalty. Does anyone know how to figure out which pages specifically are causing the problem, so I can change them or take them down? We are slowly canonicalizing URLs and changing the way different parts of the sites build links to make them all the same, and I am aware that these things need work. We were in the process of discontinuing some sites and 301 redirecting pages to a more centralized location to try to stop duplicate content. The day after we instituted the 301 redirects, the site we were redirecting all of the traffic to (the main site) got blacklisted. Because of this, we immediately took down the 301 redirects. Since the Webmaster Tools notifications are different (i.e., too many URLs is a notice-level message and doorway pages is a separate alert-level message), and the too-many-URLs notice has been triggering for a while now, I am guessing that the doorway pages problem has nothing to do with URL structure. According to the help files, doorway pages are a content problem with specific pages. The architecture suggestions are helpful and reassure us that we should be working on them, but they don't help me solve my immediate problem. I would really be thankful for any help identifying the pages that Google thinks are "doorway pages", since this is what I am being immediately and severely penalized for. I want to stop doing whatever it is I am doing wrong; I just don't know what it is! Thanks for any help identifying the problem! It feels like we got penalized for trying to do what we think Google wants. If we could figure out what a "doorway page" is, and how our 301 redirects triggered Googlebot into saying we have them, we could more appropriately reduce duplicate content. As it stands now, we are not sure what we did wrong.
We know we have duplicate content issues, but we also thought we were following webmaster guidelines on how to reduce the problem, and we got nailed almost immediately when we instituted the 301 redirects.
White Hat / Black Hat SEO | CoreyTisdale
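Since canonicalizing URLs comes up above as part of the cleanup, this is the tag involved, as a minimal sketch (the URL is illustrative, not an actual bbqguys.com page):

```html
<!-- Placed in the head of each duplicate variant, pointing search
     engines at the preferred version of the page. -->
<link rel="canonical" href="https://www.bbqguys.com/example-preferred-page">
```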