Steps you can take to ensure your content is indexed and registered to your site before a scraper gets to it?
-
Hi,
A client's site has a significant amount of original content that has been blatantly copied and pasted onto various competitor and article sites.
I'm working with the client to rejig lots of this content and to publish new content.
What steps would you recommend taking when the new, updated site is launched to ensure Google clearly attributes the content to the client's site first?
One thing I will be doing is submitting a new XML + HTML sitemap.
Thank you
-
There are no established "best practices" for the tag's usage at this point. On the one hand, it could technically be used on every page; on the other, it arguably should be used only when the page is an article, blog post, or other piece of an individual person's writing.
-
Thanks Alan.
Guess there's no magic trick that will give you 100% attribution.
Regarding this tag, do you recommend I add it to EVERY page of the client's website, including the homepage? So even the usual about us, contact, etc. pages?
Cheers
Hash
-
Google continually tries to find new ways to encourage solutions for helping them understand intent, relevance, ownership and authority. It's why Schema.org finally hit this year. None of their previous attempts have been good enough, and each has served a specific individual purpose.
So with Schema, the theory is there's a new, unified framework that can grow and evolve, without having to come up with individual solutions.
The "original source" concept was supposed to address the scraper issue, and there's been some value in that, though it's far from perfect. A good scraper script can find it, strip it out or replace the contents.
rel="author" is yet one more thing that can be used in the overall mix, though Schema.org takes authorship and publisher identity to a whole new, complex, and so far confused level :-).
Since Schema.org is most likely not going to be widely adopted until at least early next year, Google is encouraging use of the rel="author" tag as the primary method for assigning authorship at this point, and will continue to support it even as Schema rolls out.
So if you're looking at a best practices solution, yes, rel="author" is advisable. Until it's not.
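For anyone wondering what that markup looks like in practice, here is a minimal sketch in Python that builds the two common forms of rel="author": a link element for the page head and a visible byline anchor. The function names and the example.com profile URL are made up for illustration, and this only produces the HTML strings; it says nothing about how a search engine will treat them.

```python
from html import escape


def author_link_tag(profile_url: str) -> str:
    """Build a <link rel="author"> element intended for a page's <head>."""
    return f'<link rel="author" href="{escape(profile_url, quote=True)}">'


def author_byline(name: str, profile_url: str) -> str:
    """Build a visible byline whose anchor carries rel="author"."""
    return (
        f'<a rel="author" href="{escape(profile_url, quote=True)}">'
        f"{escape(name)}</a>"
    )


if __name__ == "__main__":
    # Hypothetical author profile URL, used only for illustration.
    profile = "https://example.com/authors/alan"
    print(author_link_tag(profile))
    print(author_byline("Alan", profile))
```

Both helpers escape their inputs so a URL with query parameters or a name with an ampersand still yields valid HTML.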
-
Thanks Alan... I am surprised to learn about this "original source" information. There must not have been a lot of talk about it when it was released or I would have seen it.
Google recently started encouraging people to use the rel="author" attribute. I am going to use that on my site... now I am wondering if I should be using "original source" too.
Are you recommending rel="author"?
Also, reading that full post, there is a section added at the end recommending rel="canonical".
-
Always have a sitemap.xml file with all the URLs you want indexed included in it. Right after publishing, submit the sitemap.xml file (or files if there are tens of thousands of pages) through Google Webmaster Tools and Bing Webmaster Tools. Include the Meta "original-source" tag in your page headers.
Include a Copyright line at the bottom of each page with the site or company name, and have that link to the home page.
This does not guarantee with 100% certainty that you'll get proper attribution; however, these are the best steps you can take in that regard.
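As a rough illustration of the steps above, here is a minimal Python sketch that generates a sitemap.xml, the "original-source" meta tag, and a copyright footer line linking to the home page. The function names, the example.com URLs, and the footer wording are assumptions for the example, not anything prescribed by Google or Bing; submitting the sitemap through the webmaster tools is still a manual step.

```python
import xml.etree.ElementTree as ET
from html import escape

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap.xml content (a string) listing every URL to be indexed."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are XML-escaped
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def original_source_meta(page_url):
    """Return the meta "original-source" tag for a page header."""
    return f'<meta name="original-source" content="{escape(page_url, quote=True)}">'


def copyright_footer(site_name, home_url, year):
    """Return a copyright line whose site name links to the home page."""
    return (
        f'<p>&copy; {year} <a href="{escape(home_url, quote=True)}">'
        f"{escape(site_name)}</a></p>"
    )


if __name__ == "__main__":
    # Hypothetical URLs, used only for illustration.
    pages = ["https://example.com/", "https://example.com/articles/first-post"]
    print(build_sitemap(pages))
    print(original_source_meta(pages[1]))
    print(copyright_footer("Example Co", pages[0], 2011))
```

The sitemaps.org protocol caps a single file at 50,000 URLs, so a very large site would need to split the list across several files.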