Subdomains - duplicate content - robots.txt

EasyStreet

Our corporate site provides MLS data to users, with the end goal of generating leads. Each registered lead is assigned to an agent, essentially in round-robin fashion. However, we also give each agent a domain of their choosing that points to our corporate website. The domain can be whatever they want, but as soon as it loads it is redirected to a subdomain. For example, www.agentsmith.com would be redirected to agentsmith.corporatedomain.com. Finally, any leads generated from agentsmith.easystreetrealty-indy.com are always assigned to Agent Smith instead of the agent pool (by parsing the current host name). To avoid being penalized for duplicate content, every page viewed on an agent subdomain has a canonical link pointing to the corporate host name (www.corporatedomain.com). The only content difference between our corporate site and an agent subdomain is the phone number and contact email address, where applicable.
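The host-name parsing described above could be sketched roughly as follows. This is only an illustration in Python, not the poster's actual code: `CORPORATE_DOMAIN` reuses the placeholder domain from the post, and `agent_for_host` is a hypothetical helper name.

```python
from typing import Optional

# Placeholder corporate domain, as used in the post (an assumption, not the real site).
CORPORATE_DOMAIN = "corporatedomain.com"


def agent_for_host(host: str) -> Optional[str]:
    """Return the agent's subdomain label for a request host, or None
    when the request arrived on the corporate site itself."""
    # Normalize: lowercase, drop any port and a trailing dot.
    host = host.lower().split(":")[0].rstrip(".")
    suffix = "." + CORPORATE_DOMAIN
    if not host.endswith(suffix):
        return None
    label = host[: -len(suffix)]
    # "www" is the corporate host; nested labels are not treated as agents here.
    if label in ("", "www") or "." in label:
        return None
    return label
```

A lead submitted on agentsmith.corporatedomain.com would then resolve to the label `agentsmith`, while a `None` result would mean the lead goes to the regular agent pool.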

Two questions:

1. Can/should we use robots.txt or robots meta tags to tell crawlers to ignore these subdomains, but obviously not the corporate domain?
2. If the answer to question 1 is yes, would it be better for SEO to do that, or to leave it as it is?

SeoStallion

Sorry, god only knows how I missed that.

Well, in that case I think you are doing what is recommended. I generally think of the canonical tag as similar to a 301 redirect: you are telling the search engines that the two pages should be treated as one, and specifying which page is to be the front man of the two.

I think the normal procedure is to use robots.txt for private/personal information, and nofollow and noindex for duplicate content. However, the canonical tag is an easy solution to duplicate content, as it is simply one line in the header.
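To make the robots.txt option concrete, here is a small sketch using Python's standard `urllib.robotparser` to show how a compliant crawler would read a blanket `Disallow` rule on an agent subdomain. Both robots.txt bodies and the host names are assumptions for illustration only. One thing worth keeping in mind: a crawler that is blocked this way does not fetch the page at all, so it would generally never see a canonical tag or noindex directive placed on it.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt that an agent subdomain might serve to block all crawling.
AGENT_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# Assumed robots.txt for the corporate host: an empty Disallow blocks nothing.
CORPORATE_ROBOTS_TXT = """\
User-agent: *
Disallow:
"""


def may_crawl(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Report whether a compliant crawler may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(may_crawl(AGENT_ROBOTS_TXT, "http://agentsmith.corporatedomain.com/listings"))   # False
print(may_crawl(CORPORATE_ROBOTS_TXT, "http://www.corporatedomain.com/listings"))      # True
```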

EasyStreet

Thanks SeoStallion.

That is how we are handling it currently.

SeoStallion

I would personally suggest using the canonical tag to identify the original content. For example, place a canonical link element (a link tag with rel="canonical" whose href is the URL of the original page) into the header of the pages with duplicate content.

This tells the search engines that the page is not the original content and that the linked page is where the original content is found.
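A canonical link element of the kind described here could be generated along these lines. This is a minimal Python sketch under stated assumptions: the corporate host is the placeholder from the question, the `http` scheme is a guess, and `canonical_link_tag` is a hypothetical helper rather than anything from the site's real code.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Placeholder corporate host from the question; the scheme is an assumption.
CANONICAL_SCHEME = "http"
CANONICAL_HOST = "www.corporatedomain.com"


def canonical_link_tag(request_url: str) -> str:
    """Build a canonical <link> element that keeps the path and query of
    the requested page but points at the corporate host."""
    parts = urlsplit(request_url)
    canonical = urlunsplit(
        (CANONICAL_SCHEME, CANONICAL_HOST, parts.path or "/", parts.query, "")
    )
    # Escape the URL so characters such as "&" are valid inside the attribute.
    return '<link rel="canonical" href="{}">'.format(escape(canonical, quote=True))


print(canonical_link_tag("http://agentsmith.corporatedomain.com/listings/123?beds=3"))
# <link rel="canonical" href="http://www.corporatedomain.com/listings/123?beds=3">
```

The resulting tag belongs in the head section of the page served on the agent subdomain.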
