Blocking subdomains without blocking sites...
-
So let's say I am working for bloggingplatform.com, and people can create free sites through my tools and those sites show up as myblog.bloggingplatform.com. However that site can also be accessed from myblog.com.
Is there a way, short of editing the myblog.com site code or files, for me to tell Google to stop indexing myblog.bloggingplatform.com while still letting it index myblog.com, and without inserting any code into the page load?
This is a simplification of a problem I am running across.
Basically, Google is associating subdomains with my domain that it shouldn't even index, and it is adversely affecting my main domain. Other than contacting the offending subdomain holders (which we do), I am looking for a way to stop Google from indexing those subdomains at all (they are used for technical purposes, not for users to find the sites).
Thoughts?
-
Ah, I see now. Try this out: http://moz.com/community/q/block-an-entire-subdomain-with-robots-txt#reply_26992 - basically, when a request comes in on the subdomain, the server pulls a different file into the robots.txt location (one containing the Disallow: / syntax), while requests on the custom domain keep getting the normal robots.txt.
Also read the remaining comments there about getting the subdomain removed via GWT.
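To make the idea concrete, here is a minimal Python sketch of choosing the robots.txt body from the hostname of the request. It is only an illustration: the function name robots_txt_for_host, the constants, and the ALWAYS_INDEXED exception list are hypothetical, and the linked thread does its switching at the web server level rather than in application code.

```python
# Hypothetical sketch: pick the robots.txt body from the Host header.
# All names here are illustrative assumptions, not part of any real platform.

# Hosts under this suffix are the "technical" subdomains (assumption).
TECHNICAL_SUFFIX = ".bloggingplatform.com"

# Platform hosts that should stay indexable even though they match the suffix.
ALWAYS_INDEXED = {"www.bloggingplatform.com"}

BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"


def robots_txt_for_host(host: str) -> str:
    """Return the robots.txt body to serve for the given Host header value."""
    # Drop an optional port, normalise case, and strip a trailing dot.
    hostname = host.split(":", 1)[0].strip().lower().rstrip(".")
    if hostname in ALWAYS_INDEXED:
        return ALLOW_ALL
    if hostname.endswith(TECHNICAL_SUFFIX):
        # e.g. user1.bloggingplatform.com: block every crawler.
        return BLOCK_ALL
    # Custom domains such as myblog.com, and the bare platform domain.
    return ALLOW_ALL


if __name__ == "__main__":
    for example in ("user1.bloggingplatform.com", "myblog.com"):
        print(example)
        print(robots_txt_for_host(example))
```

Because the decision depends only on the hostname, the same files on disk can back both myblog.com and user1.bloggingplatform.com while each hostname answers /robots.txt differently.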
-
You are correct, but that isn't what I was asking.
user1.bloggingplatform.com and myblog.com point to the same web server files. If I put up a robots.txt on user1.b..., I would effectively de-index myblog.com.
The problem we have run across is that user205.bloggingplatform.com might be doing something shady, but instead of de-listing just the subdomain, Google kills the primary domain from the index as well.
Because user205.bloggingplatform.com should only be used for technical reasons and should not be in Google's index, I am looking for a way to tell Google not to index the subdomain.
I think the better way to solve the problem, though, would be to move the technical subdomains to a different domain, changing user205.bloggingplatform.com to user205.bloggingplatformtesting.com.
Then Google can kill that URL all it wants, as I don't care.
-
bloggingplatform.com/robots.txt
and
user1.bloggingplatform.com/robots.txt
can and should be different. Crawlers request robots.txt separately for each hostname, so if you disallow at the subdomain level, only the subdomain will be affected. You can search around for other examples of this, but I'm certain it works (we have a development domain that is indexed, and we create a subdomain for each client; those subdomains are kept out of the index via individual robots.txt files).
-
I don't think that works. Since both URLs point to the same server, the robots.txt file for the test URL would completely kill the main URL.
Or am I missing something?
-
Each subdomain should have its own robots.txt file that blocks that specific subdomain. For example, user1.bloggingplatform.com/robots.txt should contain:
User-agent: *
Disallow: /
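If you want to sanity-check what a crawler would make of these rules, a small sketch using Python's standard urllib.robotparser module is below. The helper is_crawlable and the two robots.txt strings are illustrative assumptions; the parser follows the generic robots.txt rules and is not Googlebot itself.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Parse a robots.txt body and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# What the technical subdomain would serve (the rules quoted above).
SUBDOMAIN_ROBOTS = "User-agent: *\nDisallow: /\n"

# What the custom domain would serve: an empty Disallow allows everything.
CUSTOM_DOMAIN_ROBOTS = "User-agent: *\nDisallow:\n"

if __name__ == "__main__":
    print(is_crawlable(SUBDOMAIN_ROBOTS, "http://user1.bloggingplatform.com/any-page"))
    print(is_crawlable(CUSTOM_DOMAIN_ROBOTS, "http://myblog.com/any-page"))
```

The first check should come back as not crawlable and the second as crawlable, which is the split the thread is after: the subdomain is blocked and the custom domain is left alone.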