Blocking subdomains without blocking sites...
-
Let's say I work for bloggingplatform.com. People can create free sites through my tools, and those sites show up as myblog.bloggingplatform.com. However, the same site can also be accessed at myblog.com.
Is there a way, separate from editing the myblog.com site code or files, for me to tell Google to stop indexing myblog.bloggingplatform.com while still letting it index myblog.com, without inserting any code into the page load?
This is a simplification of a problem I am running across.
Basically, Google is associating subdomains with my domain that it shouldn't even index, and that is adversely affecting my main domain. Other than contacting the offending subdomain holders (which we do), I am looking for a way to stop Google from indexing those subdomains at all (they are used for technical purposes, not for users to find the sites).
Thoughts?
-
Ah, I see now. Try this out: http://moz.com/community/q/block-an-entire-subdomain-with-robots-txt#reply_26992. Basically, when a request comes in for a subdomain, the server would serve a different file at the robots.txt location (one containing the Disallow: / rule).
Read the remaining comments there about getting the subdomain removed via GWT (Google Webmaster Tools).
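As a rough illustration of that idea, here is a minimal Python sketch that picks the robots.txt body from the requested host name. It is not the code from the linked reply. The function name, the constants, and the list of hosts that should stay indexable are all assumptions made for this example; only bloggingplatform.com and the Disallow: / rule come from the thread.

```python
# Hypothetical sketch: choose robots.txt content based on the Host header.
PLATFORM_DOMAIN = "bloggingplatform.com"  # domain used in the question

# Assumption: platform hosts that should remain crawlable.
INDEXABLE_PLATFORM_HOSTS = {"bloggingplatform.com", "www.bloggingplatform.com"}

BLOCK_ALL = "User-agent: *\nDisallow: /\n"   # blocks crawling of everything
ALLOW_ALL = "User-agent: *\nDisallow:\n"     # empty Disallow permits everything


def robots_txt_for_host(host_header: str) -> str:
    """Return the robots.txt body to serve for the given Host header."""
    # Drop an optional ":port" suffix and normalise case and trailing dot.
    host = host_header.split(":", 1)[0].strip().lower().rstrip(".")
    if host in INDEXABLE_PLATFORM_HOSTS:
        return ALLOW_ALL
    if host.endswith("." + PLATFORM_DOMAIN):
        # A user subdomain such as user1.bloggingplatform.com
        return BLOCK_ALL
    # Any other host, e.g. a customer's own domain such as myblog.com
    return ALLOW_ALL
```

The web server or application would call something like this when /robots.txt is requested, so the same site files can sit behind both host names while each host gets its own rules.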
-
You are correct, but that isn't what I was asking.
user1.bloggingplatform.com and myblog.com point to the same web server files. If I put up a robots.txt on user1.b..., I would effectively de-index myblog.com.
The problem we have run across is that user205.bloggingplatform.com might be doing something shady, but instead of de-listing only the subdomain, Google kills the primary domain from the index as well.
Because user205.bloggingplatform.com should only be used for technical reasons, and should not be in Google's index, I am looking for a way to tell Google not to index the subdomain.
I think the better way to solve the problem, though, would be to change the technical subdomain's domain, for example from user205.bloggingplatform.com to user205.bloggingplatformtesting.com.
Then Google can kill that URL all it wants, as I don't care.
-
bloggingplatform.com/robots.txt
and
user1.bloggingplatform.com/robots.txt
can and should be different. If you disallow at the subdomain level, only the subdomain will be affected. You can search around for other examples of this, but I'm certain it works (we have a development domain that is indexed, and we create subdomains for all clients that aren't indexed, which is done via individual robots.txt files).
-
I don't think that works. Since both URLs point to the same server, the robots.txt file for the test URL would completely kill the main URL.
Or am I missing something?
-
Each subdomain should have a robots.txt file that blocks that specific subdomain. For example, user1.bloggingplatform.com/robots.txt should contain:
User-agent: *
Disallow: /
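To sanity-check rules like these before deploying them, one option is Python's standard urllib.robotparser module. The sketch below is only an illustration: the helper name is_blocked, the example paths, and the choice of "Googlebot" as the user agent are assumptions, and it checks what the rules say rather than what Google has already indexed.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text disallows fetching url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines; no network access.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Rules quoted above for the subdomain.
SUBDOMAIN_RULES = "User-agent: *\nDisallow: /\n"
# Assumed rules for the customer's own domain: an empty Disallow allows all.
MAIN_DOMAIN_RULES = "User-agent: *\nDisallow:\n"

if __name__ == "__main__":
    print(is_blocked(SUBDOMAIN_RULES, "http://user1.bloggingplatform.com/some-post"))
    print(is_blocked(MAIN_DOMAIN_RULES, "http://myblog.com/some-post"))
```

Because crawlers request robots.txt separately for each host name, the two hosts can return different rules even when they serve the same site files.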